---
title: "owlents: using OWL directly in ontoProc"
author: "Vincent J. Carey, stvjc at channing.harvard.edu"
date: "`r format(Sys.time(), '%B %d, %Y')`"
vignette: >
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteIndexEntry{owlents: using OWL directly in ontoProc}
  %\VignetteEncoding{UTF-8}
bibliography: ontobib.bib
output:
  BiocStyle::html_document:
    highlight: pygments
    number_sections: yes
    theme: united
    toc: yes
---

# Introduction

In Bioconductor 3.19, ontoProc can work with OWL RDF/XML
serializations of ontologies, via the 
[owlready2](https://owlready2.readthedocs.io/en/v0.42/) python modules.

The `owl2cache` function retrieves OWL from a URL or file
and places it in a cache to avoid repetitious retrievals.  The
default cache is the one defined by `BiocFileCache::BiocFileCache()`.
Here we work with the cell ontology.  `setup_entities` will use
owlready2 python modules to parse the OWL and produce an
instance of S3 class `owlents`.

```{r getcl,message=FALSE}
library(ontoProc)
ontoProc:::install_owlready2()
clont_path = owl2cache(url="http://purl.obolibrary.org/obo/cl.owl")
cle = setup_entities(clont_path)
cle
```

A plot method is available.  Given a vector of tags
as reported in OWL (no colons are used), the plot method produces
an ontologyIndex instance and runs onto_plot2
on the result.

```{r lkcl}
sel = c("CL_0000492", "CL_0001054", "CL_0000236", 
"CL_0000625", "CL_0000576", 
"CL_0000623", "CL_0000451", "CL_0000556")
plot(cle, sel)
```

# Illustration with Human Phenotype ontology

We'll obtain and ad hoc selection of
15 UBERON term names and visualize
the hierarchy.

```{r gethp}
ontoProc:::install_owlready2()
hpont_path = owl2cache(url="http://purl.obolibrary.org/obo/hp.owl")
hpents = setup_entities(hpont_path)
kp = grep("UBER", hpents$clnames, value=TRUE)[21:35]
plot(hpents, kp)
```

The prefixes of class names in the ontology
give a sense of its scope.
```{r lkta}
t(t(table(sapply(strsplit(hpents$clnames, "_"), "[", 1))))
```

To characterize human phenotypes ontologically, 
CL, GO, CHEBI, and
UBERON play significant roles.