sergeitarasov edited this page Oct 28, 2017 · 40 revisions

ontoFAST: Manual


Installation

To install and run ontoFAST efficiently you need R, RStudio, and the devtools package installed. Once R and RStudio are installed, you can install devtools by executing the following code from within RStudio.

# !!! uncomment to run
# install.packages("devtools")
# library("devtools")

Now, the package ontoFAST can be installed directly from the ontoFAST repository on GitHub.

# !!! uncomment to run
# install_github("sergeitarasov/ontoFAST")
library(ontoFAST)

The workflow

The workflow of ontoFAST consists of the following steps:

  1. Read in required ontology and character statements:
    • Ontology can be in the .obo file format or any R data format (e.g., .rda, .rds). The ontology .obo files can be downloaded from BioPortal or other repositories.
    • So far, ontoFAST is designed to annotate character statements only (not character states). Character statements must be in a table file format; the tables can be imported into R as .csv files. If your character matrix is in nexus or tnt format, open it in Mesquite and copy the character statements to, e.g., an Excel spreadsheet, then save the spreadsheet as a .csv file. An example of the table file format for character statements (the "states" column is optional):
library(knitr)
char_et_states<-read.csv(system.file("data_onto", "Sharkey_2011.csv", package = "ontoFAST"), header=T, stringsAsFactors = F, na.strings = "")
kable(head(char_et_states[1:3,1:3]), row.names = T, format = "markdown")
  2. Run automatic annotation of characters with ontology terms using ontoFAST functions. This step is optional and can be skipped if not required.
  3. Run ontoFAST interactively to make de novo character annotations, post-process automatic annotations, or edit previous annotations. The interactive mode visualizes the ontology as a network, thus providing a convenient way to navigate through it.
  4. Once the annotations are done you can:
    • visualize annotations using sunburst plots or export them to Cytoscape to visualize the network structure.
    • query annotations using the in-built ontoFAST functions.
  5. Save your results.
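Taken together, the steps above can be sketched in a few lines of R. This is a hedged sketch using the embedded HAO and Sharkey_2011 data sets and the functions described in the rest of this manual, not a substitute for the detailed steps below:

```r
library(ontoFAST)

# Steps 1-2: read the embedded ontology and characters, then
# process them with automatic annotation switched on
hao_obo <- onto_process(HAO, Sharkey_2011[, 1], do.annot = TRUE)

# Step 3: interactive annotation (uncomment to run)
# shiny_in <<- make_shiny_in(hao_obo)
# runOntoFast()

# Steps 4-5: export and save the annotations (uncomment to run)
# annot_csv <- export_annotations(shiny_in,
#                 annotations = shiny_in$annot_characters)
# write.csv(annot_csv, file = "Annot_csv.csv")
```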

Step 1 and 2: reading and processing data

First, we need to read in the required ontology and character statements. This can be done either (a) using in-built functions to process the ontology and characters or (b) manually creating an ontology object. The latter can be useful if finer tuning of the information stored in the ontology is needed.

(a) Quick way to process data

Let's first read in the ontology. In this example, I use the Hymenoptera Anatomy Ontology (HAO), which is available as an embedded data set:

hao_obo<-HAO

Alternatively, the ontology can be parsed directly from an .obo file using the get_OBO function from the ontologyIndex package.

hao_obo=get_OBO(system.file("data_onto", "HAO.obo", package = "ontoFAST"),
       extract_tags="everything", propagate_relationships = c("BFO:0000050", "is_a"))

The ontology object (i.e., hao_obo) is a list with numerous sublists that contain information about ontology terms and their relationships.
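As a quick check, you can inspect the top-level elements of the ontology object (this assumes the embedded HAO data set; the exact sublist names may differ between ontologies):

```r
library(ontoFAST)
hao_obo <- HAO

# list the first few sublists, e.g. term IDs, names, parents, children, ...
names(hao_obo)[1:6]

# look up the name of a single term by its ID
hao_obo$name["HAO:0000653"]
```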

Now, let's read in the character statements. Character statements stored in the table format can be imported into R using the read.csv function. Here, I use the embedded data set from the morphological phylogeny of Sharkey et al. (2011). This data set contains 392 characters. It is available as both an embedded R data set and a .csv file.

# R embedded data set
Sharkey_characters<-Sharkey_2011

# csv file
Sharkey_characters<-read.csv(system.file("data_onto", "Sharkey_2011.csv", 
      package = "ontoFAST"), header=T,  stringsAsFactors = F, na.strings = "")

To automatically process the data, use the onto_process function. This function automatically parses synonyms, character statements, and character IDs into the hao_obo object. By default, it also performs automatic annotation of character statements with ontology terms.

hao_obo<-onto_process(hao_obo, Sharkey_characters[,1], do.annot = T)

The automatic processing of the data is done. Now you can proceed to the interactive editing of your data; see Step 3.
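To get a feel for what the automatic step produced, you can peek at the candidate annotations. This assumes onto_process stores them in the auto_annot_characters element, as in the manual workflow described below:

```r
# candidate ontology terms proposed for the first few characters;
# each character ID maps to a vector of matching ontology term IDs
head(hao_obo$auto_annot_characters, 3)
```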

(b) Manual way to process data

Manual processing assumes that all elements of the ontology object are created manually. See the structure of the ontology object here.

First, let's read ontology and characters

hao_obo<-HAO
Sharkey_characters<-Sharkey_2011

Let's create the IDs that will be used to refer to the characters. All data have to be placed in the ontology object (here hao_obo).

id_characters<-paste("CHAR:",c(1:392), sep="")
hao_obo$id_characters<-id_characters

Now, let's add character statements to hao_obo and associate them with IDs.

name_characters<-Sharkey_characters[,1]
names(name_characters)<-id_characters
hao_obo$name_characters<-name_characters

To make automatic annotation more efficient, we can use the synonyms of ontology terms. The synonyms are stored in hao_obo$synonym; they have to be pre-processed to be available for automatic annotation. To do this, we use the syn_extract() function.

hao_obo$parsed_synonyms<-syn_extract(hao_obo)

Now, we can run the automatic annotation.

hao_obo$auto_annot_characters<-annot_all_chars(hao_obo, use.synonyms=TRUE, min_set=TRUE)

The manual processing is done; proceed to the interactive visualization (Step 3).

Step 3: running ontoFAST interactively

To run ontoFAST interactively, we have to create a global shiny_in object that will keep track of the annotations and associated data. To create the shiny_in object, use the make_shiny_in function and the global assignment operator <<-.

shiny_in<<-make_shiny_in(hao_obo)

Now we can run ontoFAST interactively by executing the line below. It may take a few seconds until all characters are loaded. The function runOntoFast() automatically takes in the shiny_in object to display annotations and the ontology. All changes made during the interactive session are immediately saved in shiny_in.

# !!! uncomment to run
# runOntoFast()

By default, ontoFAST displays all the characters in the data set. Using the argument nchar=N you may restrict the visualization to N characters. You may also use ontoFAST as an ontology browser without loading characters by specifying show.chars=FALSE.
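For example, the two arguments just described can be used like this (uncomment to run):

```r
# !!! uncomment to run
# runOntoFast(nchar = 10)          # restrict the display to 10 characters
# runOntoFast(show.chars = FALSE)  # browse the ontology without characters
```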

Here is a quick guide to the interactive mode of ontoFAST.


Step 4: visualize and query annotations

Having your characters annotated with ontology terms, you can proceed to the next step: visualizing and querying your annotations.

Sunburst plots

The hierarchical tree-like ontological dependencies among characters can be visualized using a sunburst plot. This plot shows hierarchy through a series of rings. Each ring corresponds to a level in the ontological hierarchy, with the inner circles representing the root nodes and the outermost circles representing character statements. To do this in R, you need the sunburstR package installed.

# install.packages("sunburstR")
library("sunburstR")

To have an interpretable and clear visualization, I suggest using either part_of or is_a relationships, as they have a tree-like hierarchy. Using both relationships simultaneously can be messy. Let's read in the HAO ontology and propagate only the part_of relationships. The ID for the part_of relationship in HAO is "BFO:0000050". To use is_a relationships, change the argument propagate_relationships to "is_a".

ontology_partof=get_OBO(system.file("data_onto", "HAO.obo", package = "ontoFAST"),
              extract_tags="everything", propagate_relationships = c("BFO:0000050"))

Now, I quickly process the ontology to incorporate character statements, without using automatic character annotation. Next, I incorporate the embedded manual annotations stored in the Sharkey_2011_annot data object into the annot_characters element of the ontology_partof object.

ontology_partof<-onto_process(ontology_partof, Sharkey_2011[,1], do.annot = F)
ontology_partof$annot_characters<-Sharkey_2011_annot

The input for the sunburst plot can be created using the paths_sunburst function. You may consider excluding some high-level terms to make the visualization clearer by specifying exclude.terms = exclude_terms.

tb<-paths_sunburst(ontology_partof, annotations = ontology_partof$annot_characters, 
                   exclude.terms = exclude_terms, sep = "-")

The data are now ready for visualization. Use the sunburst function from sunburstR to visualize them. The visualization in R or a browser is interactive: check it out by hovering the mouse over the plot.

sunburst(tb)

The sunburst plot of HAO terms and morphological characters:

Cytoscape

You may consider using Cytoscape to get insight into the complex network of ontology terms and characters. To do this, export the annotations into Cytoscape format using the export_cytoscape function and save the exported object as a .csv file.

ontology<-HAO
# processing ontology to incorporate character statements
ontology<-onto_process(ontology, Sharkey_2011[,1], do.annot = F)
# embedding manual annotations
ontology$annot_characters<-Sharkey_2011_annot

# exporting
cyto<-export_cytoscape(ontology, annotations = ontology$annot_characters, is_a = c("is_a"),
  part_of = c("BFO:0000050"))
write.csv(cyto, file="HAO_chars.csv")

To import the saved file into Cytoscape, open Cytoscape and choose File -> Import -> Network -> "HAO_chars.csv".

Query linked characters

  1. For each term, you can get the number of characters that are descendants of the term:
library(magrittr) # for the %>% pipe
chars_per_term(ontology, annotations = ontology$annot_characters) %>% head()
  2. Get the ancestral ontology terms for a set of characters:
get_ancestors_chars(ontology, c("CHAR:1", "CHAR:2", "CHAR:3"), 
                    annotations = ontology$annot_characters)
  3. Get the characters that are descendants of a particular ontology term:
get_descendants_chars(ontology, annotations = ontology$annot_characters, terms="HAO:0000653")

Step 5: save your data

A convenient way to save all data is to save the ontology object using the native R format .Rdata. For example, if you did manual and automatic annotations, you can save the shiny_in object in .Rdata. The .Rdata format will save all information stored in shiny_in.

save(shiny_in, file="shiny_in.Rdata")
# to load file in R
#load(file="shiny_in.Rdata")

You can also export your annotations and characters as a readable .csv table.

# exporting annotations
annot_csv<-export_annotations(ontology, annotations = ontology$annot_characters, 
                incl.names = T, sep.head = ", ", sep.tail = NULL, collapse = NULL) 
head(annot_csv)

Tune the format of the table using the arguments of the export_annotations function.

annot_csv<-export_annotations(ontology, annotations = ontology$annot_characters, incl.names = T,
  sep.head = ", (", sep.tail = ")", collapse = ";") 
head(annot_csv)

# save file
write.csv(annot_csv, file="Annot_csv.csv")