docs/paper.md: 12 additions & 13 deletions
@@ -27,7 +27,7 @@ The Gene Ontology (GO) [@Ashburner2000; @GO2023] is a structured vocabulary tha
For example, in the GO data, `GO:0090630` defines *activation of GTPase activity* and is a child of `GO:0043547`, defined as *positive regulation of GTPase activity* which in turn is a child of `GO:0051345` representing *positive regulation of hydrolase activity*.
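The parent-child chain in this example can be sketched in a few lines of Python. This is only an illustration of walking a term's lineage: the `PARENT` and `NAMES` tables and the `lineage` helper are hypothetical names, and the single-parent mapping is a simplification, since real GO terms may have several parents.

```python
# Child -> parent links taken from the example in the text. Real GO terms can
# have several parents, so this single-parent mapping is a simplification.
PARENT = {
    "GO:0090630": "GO:0043547",
    "GO:0043547": "GO:0051345",
}

NAMES = {
    "GO:0090630": "activation of GTPase activity",
    "GO:0043547": "positive regulation of GTPase activity",
    "GO:0051345": "positive regulation of hydrolase activity",
}


def lineage(term):
    """Return the term followed by its known ancestors, nearest first."""
    path = [term]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


if __name__ == "__main__":
    for go_id in lineage("GO:0090630"):
        print(go_id, NAMES[go_id])
```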
-Gene association files (GAF) are text files used to annotate an organism's gene products with Gene Ontology terms, associating a function to a gene product. For example, a GAF file connects a gene product label, such as `ZC3H11B`, with multiple GO terms, such as `GO:0046872` or `GO:0016973`. The complete human genome GAF representation contains 288,575 associations of 19,606 gene symbols with over 18,680 GO terms.
+Gene association files (GAF) are text files used to annotate an organism's gene products with Gene Ontology terms, associating functions to gene products. For example, a GAF file connects a gene product label, such as `ZC3H11B`, with multiple GO terms, such as `GO:0046872` or `GO:0016973`. The complete human genome GAF representation contains 288,575 associations of 19,606 gene symbols with over 18,680 GO terms.
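As a rough illustration of how such associations can be read, the sketch below collects GO terms per gene symbol from GAF-style text. It assumes the standard GAF layout (tab-separated fields, comment lines starting with `!`, gene symbol in column 3, GO ID in column 5). The sample rows are made up for illustration: only the symbol and the two GO IDs come from the text, and `read_associations` is a hypothetical helper, not part of GeneScape.

```python
import io
from collections import defaultdict

# Made-up GAF-style rows. Only the symbol and GO IDs come from the text;
# every other field is a placeholder.
SAMPLE_GAF = (
    "!gaf-version: 2.2\n"
    "DB\tID0001\tZC3H11B\tqualifier\tGO:0046872\n"
    "DB\tID0001\tZC3H11B\tqualifier\tGO:0016973\n"
)


def read_associations(stream):
    """Map each gene symbol (column 3) to the set of its GO IDs (column 5)."""
    assoc = defaultdict(set)
    for line in stream:
        line = line.rstrip("\n")
        # Skip blank lines and header/comment lines.
        if not line or line.startswith("!"):
            continue
        fields = line.split("\t")
        if len(fields) < 5:
            continue  # malformed row, ignore it
        symbol, go_id = fields[2], fields[4]
        assoc[symbol].add(go_id)
    return assoc


if __name__ == "__main__":
    result = read_associations(io.StringIO(SAMPLE_GAF))
    for symbol, terms in result.items():
        print(symbol, sorted(terms))
```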
The [Gene Ontology Consortium][GO] maintains GAF files for various organisms. Typical genomic analysis protocols generate gene lists that must be placed in a functional context.
@@ -39,15 +39,15 @@ The most annotated gene in the human genome, `HTT`, currently has 1100 annotatio
Web-based tools designed to visualize and filter gene ontology data include `AmiGO`[@AmiGO] and `QuickGO`[@QuickGO]. Command line tools like `goatools`[@goatools] support GO term lineage visualization. R packages like `topGO`[@topGO] implement GO structure visualizations of enriched GO terms. We are unaware of locally installable software that specifically allows for interactive filtering and visualization of gene ontology derived on gene lists.
-GeneScape is a Python package that allows users to visualize a list of gene products in terms of the functional context represented by the Gene Ontology.
+GeneScape is a Python package that allows users to visualize a list of genes in the functional context represented by the Gene Ontology
-GeneScape is distributed both as a command-line tool and as GUI-enabled standalone software via the [Shiny platform][shiny][@shiny], thus making it accessible to a wide range of users.
+GeneScape is distributed both as a command-line tool and as GUI-enabled standalone software via the [Shiny platform][shiny][@shiny], making it accessible to a wide range of users.
[shiny]: https://shiny.posit.co/
-GeneScape comes with a number of prebuilt databases for model organisms including the human, mouse, rat, fruitfly and zebrafish genomes. To study additional organisms, users must download GAF files from the Gene Ontology website and create custom databases using the `build` subcommand:
+GeneScape is distributed with several prebuilt databases for model organisms including the human, mouse, rat, fruitfly and zebrafish genomes. To study additional organisms, users must download GAF files from the Gene Ontology website and create custom databases using the `build` subcommand:
-In the next step, GeneScape draws the GO terms as the graph structure using the Networkx package [@networkx] helping users visualize the functional context of the genes relative to the larger Gene Ontology.
+In the next step, GeneScape draws the GO terms as the graph structure using the Networkx package [@networkx], helping users visualize the functional context of the genes relative to the larger Gene Ontology.
-Various colors and labels are used to provide additional context to the nodes in the graph; for example, functions present in the input genes are colored green. The intermediate nodes are colored by their category. Node labels display the total annotations and the number of genes that carry that function.
+Various colors and labels are used to provide additional context to the nodes in the graph; for example, functions present in the input genes are colored green. Intermediate nodes are colored by their category. Node labels display the total annotations and the number of genes that carry that function.
![Filtering a large graph for a specific term \label{fig:help}][img_help]
[img_help]: images/node_help_1.png
In the web interface, users can zoom in and out of the tree. The software's command-line version supports generating outputs in various formats, such as PDF or PNG.
-Since the resulting graphs may also be large, with thousands of nodes, the main interface provides input widgets that allow users to interactively
-reduce the subgraph to nodes for which:
+Since the resulting graphs may also be large, with thousands of nodes, the main interface provides input widgets that allow users to interactively reduce the subgraph to nodes for which:
1. The function definitions match certain patterns.
2. A minimum number of genes share a function.
3. Nodes belong to a specific GO subtree: Biological Process (BP), Molecular Function (MF), Cellular Component (CC).
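The three criteria above can be sketched as a simple node filter. This is a minimal illustration under stated assumptions, not GeneScape's actual implementation: the `Node` class, the `filter_nodes` function, its parameter names, and the sample nodes (with placeholder IDs and made-up gene counts) are all hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class Node:
    go_id: str       # placeholder identifier, not a real GO ID
    name: str        # function definition shown on the node
    namespace: str   # "BP", "MF", or "CC"
    gene_count: int  # number of input genes that carry this function


# Hypothetical nodes with made-up counts, for illustration only.
NODES = [
    Node("T1", "lipid metabolic process", "BP", 3),
    Node("T2", "lipid binding", "MF", 1),
    Node("T3", "membrane", "CC", 4),
]


def filter_nodes(nodes, pattern=None, min_genes=1, namespace=None):
    """Keep nodes that satisfy every criterion that was supplied."""
    regex = re.compile(pattern, re.IGNORECASE) if pattern else None
    kept = []
    for node in nodes:
        if regex and not regex.search(node.name):
            continue  # 1. definition does not match the pattern
        if node.gene_count < min_genes:
            continue  # 2. too few genes share the function
        if namespace and node.namespace != namespace:
            continue  # 3. node is outside the requested subtree
        kept.append(node)
    return kept


if __name__ == "__main__":
    for node in filter_nodes(NODES, pattern="lipid", min_genes=2):
        print(node.go_id, node.name)
```

A real tool would likely also need to decide whether to keep the ancestors that connect the surviving nodes; this sketch only selects nodes and ignores edges.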
-As an example, take the input genelist of just four genes:
+As an example, take the input gene list of just four genes:
```
Cyp1a1
@@ -126,7 +125,7 @@ Sptlc2
Smpd3
```
-the resulting functional ontology graph is large with 641 nodes and 1007 edges:
+the resulting functional ontology graph is large with 641 nodes and 1,007 edges:
![Very few genes can produce a large ontology tree \label{fig:huge}][img_bigtree]
@@ -138,14 +137,14 @@ Users can reduce the tree to show only terms that match the word `lipid` and wit
genescape tree -m lipid --micov 2 genes2.txt -o output.pdf
```
-The filtering process will result in a smaller tree with 18 nodes and 29 edges focused on the functions that contain the word "lipid":
+The filtering process will result in a smaller tree with 18 nodes and 29 edges, focused on the functions that contain the word "lipid":
![Filtering a large graph for a specific term \label{fig:filter}][img_filter]{height="216pt"}
[img_filter]: images/gs_output_3.png
-The software's primary purpose is to allow users to assess the functional depth of genes and to identify commonalities and differences in the functional context of these genes.
+The software's primary purpose is to allow users to assess the functional depth of genes and identify commonalities and differences in the functional context of these genes.