VIRGO2

Overview

VIRGO2 is a non-redundant catalog of genes from the human vaginal microbiome that allows for rapid taxonomic and functional analysis of metagenomic and metatranscriptomic reads. VIRGO2 is distributed as a bowtie2 index so that processed, quality-controlled reads can be mapped with the bowtie2 short-read mapping tool. Annotation tables can then be merged with the mapping results to provide taxonomic and functional information.

Dependencies

VIRGO2 requires the following software to be installed and available in the PATH:

-git lfs
-Gzip
-Bowtie2
-Samtools
-Python3 >v3.8

and the following Python packages:

-pandas
-numpy
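
Before installing, it can help to confirm that everything listed above is reachable. The following is a minimal sketch, not part of VIRGO2: the executable names (`git-lfs`, `gzip`, `bowtie2`, `samtools`) are assumed to be the usual binary names for these tools, and treating Python 3.8 itself as acceptable is also an assumption, since the requirement above is written as ">v3.8".

```python
import shutil
import sys
from importlib.util import find_spec

# Assumed executable names for the tools listed in the README.
REQUIRED_TOOLS = ["git-lfs", "gzip", "bowtie2", "samtools"]
REQUIRED_PACKAGES = ["pandas", "numpy"]


def missing_dependencies(tools=REQUIRED_TOOLS, packages=REQUIRED_PACKAGES):
    """Return (missing_tools, missing_packages) for the current environment."""
    # shutil.which returns None when an executable is not found on the PATH.
    missing_tools = [t for t in tools if shutil.which(t) is None]
    # find_spec returns None when a top-level package cannot be imported.
    missing_packages = [p for p in packages if find_spec(p) is None]
    return missing_tools, missing_packages


def python_is_new_enough(version_info=sys.version_info, minimum=(3, 8)):
    """Assumption: 3.8 or later is accepted."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    tools, packages = missing_dependencies()
    print("Python version OK:", python_is_new_enough())
    print("Missing tools:", tools or "none")
    print("Missing packages:", packages or "none")
```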

Installation

VIRGO2 is installed by cloning the repository with git lfs installed and configured, then running the VIRGO2.py install command, which unzips the annotation and FASTA files and builds the bowtie2 index from the provided FASTA file.

git clone https://github.com/ravel-lab/VIRGO2.git

cd VIRGO2

python3 VIRGO2.py install

NOTE: installation will fail if 'git lfs' is not installed and configured. If you have already cloned the repository and need to fetch the large files after installing and configuring 'git lfs', run the following command in the repository directory.

git lfs pull
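
When a repository is cloned without git lfs, the large files are present only as small text pointer stubs, which is one way installation can fail. The sketch below is not part of VIRGO2; it relies on the git lfs pointer format, in which a pointer file begins with the line `version https://git-lfs.github.com/spec/v1`. The function names are hypothetical.

```python
from pathlib import Path

# First bytes of every git lfs pointer file, per the git lfs pointer format.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """True if the file still looks like a git lfs pointer stub."""
    with open(path, "rb") as handle:
        return handle.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX


def find_lfs_pointers(repo_dir="."):
    """List files under repo_dir (outside .git) that are still pointer stubs."""
    return sorted(
        p
        for p in Path(repo_dir).rglob("*")
        if p.is_file() and ".git" not in p.parts and is_lfs_pointer(p)
    )
```

If `find_lfs_pointers(".")` returns any paths when run in the repository directory, those files have not been downloaded yet and `git lfs pull` is likely still needed.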

Alternative source

VIRGO2 can also be obtained from Zenodo under the DOI 10.5281/zenodo.18703182 (https://zenodo.org/records/18703182). If you obtain VIRGO2 from Zenodo, you will need to decompress the archived directories (FastaFiles.tar.gz, AnnotationTables.tar.gz, Index.tar.gz, AccessoryScripts.tar.gz) prior to installation.

Usage

The main VIRGO2 operations are performed by the script VIRGO2.py, which provides the commands VIRGO2.py map, VIRGO2.py compile, and VIRGO2.py taxonomy. The map command should be run on each sample individually. The compile command combines the results from the individual samples into one table with genes as rows and samples as columns. The taxonomy command operates on the file generated by the compile command and produces tables containing the estimated composition of each sample.

> python VIRGO2.py -h
    usage: VIRGO2.py [-h] {install,map,compile,taxonomy,license} ...

    VIRGO2 is a tool and associated database used to analyze vaginal shotgun metagenomes and metatranscriptomes

    positional arguments:
    {install,map,compile,taxonomy,license}

    optional arguments:
    -h, --help            show this help message and exit

`VIRGO2.py map`

This module will map the sequencing reads from a single sample to the VIRGO2 bowtie2 index. Suggested usage is with the default settings.

> python VIRGO2.py map -h
    usage: VIRGO2.py map [-h] -r READS [-c {0,1}] [-p THREADS] -o OUTPUTPREFIX [-b {0,1}]

    optional arguments:
    -h, --help              Show this help message and exit
    -r READS, --reads READS
                            Single-End reads file, can be gzipped
    -c {0,1}, --cov {0,1}
                            Assign multi-mapped reads to gene with highest percent covered, 0=No,1=Yes, default:Yes
    -p THREADS, --threads THREADS
                            Number of threads used in mapping default:1
    -o OUTPUTPREFIX, --outputPrefix OUTPUTPREFIX
                            Prefix used in the filename for the mapping output
    -b {0,1}, --bypass {0,1}
                            Sam and coverage files already generated, proceed to coverage correction directly 0=No, 1=Yes, default No
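
Because map is run once per sample, a small driver script can build one command per reads file. The following is a sketch under stated assumptions, not part of VIRGO2: it uses only the `-r`, `-p`, and `-o` options shown in the help text above, while the helper names, the reads file suffixes, and the directory layout are hypothetical. It also assumes the script is run from the repository directory so that `VIRGO2.py` can be found.

```python
import subprocess
import sys
from pathlib import Path

# Hypothetical set of file suffixes treated as reads files.
READ_SUFFIXES = (".fastq", ".fq", ".fastq.gz", ".fq.gz")


def build_map_command(reads, output_prefix, threads=1, virgo2_script="VIRGO2.py"):
    """Build the argument list for one `VIRGO2.py map` call."""
    return [sys.executable, virgo2_script, "map",
            "-r", str(reads), "-p", str(threads), "-o", str(output_prefix)]


def sample_name(reads):
    """Strip .gz and .fastq/.fq from a reads filename to get a sample name."""
    name = Path(reads).name
    for suffix in (".gz", ".fastq", ".fq"):
        if name.endswith(suffix):
            name = name[: -len(suffix)]
    return name


def map_all(reads_dir, out_dir, threads=1, dry_run=True):
    """Print (dry_run=True) or run one map command per reads file in reads_dir."""
    commands = []
    for reads in sorted(Path(reads_dir).iterdir()):
        if not reads.name.endswith(READ_SUFFIXES):
            continue
        prefix = Path(out_dir) / sample_name(reads)
        cmd = build_map_command(reads, prefix, threads=threads)
        commands.append(cmd)
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
    return commands
```

Calling `map_all("reads", "mapped", threads=4)` only prints the commands; passing `dry_run=False` would execute them one after another.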

`VIRGO2.py compile`

After all samples have been mapped to VIRGO2, the compile command merges the mapping results from all samples into a single file to be used in downstream analysis.

> python VIRGO2.py compile -h
    usage: virgo2_ic.py compile [-h] [-i INPUT] [-o OUTPUTPREFIX]

    optional arguments:
    -h, --help              show this help message and exit
    -i INPUT, --input INPUT
                            Directory where bowtie2 mapping results are located
    -o OUTPUTPREFIX, --outputPrefix OUTPUTPREFIX
                            Prefix used in the filename for the compiled output

After running compile, there will be a single tab-delimited file with the read counts per gene per sample.

`VIRGO2.py taxonomy`

After the compiled output has been produced, the taxonomy command can be used to estimate the taxonomic composition of each sample. The composition can be restricted to bacteria and can be reported as read counts or relative abundances.

Default settings apply a per-species gene-detection threshold that a species must meet to be considered present in a sample. These thresholds are located in the file 2.VIRGO2.taxonThresholds.txt and are set to 20% of the median number of genes detected for the species in the dataset used to build VIRGO2 (e.g. the median number of L. crispatus genes in metagenomes containing L. crispatus was 2809, so the threshold for L. crispatus is set at 561). The detection threshold can be disabled with -f 0.
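
The arithmetic behind the L. crispatus example can be sketched as follows. This is an illustration, not the VIRGO2 implementation: 20% of 2809 is 561.8, so the stated threshold of 561 suggests the value is rounded down, and that rounding rule is an inference. Treating "meet" as "greater than or equal to" is likewise an assumption, and both function names are hypothetical.

```python
def detection_threshold(median_genes, percent=20):
    """Gene-count threshold as a percentage of the median, rounded down (assumed)."""
    # Integer arithmetic avoids floating-point rounding surprises.
    return median_genes * percent // 100


def is_present(genes_detected, median_genes):
    """Assumption: a species is present when it reaches the threshold (>=)."""
    return genes_detected >= detection_threshold(median_genes)


print(detection_threshold(2809))  # 561, matching the L. crispatus example
```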

> python VIRGO2.py taxonomy -h
    usage: VIRGO2.py taxonomy [-h] [-i INPUT] [-o OUTPUTPREFIX] [-b {0,1}] [-f {0,1}] [-r {0,1}] [-m {0,1}]

    optional arguments:
      -h, --help            show this help message and exit
      -i INPUT, --input INPUT
                            Full path to compiled results file
      -o OUTPUTPREFIX, --outputPrefix OUTPUTPREFIX
                            Prefix used in the filename for the compiled output
      -b {0,1}, --bacteria {0,1}
                            Report composition including only the bacteria, default=1
      -f {0,1}, --filter {0,1}
                            Mask contribution from taxa where number of genes detected is below threshold, default=1
      -r {0,1}, --readCounts {0,1}
                            Report values as read counts instead of relative abundances, default=0
      -m {0,1}, --multigenera {0,1}
                            Report relative abundance of multigenera genes, off by default (0)

After running taxonomy, there will be a single comma-separated file with the taxonomic composition per sample.

Annotation files and additional analyses

VIRGO2 contains the following gene annotation files, which can be merged with the compiled mapping results using standard functions in Python or R. An example script can be found in AccessoryScripts/VIRGO2_add_annotations.py.

-0.VIRGO2.geneLength.txt        :Length of each VIRGO2 gene
-1.VIRGO2.taxon.txt             :Taxonomic annotation of each VIRGO2 gene
-2.VIRGO2.taxonThresholds.txt   :Per taxon gene number thresholds used when estimating relative abundance
-3.VIRGO2.eggNog.txt            :EggNog Annotations per VIRGO2 gene
-4.VIRGO2.PFAM.txt              :PFAM Annotations per VIRGO2 gene
-5.VIRGO2.EC.txt                :Enzyme Commission numbers per VIRGO2 gene
-6.VIRGO2.geneProduct.txt       :Gene product Annotations per VIRGO2 gene
-7.VIRGO2.kegg.txt              :KEGG annotations per VIRGO2 gene
-8.VIRGO2.CAZy.txt              :Carbohydrate-active enzyme annotations per VIRGO2 gene
-9.VIRGO2.AMR.txt               :Antimicrobial resistance annotations per VIRGO2 gene
-10.VIRGO2.phage.txt            :Bacteriophage annotations per VIRGO2 gene
-11.VIRGO2.compound.txt         :Biosynthetic gene cluster annotations per VIRGO2 gene
-VIRGO2_VOGkey.txt              :Per-species orthologous gene cluster assignment for VIRGO2 genes
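
The kind of merge described above can be sketched with pandas. This is a toy illustration, not the provided accessory script: the column names (`gene`, `taxon`), the gene identifiers, and the in-memory tables are all hypothetical stand-ins, so the real column layout of the compiled output and of the annotation files should be checked before adapting it.

```python
import io

import pandas as pd

# Toy stand-ins for the compiled read-count table and a taxon annotation table.
# Column names and gene IDs here are hypothetical.
compiled_tsv = "gene\tsampleA\tsampleB\nV0001\t10\t0\nV0002\t3\t7\nV0003\t0\t5\n"
taxon_tsv = "gene\ttaxon\nV0001\tLactobacillus crispatus\nV0002\tGardnerella vaginalis\n"


def add_annotation(compiled, annotation, key="gene"):
    """Left-join an annotation table so that unannotated genes are kept."""
    return compiled.merge(annotation, on=key, how="left")


compiled = pd.read_csv(io.StringIO(compiled_tsv), sep="\t")
taxon = pd.read_csv(io.StringIO(taxon_tsv), sep="\t")

annotated = add_annotation(compiled, taxon)

# Sum read counts per taxon; genes without a taxon are dropped by groupby.
per_taxon = annotated.groupby("taxon")[["sampleA", "sampleB"]].sum()
print(per_taxon)
```

A left join is used so that genes lacking an entry in a given annotation table remain in the result with missing values rather than disappearing.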
