This pipeline analyses paired-end shotgun metagenomics data. In a first step, it trims adapters and low-quality bases using Skewer. Reads are then decontaminated against the reference specified in the config file (default: human). Next, several metagenomics profilers (default: Kraken and Metaphlan2) are run and a pathway analysis is performed with HUMAnN2. Results are aggregated over all input samples. In addition, resistance gene typing results are produced with SRST2.
This pipeline was originally developed by Chenhao Li (see GitHub repository). The SRST2 logic was added by KOH Jia Yu (Jayce).
The main output files are listed below:
- {sample}/reads/all-trimmed-decont_[12].fastq.gz: the (if needed, concatenated) quality-trimmed and decontaminated read pair
- {sample}/reads/counts.txt: read counts after trimming and after decontamination
- {sample}/srst2/{sample}__genes__{db}__results.txt and {sample}/srst2/{sample}__fullgenes__{db}__results.txt: SRST2 resistance gene typing results (see SRST2 documentation)
- merged_table_{profiler}/{tax}.{profiler}.profile_merged.tsv, where profiler can be kraken or metaphlan2 and tax is a one-letter abbreviation for taxonomic rank (e.g. g for genus). Please note that Metaphlan2 lists abundances as percentages, whereas Kraken produces read counts
- merged_table_humann2/genefamily.tsv: abundance (in RPK) of each gene family (each row) in each sample (each column)
- merged_table_humann2/pathabundance.tsv: abundance of each pathway in each sample. The pathways are stratified into species.
- merged_table_humann2/pathcoverage.tsv: presence (1)/absence (0) code for each pathway in each sample
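The path templates above can be expanded programmatically, for example when collecting results for downstream analysis. The following is a minimal sketch and not part of the pipeline itself: the function names and the database name `mydb` are hypothetical, and the templates are simply copied from the list above. Because Metaphlan2 reports percentages while Kraken reports read counts, the sketch also includes a small helper that turns counts into percentages of the reads assigned at the chosen rank. This assumes the counts have already been read into a plain dictionary, and the result is not guaranteed to be directly comparable to Metaphlan2's own abundance estimates.

```python
from pathlib import PurePosixPath

# Profilers named in the output list above.
PROFILERS = ("kraken", "metaphlan2")


def sample_outputs(sample, db):
    """Per-sample output paths, following the templates listed above.

    `db` is the SRST2 database name (hypothetical value in the demo below).
    """
    base = PurePosixPath(sample)
    return {
        "reads": [base / "reads" / f"all-trimmed-decont_{i}.fastq.gz" for i in (1, 2)],
        "counts": base / "reads" / "counts.txt",
        "srst2_genes": base / "srst2" / f"{sample}__genes__{db}__results.txt",
        "srst2_fullgenes": base / "srst2" / f"{sample}__fullgenes__{db}__results.txt",
    }


def merged_profile(profiler, tax):
    """Path of the merged profile table for one profiler and one rank letter."""
    if profiler not in PROFILERS:
        raise ValueError(f"unknown profiler: {profiler!r}")
    return PurePosixPath(f"merged_table_{profiler}") / f"{tax}.{profiler}.profile_merged.tsv"


def counts_to_percent(counts):
    """Convert read counts per taxon into percentages of the summed counts."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("total read count must be positive")
    return {taxon: 100.0 * n / total for taxon, n in counts.items()}


if __name__ == "__main__":
    # "S1" and "mydb" are placeholder names for illustration only.
    for name, path in sample_outputs("S1", "mydb").items():
        print(name, path)
    print(merged_profile("kraken", "g"))
    print(counts_to_percent({"Bacteroides": 30, "Prevotella": 10}))
```

Using `PurePosixPath` keeps the sketch independent of the local file system, so the paths can be built before any output exists.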
- Skewer: Publication and website
- Kraken: Publication and website
- Metaphlan2: Publication and website
- HUMAnN2: Website
- Decont: Website