# mt_quant

A Snakemake workflow:
- Preprocessing:
  - Trim reads with `fastp`
  - Remove rRNAs with `ribodetector`
  - Remove host RNA with `STAR`
  - Screen your reads with `kraken2`
- Quantification:
  - Map reads to a MAG catalogue with `bowtie2`
  - Get count tables with `CoverM`
- Report:
  - Get a gazillion of reports with `samtools`, `fastqc` and `multiqc`
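The pipeline wires these tools together as Snakemake rules; as a rough orientation, the two main stages correspond to shell commands along these lines. Everything below is a sketch: file names, index and database paths, and most flags are assumptions of ours, not taken from the workflow itself.

```bash
# Illustrative sketch only -- paths, indices and most flags are assumed,
# not copied from the mt_quant rules. The functions are defined but not run.

preprocess() {
  r1=$1; r2=$2                                   # paired-end FASTQ input

  # 1) adapter/quality trimming
  fastp -i "$r1" -I "$r2" -o trim_1.fq.gz -O trim_2.fq.gz

  # 2) rRNA removal (CPU mode; -l is the mean read length)
  ribodetector_cpu -l 100 -i trim_1.fq.gz trim_2.fq.gz \
    -e rrna -o norrna_1.fq.gz norrna_2.fq.gz

  # 3) host RNA removal: map to the host genome, keep the unmapped pairs
  STAR --genomeDir host_index --readFilesCommand zcat \
    --readFilesIn norrna_1.fq.gz norrna_2.fq.gz --outReadsUnmapped Fastx

  # 4) taxonomic screening of what is left
  kraken2 --db kraken_db --report kraken.report \
    --paired Unmapped.out.mate1 Unmapped.out.mate2 > kraken.out
}

quantify() {
  # map against the MAG catalogue, then derive count tables
  bowtie2 -x mag_catalogue_index -1 clean_1.fq.gz -2 clean_2.fq.gz |
    samtools sort -o mapped.bam
  coverm genome --bam-files mapped.bam \
    --genome-fasta-files mag_catalogue.fa.gz --methods count > counts.tsv
}
```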
- Make sure you have `conda`, `mamba` and `snakemake` installed:

  ```bash
  conda --version
  mamba --version
  snakemake --version
  ```
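To fail fast rather than discover a missing tool halfway through a run, a small guard can be useful. The `require` helper below is our own suggestion, not part of the repository:

```bash
# Hypothetical helper: report any missing prerequisite up front.
require() {
  command -v "$1" >/dev/null 2>&1 || { echo "missing: $1" >&2; return 1; }
}

if require conda && require mamba && require snakemake; then
  echo "all prerequisites found"
else
  echo "install the missing tools before continuing" >&2
fi
```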
- Clone this git repository and enter it:

  ```bash
  git clone https://github.com/3d-omics/mt_quant
  cd mt_quant
  ```
- Test your installation by running the pipeline with test data. It will download all the necessary software through conda / mamba. It should take less than five minutes:

  ```bash
  ./run
  ```
- Run it with your own data:

- Edit `config/samples.tsv` and add your sample names, a library identifier in case you have more than one file per sample, their paths, and the adapters used:

  ```tsv
  sample_id	library_id	forward_filename	reverse_filename	forward_adapter	reverse_adapter
  sample1	1	resources/reads/GBRF1.1_1.fq.gz	resources/reads/GBRF1.1_2.fq.gz	AGATCGGAAGAGCACACGTCTGAACTCCAGTCA	AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT
  sample2	1	resources/reads/GBRM1.1_1.fq.gz	resources/reads/GBRM1.1_2.fq.gz	AGATCGGAAGAGCACACGTCTGAACTCCAGTCA	AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT
  ```
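Hand-editing a TSV makes it easy to drop a field, so a quick sanity check before launching can save a failed run. The `check_samples` helper below is a suggestion of ours, not part of the workflow; it only verifies that every row has exactly six tab-separated fields:

```bash
# check_samples: verify that every line of a samples.tsv has exactly
# 6 tab-separated fields (hypothetical helper, not shipped with mt_quant).
check_samples() {
  awk -F'\t' '
    NF != 6 { printf "line %d: %d fields, expected 6\n", NR, NF; bad = 1 }
    END     { exit bad }' "$1"
}

# Example usage on a header-only file:
tmp=$(mktemp)
printf 'sample_id\tlibrary_id\tforward_filename\treverse_filename\tforward_adapter\treverse_adapter\n' > "$tmp"
check_samples "$tmp" && echo "samples.tsv looks well-formed"
rm -f "$tmp"
```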
- Edit `config/features.yml` with your reference hosts, MAGs and external databases. You can have multiple hosts and multiple catalogues. You can even have no host files in case you are analyzing environmental samples:

  ```yaml
  hosts:  # Comment the next lines if no host
    human:
      genome: resources/reference/chrX_sub.fa.gz
      gtf: resources/reference/chrX_sub.gtf.gz

  mag_catalogues:
    mag1: resources/reference/mags_mock.fa.gz
    # mag2: resources/reference/mags_mock.fa.gz

  databases:
    kraken2:  # Comment the next lines if no database
      mock: resources/databases/kraken2/kraken_mock
      # mock2: resources/databases/kraken2/kraken_mock
  ```
- Edit `config/params.yml` with the execution parameters. The defaults are reasonable.
- Run the pipeline:

  ```bash
  ./run -j8    # locally with 8 cpus
  ./run_slurm  # on a cluster with slurm
  ```
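Before a full run it can be worth checking which jobs would execute. Snakemake's standard `-n`/`--dry-run` flag does exactly that; whether the `./run` wrapper forwards extra flags to snakemake is an assumption on our part, so the direct invocation is shown:

```bash
# List the jobs Snakemake would run, without executing anything.
# Guarded so this is a harmless no-op when snakemake is not installed
# or no Snakefile is present in the current directory.
if command -v snakemake >/dev/null 2>&1; then
  snakemake --dry-run && status="dry run ok" || status="dry run failed (no Snakefile here?)"
else
  status="snakemake not installed; skipping dry run"
fi
echo "$status"
```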