A Snakemake workflow for assessing the detection limit of laser-microdissected samples.
Requirements
- Clone the repository and set it as the working directory:
git clone --recursive https://github.com/3d-omics/mg_quant.git
cd mg_quant
- Run the pipeline with the test data (downloading the required software takes 5 minutes):
snakemake \
--use-conda \
--conda-frontend mamba \
--jobs 8
Edit the following files:
- config/samples.tsv: the tab-separated control file with the sequencing libraries and their location.

    sample_id  library_id  forward_filename                 reverse_filename                 forward_adapter                    reverse_adapter
    sample1    lib1        resources/reads/sample1_1.fq.gz  resources/reads/sample1_2.fq.gz  AGATCGGAAGAGCACACGTCTGAACTCCAGTCA  AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT
    sample2    lib1        resources/reads/sample2_1.fq.gz  resources/reads/sample2_2.fq.gz  AGATCGGAAGAGCACACGTCTGAACTCCAGTCA  AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT

- config/features.yml: the references and databases against which to screen the libraries: hosts and MAG catalogues.

    references:  # Reads will be mapped sequentially
      human: resources/reference/human_22_sub.fa.gz
      chicken: resources/reference/chicken_39_sub.fa.gz
    mag_catalogues:
      mag1: resources/reference/mags_sub.fa.gz
      # mag2: resources/reference/mags_sub.fa.gz
    databases:
      kraken2:
        mock1: resources/databases/kraken2/kraken2_RefSeqV205_Complete_500GB
        # refseq500: resources/databases/kraken2/kraken2_RefSeqV205_Complete_500GB
      singlem: resources/databases/singlem/S3.2.1.GTDB_r214.metapackage_20231006.smpkg.zb

- config/params.yml: parameters for every program. The defaults are reasonable.
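
A malformed config/samples.tsv is a common cause of failures late in a run, so it can be worth checking the table before launching the pipeline. The following is a minimal sketch, not part of the workflow itself: the function name `validate_samples` is hypothetical, and the rules it enforces (the exact column order, unique sample_id/library_id pairs, adapters made only of A, C, G, T or N) are assumptions inferred from the example table rather than documented requirements.

```python
import csv
import io

# Column names taken from the example config/samples.tsv.
EXPECTED_COLUMNS = [
    "sample_id", "library_id",
    "forward_filename", "reverse_filename",
    "forward_adapter", "reverse_adapter",
]


def validate_samples(handle):
    """Return a list of problems found in a tab-separated samples table."""
    reader = csv.DictReader(handle, delimiter="\t")
    if reader.fieldnames != EXPECTED_COLUMNS:
        return [f"unexpected header: {reader.fieldnames}"]
    problems = []
    seen = set()
    # Data rows start on line 2 because line 1 is the header.
    for line_number, row in enumerate(reader, start=2):
        key = (row["sample_id"], row["library_id"])
        if key in seen:  # assumption: each sample/library pair appears once
            problems.append(f"line {line_number}: duplicated sample/library {key}")
        seen.add(key)
        for column in ("forward_adapter", "reverse_adapter"):
            value = row[column]
            # Assumption: adapters are plain DNA sequences.
            if not value or set(value.upper()) - set("ACGTN"):
                problems.append(f"line {line_number}: {column} is not a DNA sequence")
    return problems


# The two rows from the example table, joined with tabs.
EXAMPLE = (
    "sample_id\tlibrary_id\tforward_filename\treverse_filename\t"
    "forward_adapter\treverse_adapter\n"
    "sample1\tlib1\tresources/reads/sample1_1.fq.gz\tresources/reads/sample1_2.fq.gz\t"
    "AGATCGGAAGAGCACACGTCTGAACTCCAGTCA\tAGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT\n"
    "sample2\tlib1\tresources/reads/sample2_1.fq.gz\tresources/reads/sample2_2.fq.gz\t"
    "AGATCGGAAGAGCACACGTCTGAACTCCAGTCA\tAGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT\n"
)

print(validate_samples(io.StringIO(EXAMPLE)))  # an empty list means no problems
```

To check a real file, pass an open handle instead of the in-memory example, e.g. `validate_samples(open("config/samples.tsv", newline=""))`.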
Run the pipeline and go for a walk:
snakemake --use-conda --profile profile/default --jobs 100 --cores 24 `#--executor slurm`

The pipeline performs the following steps:
- Trim reads and remove adaptors with fastp
- Map sequentially to the human, chicken / pig and MAG catalogue references:
  - Map to the reference with bowtie2
  - Extract the reads that have one or both ends unmapped with samtools
  - Map those unmapped reads to the next reference
- Generate MAG-based statistics with coverm
- Generate MAG-independent statistics with singlem and nonpareil
- Taxonomically assign reads with kraken2
- Generate lots of reports in the reports/ folder
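
The sequential mapping step can be easier to follow as code. The sketch below is illustrative only and does not reproduce the workflow's rules: `mapping_chain` and `pair_has_unmapped_end` are hypothetical helper names, and the assumption that MAG catalogues follow the host references in the same chain is taken from the step list above. The flag test uses the standard SAM FLAG bits 0x4 (read unmapped) and 0x8 (mate unmapped) to express "one or both ends unmapped"; the exact samtools options the pipeline uses are not shown in this document.

```python
READ_UNMAPPED = 0x4  # SAM FLAG bit: this read is unmapped
MATE_UNMAPPED = 0x8  # SAM FLAG bit: the mate of this read is unmapped


def pair_has_unmapped_end(flag):
    """True when the read or its mate is unmapped (one or both ends)."""
    return bool(flag & (READ_UNMAPPED | MATE_UNMAPPED))


def mapping_chain(references, first_input="fastp"):
    """Pair each reference with the source of the reads mapped against it.

    The first reference receives the trimmed reads; every later reference
    receives the reads left unmapped by the previous one.
    """
    chain = []
    source = first_input
    for name in references:
        chain.append((name, source))
        source = f"unmapped from {name}"
    return chain


# Order as in the example config: hosts first, then the MAG catalogue (assumed).
for name, source in mapping_chain(["human", "chicken", "mag1"]):
    print(f"{name}: reads from {source}")

# FLAG 99 = paired + proper pair + mate reverse + first in pair: both ends mapped.
# FLAG 73 = paired + mate unmapped + first in pair: one end unmapped.
# FLAG 77 = paired + read unmapped + mate unmapped + first in pair: both unmapped.
for flag in (99, 73, 77):
    print(flag, pair_has_unmapped_end(flag))
```

Pairs with both ends mapped to a host (FLAG 99 here) are left out of the next round, so only reads not fully explained by earlier references reach the MAG catalogue.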