Snakemake workflow: `mg_quant`

A Snakemake workflow for assessing the detection limit from laser-microdissected samples.

Usage

Requirements
1. miniconda / mamba
2. snakemake

Clone the repository, and set it as the working directory:

git clone --recursive https://github.com/3d-omics/mg_quant.git
cd mg_quant

Run the pipeline with the test data (it takes about 5 minutes to download the required software):

snakemake \
    --use-conda \
    --conda-frontend mamba \
    --jobs 8

Edit the following files:

config/samples.tsv: the control file with the sequencing libraries and their location.

sample_id	library_id	forward_filename	reverse_filename	forward_adapter	reverse_adapter
sample1	lib1	resources/reads/sample1_1.fq.gz	resources/reads/sample1_2.fq.gz	AGATCGGAAGAGCACACGTCTGAACTCCAGTCA	AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT
sample2	lib1	resources/reads/sample2_1.fq.gz	resources/reads/sample2_2.fq.gz	AGATCGGAAGAGCACACGTCTGAACTCCAGTCA	AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT
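
Before launching a long run it can help to sanity-check the control file. The sketch below is not part of the workflow: `check_samples` is a hypothetical helper, the column names are copied from the example header above, and the rule that each `sample_id` / `library_id` pair appears only once is an assumption rather than something the workflow is known to enforce.

```python
import csv
import io

# Columns copied from the header of the example config/samples.tsv
EXPECTED_COLUMNS = [
    "sample_id", "library_id", "forward_filename",
    "reverse_filename", "forward_adapter", "reverse_adapter",
]


def check_samples(handle):
    """Return a list of problems found in a samples.tsv-like file handle."""
    reader = csv.DictReader(handle, delimiter="\t")
    problems = []
    if reader.fieldnames != EXPECTED_COLUMNS:
        problems.append(f"unexpected header: {reader.fieldnames}")
        return problems
    seen = set()
    for line_number, row in enumerate(reader, start=2):
        # Assumption: a sample_id / library_id pair should be unique
        key = (row["sample_id"], row["library_id"])
        if key in seen:
            problems.append(f"line {line_number}: duplicated {key}")
        seen.add(key)
        # Missing trailing fields are read as None, empty ones as ""
        for column in EXPECTED_COLUMNS:
            if not row[column]:
                problems.append(f"line {line_number}: empty {column}")
    return problems


example = (
    "sample_id\tlibrary_id\tforward_filename\treverse_filename\t"
    "forward_adapter\treverse_adapter\n"
    "sample1\tlib1\tresources/reads/sample1_1.fq.gz\t"
    "resources/reads/sample1_2.fq.gz\t"
    "AGATCGGAAGAGCACACGTCTGAACTCCAGTCA\tAGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT\n"
    "sample2\tlib1\tresources/reads/sample2_1.fq.gz\t"
    "resources/reads/sample2_2.fq.gz\t"
    "AGATCGGAAGAGCACACGTCTGAACTCCAGTCA\tAGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT\n"
)
print(check_samples(io.StringIO(example)))
```

To check a real file, pass an open handle instead, for example `check_samples(open("config/samples.tsv", newline=""))`.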

config/features.yml: the references and databases against which to screen the libraries: hosts and MAG catalogues.

references:  # Reads will be mapped sequentially
   human: resources/reference/human_22_sub.fa.gz
   chicken: resources/reference/chicken_39_sub.fa.gz

mag_catalogues:
   mag1: resources/reference/mags_sub.fa.gz
   # mag2: resources/reference/mags_sub.fa.gz

databases:
   kraken2:
      mock1: resources/databases/kraken2/kraken2_RefSeqV205_Complete_500GB
      # refseq500: resources/databases/kraken2/kraken2_RefSeqV205_Complete_500GB
   singlem: resources/databases/singlem/S3.2.1.GTDB_r214.metapackage_20231006.smpkg.zb
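
Because the comment on `references` says reads are mapped sequentially, the order of the entries matters. The sketch below illustrates that ordering with a plain dictionary mirroring the example above (no YAML parser is used). `mapping_plan` is a hypothetical helper, and the idea that every MAG catalogue receives the reads left over after the last host is an assumption, not a documented behaviour of the workflow.

```python
# Plain-dict mirror of the example config/features.yml
features = {
    "references": {
        "human": "resources/reference/human_22_sub.fa.gz",
        "chicken": "resources/reference/chicken_39_sub.fa.gz",
    },
    "mag_catalogues": {
        "mag1": "resources/reference/mags_sub.fa.gz",
    },
}


def mapping_plan(features):
    """List (target, source_of_reads) pairs in the order they would be mapped."""
    plan = []
    source = "trimmed reads"
    # Python dicts keep insertion order, so hosts are visited as written
    for host in features["references"]:
        plan.append((host, source))
        source = f"reads unmapped to {host}"
    # Assumption: each MAG catalogue gets the reads left after the last host
    for catalogue in features["mag_catalogues"]:
        plan.append((catalogue, source))
    return plan


for target, source in mapping_plan(features):
    print(f"{target} <- {source}")
```

Swapping the order of `human` and `chicken` in the dictionary changes which host sees the reads first, which is the practical consequence of the sequential mapping.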

config/params.yml: parameters for every program. The defaults are reasonable.

Run the pipeline and go for a walk:

snakemake --use-conda --profile profile/default --jobs 100 --cores 24 `#--executor slurm`

Rulegraph

Brief description

Trim reads and remove adaptors with fastp
Map to human, chicken / pig, and the MAG catalogue:
1. Map to the reference with bowtie2
2. Extract the reads that have one or both ends unmapped with samtools
3. Map those unmapped reads to the next reference
Generate MAG-based statistics with coverm
Generate MAG-independent statistics with singlem and nonpareil
Assign reads taxonomically with kraken2
Generate lots of reports in the reports/ folder
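
The extraction step in the mapping loop above keeps read pairs in which one or both ends failed to map. How the workflow calls samtools is not shown here. As a minimal sketch of the underlying test, the SAM specification defines FLAG bit `0x4` as "read unmapped" and `0x8` as "mate unmapped", so a record belongs to such a pair when either bit is set. The function name `has_unmapped_end` is hypothetical.

```python
# Bit values defined by the SAM specification
READ_UNMAPPED = 0x4
MATE_UNMAPPED = 0x8


def has_unmapped_end(flag):
    """True if the read or its mate is unmapped (one or both ends)."""
    return bool(flag & (READ_UNMAPPED | MATE_UNMAPPED))


# 99: properly paired, both ends mapped
# 73: paired, mate unmapped
# 133: paired, this read unmapped
# 77: paired, both ends unmapped
for flag in (99, 73, 133, 77):
    print(flag, has_unmapped_end(flag))
```

Only flag 99 is rejected. The other three records would be carried forward to the next reference.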
