platform-compare

Comparison of sequencing platforms with a control dataset.

Sequencing platforms

This repository contains scripts to compare different sequencing platforms using a dataset that contains known compositions of synthetic SARS-CoV-2 viral control mixtures. The dataset is described more fully in our paper A Gold Standard Dataset and Evaluation of Methods for Lineage Abundance Estimation from Wastewater.

Brand	Platform	Primer scheme
ONT	PromethION	ARTIC V4.1
Illumina	NextSeq 2000	ARTIC V3
Complete Genomics	G99	ARTIC V3

About this repository

In the platforms/ directory, you will find subdirectories for each platform, each containing (locally) the BAMs generated for each dataset along with the scripts used to organize them in preparation for downstream analysis. These are in directories like platforms/<platform>/<background>/alignments The actual alignment files are too large to share here but instructions for how to download them can be found in the README.md files for each platform.

The plots/ directory contains any scripts (mostly jupyter notebooks) used to generate the plots used in understanding the similarities and differences between platforms.

All scripts (and relative paths) used here are expected to be run directly from within the root of this repository. Any jupyter notebook is run from the directory where it is stored.

Gathering data

Linking to BAMs

Filtered alignments were . These files were originally produced using variations of our covid-analysis pipeline currently found on our cluster at /projects/enviro_lab/scripts/covid-analysis. The exact nextflow workflow script within that directory can be determined by the table below:

analysis name	platform	paired?	aligner	script
ont	PromethION	No	minimap2	analyzeReads.nf
cg	G99	No	minimap2	analyzeReads_cg.nf
illumina_ss	MiSeq2000	No	minimap2	analyzeReads_illumina.nf
illumina_minimap2	MiSeq2000	Yes	minimap2	analyzeReads_illumina_ds.nf
illumina_bwa	MiSeq2000	Yes	bwa	analyzeReads_illumina_ds.nf

Those BAMs were then linked in thier appropriate locations by running arrange_files.sh. Note that this script uses absolute paths specific to our directory structure and may only act as a guide.

Running Freyja (necessary for plots produced in plots/abundance_analysis)

We're comparing all the results using freyja's aggregated output. This can be created by running run_freyja_multi.sh. This works best by running it in two steps:

Uncomment the code block marked (FIRST RUN) within run_freyja_multi.sh.
Comment out the section used before and uncomment the code block marked (SECOND RUN).

Running mosdepth (necessary for plots produced in plots/coverage_analysis)

To get coverage details for each sample, run get_coverage_multi.sh. This requires the conda env conda/env-mosdepth or the mosdepth executable to be in your path.

Citation

We don't yet have a citation for this repository, but that will be forthcoming. In the meantime, check out our deconvolution paper to learn more about our general process. What we did there just on ONT has been expanded here to the other sequencing platforms.

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
expected_abundances		expected_abundances
freyja		freyja
platforms		platforms
plots		plots
software		software
.gitattributes		.gitattributes
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
prepare_envs.sh		prepare_envs.sh

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

platform-compare

Sequencing platforms

About this repository

Gathering data

Linking to BAMs

Running Freyja (necessary for plots produced in plots/abundance_analysis)

Running mosdepth (necessary for plots produced in plots/coverage_analysis)

Citation

About

Releases

Packages

Languages

License

enviro-lab/platform-compare

Folders and files

Latest commit

History

Repository files navigation

platform-compare

Sequencing platforms

About this repository

Gathering data

Linking to BAMs

Running Freyja (necessary for plots produced in plots/abundance_analysis)

Running mosdepth (necessary for plots produced in plots/coverage_analysis)

Citation

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages