TheNoyesLab / SNPCall_Benchmarking Public

Repo for scripts and files used in benchmarking variant callers

SNPCall_Benchmarking

This repository benchmarks variant callers on simulated shotgun metagenomic data. It implements a bioinformatic pipeline that runs from read synthesis through alignment and variant calling.

Methods & Workflow

Figure 1. Workflow diagram showing the variant caller benchmarking process. First, selected RefSeq genomes were used to simulate a metagenome, and random mutations were added to the genomes to create a "gold standard" dataset. Then the number of genomes and the number of reads were varied to evaluate the variant callers under a range of sample conditions.

Results

Important Directories

projects/SNP_Call_Benchmarking/Benchmarking_Run:

Directory containing synthetic reads, variant caller output, and benchmarks
All output of Benchmarking Workflow goes here

SNPCall_Benchmarking/Workflow:

Production-stage scripts that form the core Benchmarking workflow.

Genesis.sh

Script performing directory setup, SNP generation, read synthesis, and alignment

MultiCaller.sh

Script running variant callers and benchmarking on data generated by Genesis.sh
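The benchmarking step compares each caller's output against the injected "gold standard" SNPs. The sketch below shows one way such a comparison could be scored. It is an illustration only: the function name `score_calls`, the `(contig, position, alt_base)` tuple representation, and the choice of precision, recall, and F1 as metrics are assumptions, not taken from MultiCaller.sh.

```python
def score_calls(truth, calls):
    """Score called SNPs against a truth set.

    Both arguments are iterables of (contig, position, alt_base) tuples.
    This representation is hypothetical; real caller output (e.g. VCF)
    would need to be parsed into it first.
    """
    truth, calls = set(truth), set(calls)
    tp = len(truth & calls)   # injected SNPs that were called
    fp = len(calls - truth)   # calls with no matching injected SNP
    fn = len(truth - calls)   # injected SNPs that were missed

    # Guard against empty inputs so the ratios are always defined.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall, "f1": f1}
```

Matching on exact contig, position, and alternate base is the simplest possible rule; it treats a call at the right position with the wrong base as both a false positive and a false negative.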

Single_scripts/

Core workflow broken up into individual scripts (read generation, alignment, individual variant callers, etc.)

SNP_Injector_Fasta.py

Python script used in Genesis.sh to "inject" SNPs into FASTA files
Creates a log of the injected SNPs and their genome locations
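To make the idea of SNP injection concrete, here is a minimal sketch of substituting random bases in a FASTA sequence while logging each change. It is not the code in SNP_Injector_Fasta.py: the function names `parse_fasta` and `inject_snps`, the uniform random choice of positions and alternate bases, and the log layout are all assumptions made for illustration.

```python
import random

BASES = "ACGT"


def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def inject_snps(sequence, n_snps, rng):
    """Return (mutated_sequence, log) with n_snps random substitutions.

    log is a list of (position, ref_base, alt_base) tuples with 1-based
    positions, sorted by position. The sequence is upper-cased first, so
    any soft-masking (lower-case bases) is lost in this sketch.
    """
    seq = list(sequence.upper())
    # Only mutate unambiguous bases so every SNP has a defined ref allele.
    candidates = [i for i, base in enumerate(seq) if base in BASES]
    if n_snps > len(candidates):
        raise ValueError("more SNPs requested than mutable positions")
    log = []
    for i in sorted(rng.sample(candidates, n_snps)):
        ref = seq[i]
        alt = rng.choice([b for b in BASES if b != ref])
        seq[i] = alt
        log.append((i + 1, ref, alt))
    return "".join(seq), log


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so a run can be reproduced
    fasta = ">contig1\nACGTACGTAC\nGGTTAACC\n"
    for name, seq in parse_fasta(fasta):
        mutated, log = inject_snps(seq, 3, rng)
        for pos, ref, alt in log:
            print(name, pos, ref, alt, sep="\t")
```

Passing in a seeded `random.Random` instance keeps the mutated genomes and the SNP log reproducible, which matters when the log serves as the truth set for later comparison.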
