A Snakemake workflow for Short Variant Discovery in Host Genomes
- Test that it works:
  - Make sure you have installed snakemake, samtools and bcftools. Either
    - install them with conda/mamba: `conda install -c bioconda samtools bcftools`
    - or create an environment (`conda create -n 3dohg -c bioconda snakemake samtools bcftools`) and activate it (`conda activate 3dohg`)
  - Generate mock data with `bash workflow/scripts/generate_mock_data.sh`
  - Run the pipeline: `snakemake --use-conda --jobs 8 all`. It will download all the necessary software through conda and should take less than 5 minutes.
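
Before running the test, it can help to confirm that the three tools are actually on your `PATH`. The snippet below is a minimal sketch, not part of the workflow itself; the function name `missing_tools` is a hypothetical helper introduced here for illustration.

```shell
# Print the names of the given commands that are NOT found on PATH
# (hypothetical helper, not shipped with this repository).
missing_tools() {
    missing=""
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            missing="$missing $tool"
        fi
    done
    # Strip the leading space left by the concatenation above
    echo "${missing# }"
}

missing=$(missing_tools snakemake samtools bcftools)
if [ -n "$missing" ]; then
    echo "Missing tools: $missing"
else
    echo "All required tools found"
fi
```

If anything is reported missing, install it with one of the conda/mamba options above, or activate the `3dohg` environment first.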
- Run it with your own data:
  - Edit `config/samples.tsv` to add your samples and their locations.
  - Edit `config/features.tsv` with information about the reference you are using.
  - Run the pipeline: `snakemake --use-conda --jobs 8 all`
  - (slurm users): `./run_slurm`
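
After editing the two config files, a dry run is a cheap way to check that Snakemake can resolve all jobs before committing compute time. The following is a sketch under the assumption that you run it from the repository root; `check_config` is a hypothetical helper, not something this workflow provides. `--dry-run` is a standard Snakemake flag that lists the planned jobs without executing them.

```shell
# Return 0 only if every given file exists and is non-empty
# (hypothetical helper, not shipped with this repository).
check_config() {
    for f in "$@"; do
        if [ ! -s "$f" ]; then
            echo "Missing or empty: $f" >&2
            return 1
        fi
    done
}

if check_config config/samples.tsv config/features.tsv; then
    # --dry-run prints the jobs that would run, without running them
    snakemake --use-conda --jobs 8 --dry-run all
fi
```

If the dry run lists the jobs you expect, rerun the same command without `--dry-run` to start the pipeline.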

- FASTQ processing with `fastp`
- Mapping with `bowtie2`
- SAM/BAM/CRAM processing with `samtools` and `picard`
- Sample swap detection with `gtcheck`
- SNP calling with `GATK4`
- SNP annotation with `SNPEff`
TBA