Germline DNA‐Seq Pipeline
module load perl
perl /path/to/pughlab_dnaseq_germline_pipeline.pl \
-t /path/to/germline_pipeline_config.yaml \
-d /path/to/germline_data_config.yaml \
--preprocessing \
--qc \
--variant_calling \
--summarize \
--create_report \
-c slurm \
--remove \
--dry-run # optional
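The invocation above can also be assembled programmatically, which helps when only some steps are wanted. The following is a minimal sketch, not part of the pipeline: `build_command` is a hypothetical helper, and it assumes the step flags can be passed independently of one another. Only flags that appear in the command above are used.

```python
import shlex

# Step flags exactly as they appear in the command above, in pipeline order.
STEP_FLAGS = ["--preprocessing", "--qc", "--variant_calling",
              "--summarize", "--create_report"]

def build_command(tool_config, data_config, steps,
                  cluster="slurm", remove=False, dry_run=True):
    """Return the argument list for one invocation (hypothetical helper)."""
    unknown = [s for s in steps if s not in STEP_FLAGS]
    if unknown:
        raise ValueError(f"unknown step flag(s): {unknown}")
    cmd = ["perl", "/path/to/pughlab_dnaseq_germline_pipeline.pl",
           "-t", tool_config, "-d", data_config]
    # Emit steps in pipeline order regardless of the order they were given in.
    cmd += [flag for flag in STEP_FLAGS if flag in steps]
    cmd += ["-c", cluster]
    if remove:
        cmd.append("--remove")
    if dry_run:
        cmd.append("--dry-run")
    return cmd

cmd = build_command("/path/to/germline_pipeline_config.yaml",
                    "/path/to/germline_data_config.yaml",
                    steps=["--variant_calling", "--qc"])
print(shlex.join(cmd))  # inspect the command before running it
```

The sketch defaults to `--dry-run` so that the assembled command can be checked before any jobs are submitted.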
PROJECT
├── logs
├── TOOL1
│   ├── PATIENT-01
│   │   ├── SAMPLE-01-A
│   │   └── SAMPLE-01-B
│   └── PATIENT-02
│       └── SAMPLE-02-A
├── TOOL2
│   ├── PATIENT-01
│   │   ├── SAMPLE-01-A
│   │   └── SAMPLE-01-B
│   └── PATIENT-02
│       └── SAMPLE-02-A
└── TOOL3
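To locate outputs for a given sample, the layout above can be enumerated as TOOL/PATIENT/SAMPLE paths. This is a rough sketch under the assumption that a tool nests its outputs per patient and per sample as shown for TOOL1 and TOOL2; TOOL3 is shown without subdirectories, so not every tool necessarily follows this pattern. The helper name `expected_dirs` and the `cohort` mapping are illustrative only.

```python
from pathlib import PurePosixPath

def expected_dirs(project, tools, cohort):
    """List the logs directory plus one TOOL/PATIENT/SAMPLE path per sample."""
    root = PurePosixPath(project)
    dirs = [root / "logs"]
    for tool in tools:
        for patient, samples in cohort.items():
            for sample in samples:
                dirs.append(root / tool / patient / sample)
    return dirs

# Patient and sample names taken from the example tree above.
cohort = {
    "PATIENT-01": ["SAMPLE-01-A", "SAMPLE-01-B"],
    "PATIENT-02": ["SAMPLE-02-A"],
}
dirs = expected_dirs("PROJECT", ["TOOL1", "TOOL2"], cohort)
for d in dirs:
    print(d)
```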
- Preprocessing: The preprocessing step runs FastQC, BWA-MEM alignment, and GATK's indel realignment and BQSR functions. This step expects the data config to list and describe the FASTQ files.
- QC: The QC step runs various Picard functions (sequencing artefacts, insert size, alignment summary, etc.) and GATK's DepthOfCoverage, and obtains an estimate of callable bases. This step expects the data config to list GATK-processed BAM files.
- Variant-Calling: The variant-calling step runs all of the variant-calling tools (SNV/INDEL/CNV/SV) requested in the pipeline tool config YAML file. This step expects the data config to list GATK-processed BAM files.
- Germline SNV/INDEL detection:
  - HaplotypeCaller (per-BAM)
  - GenotypeGVCFs + VQSR (per-cohort)
  - CPSR (per patient)
- Germline CNV detection:
  - GATK's gCNV pipeline (cohort-level with outputs for each sample)
  - ERDS gCNV pipeline (uses output from HaplotypeCaller; use with care, as we have not thoroughly validated it)
  - Delly germline SVs (extracts copy-number from the DEL/DUP calls)
- Germline SV detection:
  - Delly germline SVs (includes DEL/DUP/INV/TRA/INS)
  - Manta germline SVs
  - SViCT (targeted panel only)
  - MAVIS (combines and validates calls from the above tools)
- Other SNV/INDEL detection tools are run only if requested, in order to produce tool-specific panels of normals:
  - MuTect (v1; will be run in artefact detection mode)
  - MuTect (v2; will be run in artefact detection mode)
  - Strelka (will be run in germline mode)
  - VarScan (will be run in tumour-only mode)
  - VarDict (will be run in tumour-only mode)
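As an illustration of the Delly entry under CNV detection, which takes copy-number from the DEL/DUP calls, the idea can be sketched as a filter on the `SVTYPE` key of the VCF INFO column. This is not the pipeline's own code, and how the pipeline performs this step is not described here. The function names are hypothetical and the three records are made-up toy data.

```python
def svtype(vcf_line):
    """Return the SVTYPE value from the INFO column of a VCF record, or None."""
    info = vcf_line.rstrip("\n").split("\t")[7]  # INFO is the 8th VCF column
    for field in info.split(";"):
        if field.startswith("SVTYPE="):
            return field.split("=", 1)[1]
    return None

def copy_number_candidates(vcf_lines):
    """Keep only deletion and duplication records, skipping header lines."""
    return [line for line in vcf_lines
            if not line.startswith("#") and svtype(line) in {"DEL", "DUP"}]

# Toy records for illustration only (not real calls).
records = [
    "chr1\t1000\tDEL0001\tN\t<DEL>\t.\tPASS\tSVTYPE=DEL;END=5000",
    "chr1\t8000\tINV0001\tN\t<INV>\t.\tPASS\tSVTYPE=INV;END=9000",
    "chr2\t2000\tDUP0001\tN\t<DUP>\t.\tPASS\tSVTYPE=DUP;END=7000",
]
kept = copy_number_candidates(records)
print([svtype(r) for r in kept])
```

Inversions, translocations and insertions are dropped here because they do not by themselves describe a gain or loss of copies.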