This Nextflow (v25.10.0) workflow implements a full long-read somatic variant analysis pipeline.
It supports tumor–normal and tumor-only configurations, running on Biowulf (Slurm scheduler).
Currently, the pipeline includes the following steps:
- Alignment with minimap2
- Small variant calling with Clair3
- Phasing with longphase
- Somatic SV calling with Severus
- Copy-number analysis (CNA) with Wakhan
- Somatic SNV calling with DeepSomatic
- Methylation analysis with modkit
module load nextflow/25.10.0
nextflow run tumorNormalONT.nf --tumor_reads tumor.bam --normal_reads normal.bam --reference hg38.fasta --vntr vntr.bed --clair3_model clair3_models/ont --cpgs cpgs.bedmodule load nextflow/25.10.0
nextflow run tumorOnlyONT.nf --reads tumor.bam --reference hg38.fasta --vntr vntr.bed --sv_pon PoN_1000G_hg38.tsv.gz --clair3_model clair3_models/ont --cpgs cpgs.bedTip: Always run inside an interactive session or with
sbatch— not on the Biowulf head node.
--normal_reads Path to normal BAM file (must be indexed)
--tumor_reads Path to tumor BAM file (must be indexed)
--reads Path to tumor BAM file (must be indexed)
--sv_pon Panel of Normals file (e.g. ./annot/PoN_1000G_hg38_extended.tsv.gz)
--reference Reference FASTA
--vntr BED file of tandem repeats (must be ordered)(e.g. ./annot/human_GRCh38_no_alt_analysis_set.trf.bed)
--clair3_model Path to Clair3 model
--cpgs CpG island BED file required in sv_cna_dmr and all modes (e.g. ./annot/hg38_cpg_cleaned.bed)
--mode sv_cna Run only SV and CNA calling
sv_cna_dmr Run SV, CNA, and DMR calling
all Run SV, CNA, DMR, and somatic SNV calling (default)
--cosmic Path to COSMIC genes in tsv format for Wakhan visualization
Pre-existing bam alignment, use:
--alignment false --bam BAM --bai BAI
instead of --reads
- Input reads: Must be in unmapped BAM format.
- Multiple files can be passed as a space-separated string wrapped in single quotes:
--reads_tumor 'BAM1 BAM2' - Wildcards are supported:
--reads_tumor 'BAM_DIR/*bam'
- Multiple files can be passed as a space-separated string wrapped in single quotes:
- Multiple runs: Each Nextflow run must be launched from a unique working directory.
Nextflow creates awork/folder and.nextflow.loginside the run directory. - Resuming after failure: Use the
-resumeflag (note single dash).
The workflow will attempt to reuse existing results.
Resume must be launched from the same working directory. - Output directory change By default, final outputs will be available in
lumos_outdirectory. You can change it by adding-output-dir XXXargument (note single dash) - Nexflow version Note that the workflow has been tested with Nextflow v25.10.0, and may not run with other versions due to recent synthax changes.
- To run locally, comment out
in
process.executor = 'slurm'
nextflow.config. - Job status looks like:
Outputs are in the corresponding working directory under
[11/4852c3] tumorOnlyOntWorkflow:alignMinimap2 (1) [ 0%] 0 of 1work/11/4852c3.... - Useful files inside a job’s work directory:
.command.sh→ exact command executed.command.out→ stdout.command.err→ stderr
- Each process runs inside a Singularity/Docker container.
Custom container build scripts are available in thedocker/folder.
- New to Nextflow? Start with the excellent tutorial:
👉 https://training.nextflow.io/
Notes
- Tumor-only: omit
--reads_normal; provide--sv_ponfor Severus. - Modes:
--mode sv_cnaruns Severus + Wakhan.--mode sv_cna_dmradds DMR from modkit.--mode allruns all modules including DeepSomatic.
- Consistency: tumor and normal BAMs are haplotagged by the same method to ensure consistent phasing.