11 changes: 11 additions & 0 deletions config/day_profiles/local/templates/rule_config.yaml
@@ -196,6 +196,17 @@ deepsomatic:
    numa: " OMP_NUM_THREADS=8 OMP_PROC_BIND=close OMP_PLACES=threads OMP_PROC_BIND=TRUE OMP_DYNAMIC=TRUE OMP_MAX_ACTIVE_LEVELS=1 OMP_SCHEDULE=dynamic OMP_WAIT_POLICY=ACTIVE "
    dvsom_conda: "../envs/vanilla_v0.1.yaml"

neusomatic:
    threads: 8
    env_yaml: "../envs/vanilla_v0.1.yaml"
    container: "docker://bioinform/neusomatic:0.2.1"
    partition: "i8"
    mem_mb: 60000
    hg38_neusom_chrms: "21,22"
    hg38_broad_neusom_chrms: "21,22"
    b37_neusom_chrms: "21,22"
    numa: ""

duphold:
    threads: 7
    env_yaml: "../envs/duphold_v0.1.yaml"
11 changes: 11 additions & 0 deletions config/day_profiles/slurm/templates/rule_config.yaml
@@ -195,6 +195,17 @@ deepsomatic:
    numa: " OMP_THREADS=64 OMP_PROC_BIND=close OMP_PLACES=threads OMP_DYNAMIC=true OMP_MAX_ACTIVE_LEVELS=1 OMP_SCHEDULE=dynamic OMP_WAIT_POLICY=ACTIVE "
    dvsom_conda: "../envs/vanilla_v0.1.yaml"

neusomatic:
    threads: 42
    env_yaml: "../envs/vanilla_v0.1.yaml"
    container: "docker://bioinform/neusomatic:0.2.1"
    partition: "i192,i128,i192mem"
    mem_mb: 85000
    hg38_neusom_chrms: "1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24"
    hg38_broad_neusom_chrms: "1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24"
    b37_neusom_chrms: "1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24"
    numa: ""



duphold:
1 change: 1 addition & 0 deletions workflow/Snakefile
@@ -265,6 +265,7 @@ include: "rules/deepvariant_1_5.smk"
include: "rules/deepvariant_1_9.smk"
include: "rules/deepvariant_ug.smk"
include: "rules/deepsomatic.smk"
include: "rules/neusomatic.smk"
include: "rules/doppel_mrkdups.smk"
include: "rules/duphold.smk"
include: "rules/dysgu_sv.smk"
96 changes: 96 additions & 0 deletions workflow/rules/neusomatic.smk
@@ -0,0 +1,96 @@
import os

##### neusomatic
# ---------------------------


def get_neusom_ensemble_callers(wildcards):
    vcfs = []
    base = MDIR + f"{wildcards.sample}/align/{wildcards.alnr}/snv"
    mapping = {
        "mutect2": f"{base}/mutect2/{wildcards.sample}.{wildcards.alnr}.mutect2.vcf",
        "strelka2": f"{base}/strelka2/{wildcards.sample}.{wildcards.alnr}.strelka2.vcf",
        "vardict": f"{base}/vardict/{wildcards.sample}.{wildcards.alnr}.vardict.vcf",
        "varscan2": f"{base}/varscan2/{wildcards.sample}.{wildcards.alnr}.varscan2.vcf",
    }
    for caller in ["mutect2", "strelka2", "vardict", "varscan2"]:
Copilot AI Sep 9, 2025

The hardcoded list of callers is duplicated from the mapping dictionary keys. Consider extracting this to avoid duplication: for caller in mapping.keys():

Suggested change
-    for caller in ["mutect2", "strelka2", "vardict", "varscan2"]:
+    for caller in mapping.keys():

        if caller in somatic_snv_CALLERS:
            vcfs.append(mapping[caller])
    return vcfs
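
The refactor suggested in the Copilot comment above can be sketched as a standalone function. This is only an illustration, not the PR's code: `ensemble_caller_vcfs`, `base`, and `enabled_callers` are hypothetical names standing in for the `MDIR`-derived path and the workflow's `somatic_snv_CALLERS`, and plain `sample`/`alnr` strings replace the Snakemake wildcards object.

```python
def ensemble_caller_vcfs(base, sample, alnr, enabled_callers):
    """Return the VCF paths of the enabled callers, in a fixed caller order."""
    # The dict is the single source of truth: its insertion order defines
    # the caller order, so no separate hardcoded list is needed.
    mapping = {
        caller: f"{base}/{caller}/{sample}.{alnr}.{caller}.vcf"
        for caller in ("mutect2", "strelka2", "vardict", "varscan2")
    }
    return [path for caller, path in mapping.items() if caller in enabled_callers]


# Hypothetical sample and aligner names, for illustration only.
vcfs = ensemble_caller_vcfs("results/S1/align/bwa/snv", "S1", "bwa", {"vardict", "mutect2"})
print(vcfs)
```

Iterating over the mapping itself keeps the caller names in one place, so adding a caller only requires touching the dictionary.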


rule neusomatic:
    wildcard_constraints:
        sample=TUMORS_REGEX
    input:
        tumor_cram=get_somcall_tumor_cram,
        tumor_crai=get_somcall_tumor_crai,
        normal_cram=get_somcall_normal_cram,
        normal_crai=get_somcall_normal_crai,
        ref_fa=lambda wc: config["supporting_files"]["files"]["huref"]["fasta"]["name"],
        ref_fai=lambda wc: config["supporting_files"]["files"]["huref"]["fasta"]["name"] + ".fai",
    output:
        vcf=MDIR + "{sample}/align/{alnr}/snv/neusomatic/{sample}.{alnr}.neusomatic.snv.vcf",
    log:
        MDIR + "{sample}/align/{alnr}/snv/neusomatic/log/{sample}.{alnr}.neusomatic.snv.log",
    threads: config['neusomatic']['threads']
    container:
        config['neusomatic']['container']
    resources:
        vcpu=config['neusomatic']['threads'],
        threads=config['neusomatic']['threads'],
        partition=config['neusomatic']['partition'],
        mem_mb=config['neusomatic']['mem_mb'],
    params:
        cluster_sample=ret_sample,
        numa=config['neusomatic']['numa'],
    shell:
        r"""
        set -euo pipefail
        {params.numa} neusomatic.py call \
            --output {output.vcf} \
            --tumor {input.tumor_cram} \
            --normal {input.normal_cram} \
            --ref {input.ref_fa} \
            --threads {threads} >> {log} 2>&1
Comment on lines +32 to +55

[P1] Ensure output/log directories exist in neusomatic rule

The neusomatic rule writes its VCF and log under snv/neusomatic/... but never creates those directories before redirecting stdout/stderr. Snakemake does not auto-create parent directories for file outputs, so the shell command will fail with “No such file or directory” before neusomatic.py runs on a fresh sample. Other rules in this workflow explicitly mkdir -p $(dirname {output}) to avoid this. Consider creating the neusomatic and neusomatic/log directories prior to invoking the tool.
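
The guard described above amounts to `mkdir -p $(dirname ...)` for the VCF and the log. A minimal Python sketch of the same idea follows; `ensure_parent_dirs` is a hypothetical helper, not part of this workflow, and the paths in the usage lines are made up. Snakemake generally creates the parent directories of declared `output` and `log` files itself before a job starts, so it is worth confirming on a fresh sample whether the guard is actually needed.

```python
import tempfile
from pathlib import Path


def ensure_parent_dirs(*paths):
    """Create the parent directory of each path, like `mkdir -p $(dirname p)`."""
    created = []
    for p in paths:
        parent = Path(p).parent
        # parents=True builds intermediate directories; exist_ok=True makes
        # the call a no-op when the directory is already there.
        parent.mkdir(parents=True, exist_ok=True)
        created.append(parent)
    return created


# Usage with throwaway paths under a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    vcf = Path(tmp) / "snv/neusomatic/S1.bwa.neusomatic.snv.vcf"
    log = Path(tmp) / "snv/neusomatic/log/S1.bwa.neusomatic.snv.log"
    ensure_parent_dirs(vcf, log)
    print(vcf.parent.is_dir(), log.parent.is_dir())
```

The helper only creates directories; it never touches the files themselves, so it is safe to call before every run.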

        """


rule neusomatic_ensemble:
    wildcard_constraints:
        sample=TUMORS_REGEX
    input:
        tumor_cram=get_somcall_tumor_cram,
        tumor_crai=get_somcall_tumor_crai,
        normal_cram=get_somcall_normal_cram,
        normal_crai=get_somcall_normal_crai,
        ref_fa=lambda wc: config["supporting_files"]["files"]["huref"]["fasta"]["name"],
        ref_fai=lambda wc: config["supporting_files"]["files"]["huref"]["fasta"]["name"] + ".fai",
        callers=get_neusom_ensemble_callers,
    output:
        vcf=MDIR + "{sample}/align/{alnr}/snv/neusomatic/{sample}.{alnr}.neusomatic_ensemble.snv.vcf",
    log:
        MDIR + "{sample}/align/{alnr}/snv/neusomatic/log/{sample}.{alnr}.neusomatic_ensemble.snv.log",
    threads: config['neusomatic']['threads']
    container:
        config['neusomatic']['container']
    resources:
        vcpu=config['neusomatic']['threads'],
        threads=config['neusomatic']['threads'],
        partition=config['neusomatic']['partition'],
        mem_mb=config['neusomatic']['mem_mb'],
    params:
        cluster_sample=ret_sample,
        numa=config['neusomatic']['numa'],
        caller_vcfs=lambda wildcards: " ".join(get_neusom_ensemble_callers(wildcards)),
    shell:
        r"""
        set -euo pipefail
        {params.numa} neusomatic.py ensemble \
            --output {output.vcf} \
            --tumor {input.tumor_cram} \
            --normal {input.normal_cram} \
            --ref {input.ref_fa} \
            --callers {params.caller_vcfs} \
            --threads {threads} >> {log} 2>&1
Comment on lines +70 to +95

[P1] Missing directory creation in neusomatic_ensemble rule

The ensemble rule also writes to snv/neusomatic/... without ensuring those directories exist. When this rule runs for the first time, the redirection >> {log} and --output {output.vcf} will fail if the directories were not created by another rule, causing the job to error out before the ensemble step starts. Add a mkdir -p for the VCF and log parent paths to prevent this runtime failure.

        """
1 change: 1 addition & 0 deletions workflow/rules/rule_common.smk
@@ -90,6 +90,7 @@ OCTO_CHRMS = config["octopus"][f"{config['genome_build']}_octo_chrms"].split(",")
CLAIR3_CHRMS = config["clair3"][f"{config['genome_build']}_clair3_chrms"].split(",")
LOFREQ_CHRMS = config["lofreq2"][f"{config['genome_build']}_lofreq_chrms"].split(",")
DVSOM_CHRMS = config["deepsomatic"][f"{config['genome_build']}_dvsom_chrms"].split(",")
NEUSOM_CHRMS = config["neusomatic"][f"{config['genome_build']}_neusom_chrms"].split(",")
SENTTN_CHRMS = config["senttn"][f"{config['genome_build']}_senttn_chrms"].split(",")

VARN_CHRMS = (