-
Notifications
You must be signed in to change notification settings - Fork 1
Add VarDictJava somatic variant calling rule #32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Integrates VarDictJava somatic variant calling tool into the workflow with support for tumor/normal paired analysis and chromosome-based parallelization.
- Adds comprehensive VarDictJava rule set with chunked processing for scalability
- Includes configuration for different execution environments (local and SLURM)
- Exposes VarDictJava functionality in the main workflow
Reviewed Changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
Show a summary per file
File | Description |
---|---|
workflow/rules/vardictjava.smk | Complete VarDictJava rule implementation with variant calling, sorting, and merging |
workflow/rules/rule_common.smk | Adds VARDICT_CHRMS configuration parsing |
workflow/Snakefile | Includes VarDictJava rules in main workflow |
config/day_profiles/slurm/templates/rule_config.yaml | SLURM-specific VarDictJava configuration |
config/day_profiles/local/templates/rule_config.yaml | Local execution VarDictJava configuration |
Comments suppressed due to low confidence (1)
workflow/rules/vardictjava.smk:1
- [nitpick] Trailing commas are inconsistent - line 137 has a trailing comma after
threads: 4
while line 144 has one after the conda directive. Consider removing trailing commas for consistency with other rules in the file.
import sys
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
raise Exception( | ||
"vardict chunks can only be one contiguous range per chunk : ie: 1-4 with the non numerical chrms assigned 23=X, 24=Y,25=MT" | ||
) | ||
return ret_mod_chrm(ret_str) |
Copilot
AI
Sep 9, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Function ret_mod_chrm
is called but not defined or imported in this file, which will cause a NameError at runtime.
Copilot uses AI. Check for mistakes.
bcftools view -O b -o $bcf --threads {threads} $vcf && bcftools index --threads 4 $bcf; | ||
done; | ||
touch {output}; | ||
{latency_wait}; |
Copilot
AI
Sep 9, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Variable latency_wait
is used in shell command but not defined in params, which will cause the shell command to fail with an undefined variable error.
Copilot uses AI. Check for mistakes.
touch {output}; | ||
{latency_wait}; | ||
ls {output} >> {log} 2>&1; | ||
{latency_wait}; |
Copilot
AI
Sep 9, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Variable latency_wait
is used in shell command but not defined in params, which will cause the shell command to fail with an undefined variable error.
Copilot uses AI. Check for mistakes.
threads: 4, | ||
resources: | ||
vcpu=4, | ||
threads=4, | ||
partition=config['vardictjava']['partition'], | ||
mem_mb=config['vardictjava']['mem_mb'], | ||
conda: | ||
config['vardictjava']['vardict_conda'], |
Copilot
AI
Sep 9, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] Trailing commas are inconsistent - line 137 has a trailing comma after threads: 4
while line 144 has one after the conda directive. Consider removing trailing commas for consistency with other rules in the file.
threads: 4, | |
resources: | |
vcpu=4, | |
threads=4, | |
partition=config['vardictjava']['partition'], | |
mem_mb=config['vardictjava']['mem_mb'], | |
conda: | |
config['vardictjava']['vardict_conda'], | |
threads: 4 | |
resources: | |
vcpu=4, | |
threads=4, | |
partition=config['vardictjava']['partition'], | |
mem_mb=config['vardictjava']['mem_mb'], | |
conda: | |
config['vardictjava']['vardict_conda'] |
Copilot uses AI. Check for mistakes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Codex Review: Here are some suggestions.
Reply with @codex fix comments
to fix any unresolved comments.
About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you open a pull request for review, mark a draft as ready, or comment "@codex review". If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex fix this CI failure" or "@codex address that feedback".
region=$(echo {params.cpre}{params.vchrm} | sed 's/~/\:/g' | sed 's/23\:/X\:/' | sed 's/24\:/Y\:/' | sed 's/25\:/{params.mito_code}\:/') | ||
region=${region%:} | ||
|
||
mkdir -p "$(dirname {output.vcf})" | ||
|
||
{params.numa} vardict-java -G {params.huref} -f {params.af} -N {wildcards.sample} \ | ||
-b "{input.tumor_cram}|{input.normal_cram}" -c 1 -S 2 -E 3 -g 4 "$region" \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[P1] Build VarDict regions correctly for interval chunks
The shell command derives the region with region=$(echo {params.cpre}{params.vchrm} | sed 's/~/\:/g' ...)
and passes it straight to vardict-java
as "$region"
. When vardictchrm
contains an interval such as 1~100000~200000
, the sed pipeline emits chr1:100000:200000
. VarDict expects the chr:start-end
syntax, so the second delimiter must be converted to a hyphen before the call. As written, any attempt to use coordinate chunking will cause the rule to fail with an invalid region string. Consider mirroring the colon-to-dash handling used in other callers before invoking VarDict.
Useful? React with 👍 / 👎.
Summary
Testing
snakemake --lint
(fails: command not found)pip install snakemake
(fails: Could not find a version that satisfies the requirement snakemake)pytest
https://chatgpt.com/codex/tasks/task_e_68bf7506e95c833192fd54c0ffd456bd