Skip to content

Conversation

iamh2o
Copy link
Contributor

@iamh2o iamh2o commented Sep 9, 2025

Summary

  • integrate VarDictJava for paired tumor/normal variant calling
  • add configuration stanzas and chromosome chunking support
  • expose rule in main workflow

Testing

  • snakemake --lint (fails: command not found)
  • pip install snakemake (fails: Could not find a version that satisfies the requirement snakemake)
  • pytest

https://chatgpt.com/codex/tasks/task_e_68bf7506e95c833192fd54c0ffd456bd

@Copilot Copilot AI review requested due to automatic review settings September 9, 2025 04:32
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Integrates VarDictJava somatic variant calling tool into the workflow with support for tumor/normal paired analysis and chromosome-based parallelization.

  • Adds comprehensive VarDictJava rule set with chunked processing for scalability
  • Includes configuration for different execution environments (local and SLURM)
  • Exposes VarDictJava functionality in the main workflow

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
workflow/rules/vardictjava.smk Complete VarDictJava rule implementation with variant calling, sorting, and merging
workflow/rules/rule_common.smk Adds VARDICT_CHRMS configuration parsing
workflow/Snakefile Includes VarDictJava rules in main workflow
config/day_profiles/slurm/templates/rule_config.yaml SLURM-specific VarDictJava configuration
config/day_profiles/local/templates/rule_config.yaml Local execution VarDictJava configuration
Comments suppressed due to low confidence (1)

workflow/rules/vardictjava.smk:1

  • [nitpick] Trailing commas are inconsistent - line 137 has a trailing comma after threads: 4 while line 144 has one after the conda directive. Consider removing trailing commas for consistency with other rules in the file.
import sys

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

raise Exception(
"vardict chunks can only be one contiguous range per chunk : ie: 1-4 with the non numerical chrms assigned 23=X, 24=Y,25=MT"
)
return ret_mod_chrm(ret_str)
Copy link

Copilot AI Sep 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Function ret_mod_chrm is called but not defined or imported in this file, which will cause a NameError at runtime.

Copilot uses AI. Check for mistakes.

bcftools view -O b -o $bcf --threads {threads} $vcf && bcftools index --threads 4 $bcf;
done;
touch {output};
{latency_wait};
Copy link

Copilot AI Sep 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Variable latency_wait is used in shell command but not defined in params, which will cause the shell command to fail with an undefined variable error.

Copilot uses AI. Check for mistakes.

touch {output};
{latency_wait};
ls {output} >> {log} 2>&1;
{latency_wait};
Copy link

Copilot AI Sep 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Variable latency_wait is used in shell command but not defined in params, which will cause the shell command to fail with an undefined variable error.

Copilot uses AI. Check for mistakes.

Comment on lines +137 to +144
threads: 4,
resources:
vcpu=4,
threads=4,
partition=config['vardictjava']['partition'],
mem_mb=config['vardictjava']['mem_mb'],
conda:
config['vardictjava']['vardict_conda'],
Copy link

Copilot AI Sep 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] Trailing commas are inconsistent - line 137 has a trailing comma after threads: 4 while line 144 has one after the conda directive. Consider removing trailing commas for consistency with other rules in the file.

Suggested change
threads: 4,
resources:
vcpu=4,
threads=4,
partition=config['vardictjava']['partition'],
mem_mb=config['vardictjava']['mem_mb'],
conda:
config['vardictjava']['vardict_conda'],
threads: 4
resources:
vcpu=4,
threads=4,
partition=config['vardictjava']['partition'],
mem_mb=config['vardictjava']['mem_mb'],
conda:
config['vardictjava']['vardict_conda']

Copilot uses AI. Check for mistakes.

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Codex Review: Here are some suggestions.

Reply with @codex fix comments to fix any unresolved comments.

About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you open a pull request for review, mark a draft as ready, or comment "@codex review". If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex fix this CI failure" or "@codex address that feedback".

Comment on lines 81 to 87
region=$(echo {params.cpre}{params.vchrm} | sed 's/~/\:/g' | sed 's/23\:/X\:/' | sed 's/24\:/Y\:/' | sed 's/25\:/{params.mito_code}\:/')
region=${region%:}

mkdir -p "$(dirname {output.vcf})"

{params.numa} vardict-java -G {params.huref} -f {params.af} -N {wildcards.sample} \
-b "{input.tumor_cram}|{input.normal_cram}" -c 1 -S 2 -E 3 -g 4 "$region" \

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[P1] Build VarDict regions correctly for interval chunks

The shell command derives the region with region=$(echo {params.cpre}{params.vchrm} | sed 's/~/\:/g' ...) and passes it straight to vardict-java as "$region". When vardictchrm contains an interval such as 1~100000~200000, the sed pipeline emits chr1:100000:200000. VarDict expects the chr:start-end syntax, so the second delimiter must be converted to a hyphen before the call. As written, any attempt to use coordinate chunking will cause the rule to fail with an invalid region string. Consider mirroring the colon-to-dash handling used in other callers before invoking VarDict.

Useful? React with 👍 / 👎.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant