Add CADD annotation #266

Merged: 5 commits, Aug 9, 2024

Conversation

fellen31
Collaborator

Closes #265

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the pipeline conventions in the contribution docs
  • Make sure your code lints (nf-core lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

github-actions bot commented Jul 24, 2024

nf-core lint overall result: Passed ✅

Posted for pipeline commit 6538aa6

+| ✅ 160 tests passed       |+
#| ❔  17 tests were ignored |#

❔ Tests ignored:

  • files_exist - File is ignored: CODE_OF_CONDUCT.md
  • files_exist - File is ignored: assets/nf-core-nallo_logo_light.png
  • files_exist - File is ignored: docs/images/nf-core-nallo_logo_light.png
  • files_exist - File is ignored: docs/images/nf-core-nallo_logo_dark.png
  • files_exist - File is ignored: .github/ISSUE_TEMPLATE/config.yml
  • files_exist - File is ignored: .github/workflows/awstest.yml
  • files_exist - File is ignored: .github/workflows/awsfulltest.yml
  • files_exist - File is ignored: conf/modules.config
  • nextflow_config - Config variable ignored: manifest.name
  • nextflow_config - Config variable ignored: manifest.homePage
  • files_unchanged - File ignored due to lint config: CODE_OF_CONDUCT.md
  • files_unchanged - File ignored due to lint config: .github/CONTRIBUTING.md
  • files_unchanged - File ignored due to lint config: .github/ISSUE_TEMPLATE/bug_report.yml
  • files_unchanged - File ignored due to lint config: assets/nf-core-nallo_logo_light.png
  • files_unchanged - File ignored due to lint config: docs/images/nf-core-nallo_logo_light.png
  • files_unchanged - File ignored due to lint config: docs/images/nf-core-nallo_logo_dark.png
  • modules_config - modules_config

✅ Tests passed:

Run details

  • nf-core/tools version 2.14.1
  • Run at 2024-08-09 12:40:09

@fellen31 fellen31 force-pushed the add-cadd branch 3 times, most recently from c6251f3 to 4c1aed1 Compare July 24, 2024 12:29
@fellen31 fellen31 force-pushed the add-cadd branch 3 times, most recently from 117ced8 to 0be186d Compare August 5, 2024 09:29
@fellen31 fellen31 marked this pull request as ready for review August 5, 2024 10:01
@fellen31 fellen31 requested a review from a team as a code owner August 5, 2024 10:01
Collaborator

@jemten jemten left a comment

Looks good! Only a couple of questions

Comment on lines 1 to 14
Changes in module 'nf-core/cadd'
--- modules/nf-core/cadd/main.nf
+++ modules/nf-core/cadd/main.nf
@@ -43,7 +43,7 @@
def prefix = task.ext.prefix ?: "${meta.id}"
def VERSION = "1.6.post1" // WARN: Version information not provided by tool on CLI. Please update version string below when bumping container versions.
"""
- touch ${prefix}.tsv.gz
+ echo "" | gzip > ${prefix}.tsv.gz

cat <<-END_VERSIONS > versions.yml
"${task.process}":

************************************************************
Collaborator

Are these changes that we should try to add to the nf-core module as well?

Collaborator Author

Probably, I'll remove them from here for now.
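
The stub change in the diff above swaps `touch` for `echo "" | gzip`. The PR does not state the reason; one plausible motivation (an assumption here) is that anything reading the stub output as gzip would trip over a zero-byte file. A minimal Python sketch of the difference between the two stubs follows. The helper names are hypothetical and not part of the module.

```python
import gzip
import tempfile
from pathlib import Path

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip member (RFC 1952)


def stub_with_touch(path: Path) -> None:
    """Mimic `touch file.tsv.gz`: create a zero-byte file."""
    path.touch()


def stub_with_gzip(path: Path) -> None:
    """Mimic `echo "" | gzip > file.tsv.gz`: compress a single newline."""
    path.write_bytes(gzip.compress(b"\n"))


def looks_like_gzip(path: Path) -> bool:
    """Return True if the file starts with the gzip magic bytes."""
    with path.open("rb") as handle:
        return handle.read(2) == GZIP_MAGIC


with tempfile.TemporaryDirectory() as tmp:
    touched = Path(tmp) / "touched.tsv.gz"
    gzipped = Path(tmp) / "gzipped.tsv.gz"
    stub_with_touch(touched)
    stub_with_gzip(gzipped)

    # The touched file has no gzip header at all; the gzipped one does.
    touched_is_gzip = looks_like_gzip(touched)
    gzipped_is_gzip = looks_like_gzip(gzipped)
    # The gzipped stub decompresses back to the single newline written by echo.
    gzipped_content = gzip.decompress(gzipped.read_bytes())

print("touch stub is gzip:", touched_is_gzip)
print("gzip stub is gzip:", gzipped_is_gzip)
```

Only the second stub is a well-formed gzip stream, which is what the `.tsv.gz` extension promises.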

}

withName: '.*:ANNOTATE_CADD:BCFTOOLS_VIEW' {
ext.args = { "--output-type z --types indels,other" }
Collaborator

Just out of curiosity, how big are the indels in the SNV/INDEL VCF coming from ONT data? Can they be several tens of kb, and if so, would that severely affect how long it takes to run CADD?

Collaborator Author

I haven't tested this yet, will do before merging.
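
One way to answer the indel-size question above is to tabulate the length change of every indel allele in the VCF. The sketch below is a rough illustration under stated assumptions: it reads plain (uncompressed) VCF text lines, treats the length change as `len(ALT) - len(REF)`, and skips symbolic, breakend and missing alleles. The function names and the example records are hypothetical and are not taken from the pipeline or its test data.

```python
def indel_length(ref: str, alt: str) -> int:
    """Signed length change of one ALT allele: >0 insertion, <0 deletion, 0 otherwise."""
    return len(alt) - len(ref)


def indel_sizes(vcf_lines):
    """Yield the absolute length change of each indel allele in VCF text lines."""
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        ref, alts = fields[3], fields[4]
        for alt in alts.split(","):
            # Skip missing, overlapping-deletion, symbolic and breakend alleles.
            if alt in (".", "*") or alt.startswith("<") or "[" in alt or "]" in alt:
                continue
            size = abs(indel_length(ref, alt))
            if size > 0:  # a size of 0 means SNV or MNP, not an indel
                yield size


# Hypothetical records, for illustration only.
example_vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t50\tPASS\t.",        # SNV, ignored
    "chr1\t200\t.\tAT\tA\t50\tPASS\t.",       # 1 bp deletion
    "chr2\t300\t.\tC\tCGGG,CT\t50\tPASS\t.",  # 3 bp and 1 bp insertions
    "chr3\t400\t.\tN\t<DEL>\t50\tPASS\t.",    # symbolic allele, ignored
]

sizes = list(indel_sizes(example_vcf))
print("indel sizes:", sizes, "largest:", max(sizes))
```

Running this over a real call set and looking at the largest values would show whether multi-kb indels reach the CADD step at all.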

@fellen31 fellen31 force-pushed the add-cadd branch 15 times, most recently from 69e695c to fa15d92 Compare August 8, 2024 11:03
@fellen31 fellen31 marked this pull request as draft August 8, 2024 11:22
@fellen31 fellen31 force-pushed the add-cadd branch 14 times, most recently from 5de2edc to 9b52a94 Compare August 9, 2024 10:54
@fellen31 fellen31 marked this pull request as ready for review August 9, 2024 11:10
@fellen31
Collaborator Author

fellen31 commented Aug 9, 2024

Added a workaround for references whose chromosome names use the "chr" prefix. It seems CADD only scores chromosomes 1-22, X, Y and M anyway (?), so I think this should be fine:

  1. Remove "chr" from chromosome names in the VCF
  2. Extract indels
  3. Run CADD
  4. Annotate the original VCF with CADD scores while renaming chromosomes back to the original reference names
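
Steps 1 and 4 above both rely on a mapping between the reference's contig names and the unprefixed names. The PR text does not say how that mapping is produced, so the following is only a sketch under assumptions: the contig set is the one guessed at in the comment (1-22, X, Y, M, itself unverified), and the output is the two-column "old new" layout that `bcftools annotate --rename-chrs` reads. Whether the pipeline uses that command for renaming is also an assumption, and all function names are hypothetical.

```python
# Contigs assumed to be scored by CADD, per the comment above (unverified).
CADD_CONTIGS = {str(n) for n in range(1, 23)} | {"X", "Y", "M"}


def strip_chr(name: str) -> str:
    """Drop a leading 'chr' prefix if present."""
    return name[3:] if name.startswith("chr") else name


def build_rename_maps(reference_contigs):
    """Return (to_cadd, from_cadd) name mappings for contigs in CADD_CONTIGS.

    Contigs outside the assumed CADD set are left out of both maps.
    """
    to_cadd = {}
    for name in reference_contigs:
        stripped = strip_chr(name)
        if stripped in CADD_CONTIGS:
            to_cadd[name] = stripped
    from_cadd = {new: old for old, new in to_cadd.items()}
    return to_cadd, from_cadd


def format_rename_file(mapping) -> str:
    """Render a mapping as one 'old<TAB>new' pair per line."""
    return "".join(f"{old}\t{new}\n" for old, new in mapping.items())


# Hypothetical reference contig names, for illustration only.
to_cadd, from_cadd = build_rename_maps(["chr1", "chrX", "chrM", "chrUn_KI270302v1"])
print(format_rename_file(to_cadd), end="")    # used before running CADD (step 1)
print(format_rename_file(from_cadd), end="")  # used when annotating back (step 4)
```

Keeping the reverse map derived from the forward one means the names restored in step 4 are exactly the ones the original reference used.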

Collaborator

@jemten jemten left a comment

Looks good! Some minor formatting
Great work 💯

Comment on lines 403 to 409
"cadd_prescored": {
"type": "string",
"exists": true,
"format": "directory-path",
"fa_icon": "fas fa-file",
"description": "Path to the directory containing cadd prescored files.",
"help_text": "This folder contains the compressed files and indexes that would otherwise be in data/prescored folder as described in https://github.com/kircherlab/CADD-scripts/#manual-installation."
Collaborator

Should we indicate that, as currently written, this directory is only used to score the known indels? It might be worth clarifying that this is not the prescored SNV file.

Co-authored-by: Anders Jemt <jemten@users.noreply.github.com>
@fellen31 fellen31 force-pushed the add-cadd branch 3 times, most recently from a3167b7 to 3679dc9 Compare August 9, 2024 12:36
Co-authored-by: Anders Jemt <jemten@users.noreply.github.com>
@fellen31 fellen31 merged commit 687743e into genomic-medicine-sweden:dev Aug 9, 2024
14 checks passed
@fellen31 fellen31 deleted the add-cadd branch August 9, 2024 12:52
Development

Successfully merging this pull request may close these issues.

Add CADD 1.6 to annotate INDELs with dynamic CADD scores
2 participants