fix: empty inSilicoPredictors object in GnomAD variant index #807

Merged: DSuveges merged 11 commits into dev from ds_3548_gnomad_variants on Oct 10, 2024

@DSuveges (Contributor) commented Oct 2, 2024

✨ Context

The way we extracted in-silico predictors from the Hail table of GnomAD variants led to empty rows where no predicted scores were available. With this PR, a filter is applied upon creation to drop them. The PR has no effect on the schema; it is purely motivated by the desire to make the variant index dataset cleaner.
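
For readers unfamiliar with the problem, here is a minimal plain-Python sketch of the kind of filter described above. It is not the code from this PR, which operates on the Hail table and the variant index dataset. It assumes that "empty" means a predictor entry whose score is null, and the field names `method` and `score`, the variant IDs, and the helper `drop_empty_predictors` are hypothetical, chosen only for illustration.

```python
from typing import Any, Optional

# Hypothetical shape of one predictor entry: {"method": str, "score": float | None}
Predictor = dict[str, Any]


def drop_empty_predictors(predictors: Optional[list[Predictor]]) -> list[Predictor]:
    """Keep only the predictor entries that actually carry a score."""
    if not predictors:
        return []
    # Compare against None explicitly so a legitimate score of 0.0 is kept.
    return [p for p in predictors if p.get("score") is not None]


# Made-up example rows, for illustration only.
variants = [
    {
        "variantId": "1_1000_A_T",
        "inSilicoPredictors": [
            {"method": "spliceai", "score": 0.12},
            {"method": "pangolin", "score": None},
        ],
    },
    {
        "variantId": "1_2000_G_C",
        "inSilicoPredictors": [
            {"method": "spliceai", "score": None},
            {"method": "pangolin", "score": None},
        ],
    },
]

# The column keeps its type (a list of entries); only score-less entries go away.
cleaned = [
    {**v, "inSilicoPredictors": drop_empty_predictors(v["inSilicoPredictors"])}
    for v in variants
]
```

Because the filter only removes entries and never changes their shape, the column type stays the same, which is consistent with the "no effect on the schema" note above.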

@DSuveges linked an issue on Oct 2, 2024 that may be closed by this pull request
@github-actions bot added the bug, size-S, and Datasource labels on Oct 2, 2024
@DSuveges marked this pull request as ready for review on October 2, 2024, 15:05
@DSuveges requested a review from ireneisdoomed on October 2, 2024, 15:05
@DSuveges (Contributor, Author) commented Oct 2, 2024

The branch runs; the GnomAD variant index was generated here: gs://gnomad_data_2/gnomad_variant_index. Looking at the inSilicoPredictors column, there are no null scores.
image

@ireneisdoomed (Contributor) left a comment

The logic is OK, but maybe I am missing context. I thought these were coming from VEP directly, not from gnomAD?

@DSuveges replied:

There are two in-silico predictors that I couldn't get VEP to produce: pangolin and spliceai. These are extracted from GnomAD. I know, this is quite an overcomplication of the process.

@ireneisdoomed (Contributor) left a comment

Clear, thanks!

@DSuveges merged commit 58333c0 into dev on Oct 10, 2024
5 checks passed
@DSuveges deleted the ds_3548_gnomad_variants branch on October 10, 2024, 12:58
Labels: bug, Datasource, size-S
Projects: None yet
Linked issue: Predictors without scores in GnomAD variant index
2 participants