feat: 99% credible set validation during study_locus_validation #765

Merged
merged 9 commits into from
Sep 24, 2024

Conversation

d0choa
Collaborator

@d0choa d0choa commented Sep 16, 2024

Includes:

  • Filter credible sets by 95% confidence intervals
  • Removes unnecessary filters in other steps

Closes opentargets/issues#3468

@addramir
Contributor

We previously discussed 99% CS, not 95%, to be more comprehensive, and actually removing the filter from coloc and everything else.
Moreover, this filter doesn't work for our SuSiE credible sets, because we don't really populate these columns (we know that these CSs are 99%): both columns, is95CredibleSet and is99CredibleSet, are Null.

@d0choa
Collaborator Author

d0choa commented Sep 16, 2024

It's doable, but we must fix a few other problems then. Let's discuss this in person

@DSuveges
Contributor

DSuveges commented Sep 17, 2024

  • Removes unnecessary filters in other steps

I like the idea of having one single point in the process where things get dropped. Pruning the locus object is radically different from the validation of other datasets: it would not lead to any newly flagged objects, and the filtered-out tags would not go anywhere, unlike invalid studies or study loci. They just disappear. Still, it would make sense to have it in the validation, and it would make the resulting datasets consistent across all applications.

@addramir

Moreover, this filter doesn't work for our SuSiE credible sets, because we don't really populate these columns (we know that these CSs are 99%): both columns, is95CredibleSet and is99CredibleSet, are Null.

To be honest, I don't really like this inconsistency. Would it be possible to make all credible set datasets similar? Also, if we don't care about different levels of confidence and keep everything at 99%, we can just drop the column from the schema.

@d0choa
Collaborator Author

d0choa commented Sep 17, 2024

I need to adjust this PR based on the new decisions described on the ticket opentargets/issues#3468

@d0choa d0choa marked this pull request as ready for review September 18, 2024 12:53
@d0choa d0choa changed the title feat: 95% credible set validation during study_locus_validation feat: 99% credible set validation during study_locus_validation Sep 18, 2024
@d0choa
Collaborator Author

d0choa commented Sep 18, 2024

Implementing the new decisions described on the ticket opentargets/issues#3468 is pretty simple.

Now, all credible sets are annotated when we try to filter them. That makes all the logic work as long as the locus contains a populated posteriorProbability column, including for SuSiE credible sets.

We need to remember that the current PICS results are filtered to 95%, so much of this will not have an effect until we re-run PICS.
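
As a rough illustration of annotating a locus from its posterior probabilities, here is a minimal plain-Python sketch. It is not the code in this PR: the function name `annotate_credible_sets`, the `variantId` field, and the list-of-dicts representation are assumptions made for illustration. Only `posteriorProbability`, `is95CredibleSet` and `is99CredibleSet` come from the discussion above. Tags are sorted by descending posterior probability, and a tag is flagged as part of the X% set while the cumulative probability of the tags ranked before it is still below X.

```python
def annotate_credible_sets(locus):
    """Return the tags sorted by descending posterior probability,
    each with is95CredibleSet / is99CredibleSet flags added.

    Assumes every tag has a populated (non-null) posteriorProbability.
    """
    ordered = sorted(
        locus, key=lambda tag: tag["posteriorProbability"], reverse=True
    )
    annotated = []
    cumulative = 0.0  # probability mass of the tags ranked before the current one
    for tag in ordered:
        annotated.append(
            {
                **tag,
                "is95CredibleSet": cumulative < 0.95,
                "is99CredibleSet": cumulative < 0.99,
            }
        )
        cumulative += tag["posteriorProbability"]
    return annotated


# Hypothetical locus with four tag variants.
example_locus = [
    {"variantId": "v3", "posteriorProbability": 0.035},
    {"variantId": "v1", "posteriorProbability": 0.90},
    {"variantId": "v4", "posteriorProbability": 0.005},
    {"variantId": "v2", "posteriorProbability": 0.06},
]
annotated_locus = annotate_credible_sets(example_locus)
```

Because the flags are derived only from `posteriorProbability`, the same logic applies to a locus whose flag columns were never populated.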

@d0choa d0choa requested a review from DSuveges September 18, 2024 12:58
@github-actions github-actions bot removed the Method label Sep 24, 2024
Contributor

@DSuveges DSuveges left a comment


All makes sense:

  • Annotating PICSed credible sets upon creation.
  • Filtering method has the logic to do the annotation as well.
  • Filtering is happening at the validation step.
  • As the dataset is already filtered, coloc doesn't need to apply the filter anymore.
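
The points above can be sketched in plain Python as follows. This is a hedged illustration rather than the merged implementation: `filter_to_credible_set`, `validate_study_loci`, and the `studyLocusId`, `locus` and `variantId` field names are hypothetical, and every tag is assumed to carry a non-null `posteriorProbability`. The filter derives credible set membership itself, so it does not depend on flags having been populated upstream, and the dropped tags are simply discarded rather than flagged.

```python
def filter_to_credible_set(locus, coverage=0.99):
    """Keep the highest-probability tags until `coverage` is reached.

    Membership is computed from posteriorProbability directly, so the
    locus does not need pre-populated credible set flags.
    """
    ordered = sorted(
        locus, key=lambda tag: tag["posteriorProbability"], reverse=True
    )
    kept = []
    cumulative = 0.0
    for tag in ordered:
        if cumulative >= coverage:
            break  # remaining tags fall outside the credible set
        kept.append(tag)
        cumulative += tag["posteriorProbability"]
    return kept


def validate_study_loci(study_loci):
    """Single pruning point: every study locus leaves validation with
    its locus already restricted to the 99% credible set."""
    return [
        {**study_locus, "locus": filter_to_credible_set(study_locus["locus"])}
        for study_locus in study_loci
    ]


# Hypothetical input with one study locus.
study_loci = [
    {
        "studyLocusId": "sl1",
        "locus": [
            {"variantId": "v1", "posteriorProbability": 0.90},
            {"variantId": "v2", "posteriorProbability": 0.06},
            {"variantId": "v3", "posteriorProbability": 0.035},
            {"variantId": "v4", "posteriorProbability": 0.005},
        ],
    }
]
validated = validate_study_loci(study_loci)
```

A downstream step such as coloc could then consume `validated` as-is, without applying its own credible set filter.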

@d0choa d0choa merged commit 84d6638 into dev Sep 24, 2024
5 checks passed
@d0choa d0choa deleted the do_credible_set_95_validation branch September 24, 2024 15:57
Development

Successfully merging this pull request may close these issues.

Filter credible sets for 99% during release process
3 participants