feat: FoldX predicted energies in variant annotation #947

Merged
merged 20 commits into from
Feb 5, 2025

DSuveges
Contributor

@DSuveges DSuveges commented Dec 13, 2024

Context

OTAR2081 mapped the mutational landscape of the full human proteome by systematically changing every amino acid to all other amino acids in the AlphaFold structures of all human proteins. The mutation effect is assessed by comparing the energies of the structures via FoldX. The raw and scaled ddG values are used as inSilicoPredictors for all variants in the variant index that cause the same amino acid change in the given protein.

The PR contains

  • Logic and step to process the FoldX dataset generated by OTAR2081.
  • Logic to integrate amino-acid-based in-silico predictors into the variant index.
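
To illustrate the matching described above, here is a minimal plain-Python sketch, not the PR's actual Spark code. It assumes predictions are looked up by protein and amino-acid change. The field names (`proteinId`, `aminoAcidChange`, `method`, `score`), the variant IDs, the accession `P12345` and the score value are all hypothetical placeholders chosen for illustration.

```python
# Hypothetical FoldX-derived scores keyed by (protein ID, amino-acid change).
# All identifiers and values are made-up placeholders, not real data.
foldx_scores = {
    ("P12345", "A123T"): 1.8,
}

# Hypothetical slice of a variant index; the field names are assumptions.
variants = [
    {"variantId": "1_1000_G_A", "proteinId": "P12345", "aminoAcidChange": "A123T"},
    {"variantId": "1_1001_G_C", "proteinId": "P12345", "aminoAcidChange": "A123P"},
]


def annotate(variant_rows, scores):
    """Attach a FoldX predictor to every variant with a matching amino-acid change."""
    annotated = []
    for row in variant_rows:
        key = (row["proteinId"], row["aminoAcidChange"])
        predictors = []
        if key in scores:
            predictors.append({"method": "FoldX", "score": scores[key]})
        # Variants without a matching prediction keep an empty predictor list.
        annotated.append({**row, "inSilicoPredictors": predictors})
    return annotated


annotated = annotate(variants, foldx_scores)
```

Because the lookup key is the amino-acid change rather than the nucleotide change, several variants that produce the same substitution in the same protein would receive the same prediction.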

@github-actions github-actions bot added the Step label Dec 16, 2024
@DSuveges DSuveges marked this pull request as ready for review December 16, 2024 13:10
@DSuveges DSuveges requested a review from d0choa December 16, 2024 14:35
@DSuveges DSuveges requested a review from vivienho January 30, 2025 14:51
@DSuveges DSuveges linked an issue Feb 3, 2025 that may be closed by this pull request
"""
return f.when(f.abs(score) >= 5, f.lit(1.0)).otherwise(
    cls._rescaleColumnValue(f.abs(score), 0.0, 5.0, 0.0, 1.00)
)
Contributor

From the function documentation I expected that scores above 2 should be normalised to 1, but here it is 5?

Contributor Author

I wanted to be very conservative, assigning a score of 1.0 only to the most extreme free energy changes. This is especially relevant for the negative energies, which indicate stabilisation; that might also have a negative impact on the overall structure/function. The thresholds applied for normalisation were established in a "heuristic" manner. :D
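
For reference, the capped normalisation discussed in this thread can be sketched in plain Python. This is only an illustration of the snippet under review, not the merged Spark implementation: it assumes `_rescaleColumnValue` performs a linear min-max rescaling, and the function names `rescale` and `normalise_foldx_score` are hypothetical.

```python
def rescale(value, in_min, in_max, out_min, out_max):
    """Linearly map value from [in_min, in_max] onto [out_min, out_max].

    Assumption: this mirrors what _rescaleColumnValue does in the PR.
    """
    return out_min + (value - in_min) * (out_max - out_min) / (in_max - in_min)


def normalise_foldx_score(ddg, cap=5.0):
    """Map a ddG value to [0, 1] using its magnitude, saturating at the cap."""
    magnitude = abs(ddg)  # stabilising (negative) and destabilising changes count alike
    if magnitude >= cap:
        return 1.0
    return rescale(magnitude, 0.0, cap, 0.0, 1.0)


examples = {ddg: normalise_foldx_score(ddg) for ddg in (0.0, 2.0, -2.5, 5.0, -7.0)}
```

With the cap at 5, a ddG of 2 maps to 0.4 rather than 1.0, which is the behaviour the reviewer asked about.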

Contributor

@vivienho vivienho left a comment

Just one minor question, otherwise everything looks good!

@DSuveges DSuveges merged commit 5e9638a into dev Feb 5, 2025
7 checks passed
@DSuveges DSuveges deleted the ds_3676_foldx branch February 5, 2025 15:40
@DSuveges DSuveges restored the ds_3676_foldx branch February 5, 2025 15:47
@vivienho vivienho deleted the ds_3676_foldx branch February 5, 2025 16:02
3 participants