feat: FoldX predicted energies in variant annotation #947
… into ds_3676_foldx
"""
return f.when(f.abs(score) >= 5, f.lit(1.0)).otherwise(
    cls._rescaleColumnValue(f.abs(score), 0.0, 5.0, 0.0, 1.00)
)
From the function documentation I expected that scores above 2 should be normalised to 1, but here it is 5?
I wanted to be very conservative and assign a score of 1.0 only to the most extreme free energy changes. This is especially relevant for negative energies, which indicate stabilisation; that might also have a negative impact on the overall structure/function. The thresholds applied for normalisation were established in a "heuristic" manner. :D
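For readers following the thread, here is a minimal plain-Python sketch of what the snippet under review appears to do with a cap of 5. It assumes that `_rescaleColumnValue` performs a linear min-max rescale, which is not confirmed in this conversation; the helper names `rescale` and `normalise_foldx_ddg` are hypothetical and are not part of the PR.

```python
def rescale(value: float, in_min: float, in_max: float,
            out_min: float, out_max: float) -> float:
    """Linearly map value from [in_min, in_max] to [out_min, out_max].

    Assumption: this mirrors what `_rescaleColumnValue` does in the PR.
    """
    return out_min + (value - in_min) * (out_max - out_min) / (in_max - in_min)


def normalise_foldx_ddg(ddg: float, cap: float = 5.0) -> float:
    """Normalise a ddG value to [0, 1] using its absolute magnitude.

    Magnitudes at or above `cap` saturate at 1.0; smaller magnitudes are
    rescaled linearly. The sign is discarded, so stabilising (negative)
    and destabilising (positive) changes are scored alike.
    """
    magnitude = abs(ddg)
    if magnitude >= cap:
        return 1.0
    return rescale(magnitude, 0.0, cap, 0.0, 1.0)


if __name__ == "__main__":
    for ddg in (-7.0, -2.5, 0.0, 2.0, 5.0):
        print(ddg, normalise_foldx_ddg(ddg))
```

Under these assumptions, a ddG of 2 maps to 0.4 with a cap of 5, whereas a cap of 2 (as the docstring reportedly suggests) would map it to 1.0.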
Just one minor question, otherwise everything looks good!
Context
OTAR2081 mapped the mutational landscape of the full human proteome by systematically changing every amino acid to all other amino acids in the AlphaFold structures of all human proteins. The mutation effect is assessed by comparing the energies of the structures via FoldX. The raw and scaled ddG values are used as inSilicoPredictors for all variants in the variant index that cause the same amino acid change in the given protein. The PR contains