-
Notifications
You must be signed in to change notification settings - Fork 3
[DNM] Protein mutation support in Neq Cycling Setup Unit #106
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: protein-mutation-protocol
Are you sure you want to change the base?
Conversation
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
|
This should be somewhat ready to go, but we first need to figure out where are the utils from OpenFreeEnergy/openfe#1106 are going to end up, because those are needed for these changes to work. |
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## protein-mutation-protocol #106 +/- ##
============================================================
Coverage ? 83.71%
============================================================
Files ? 16
Lines ? 1597
Branches ? 0
============================================================
Hits ? 1337
Misses ? 260
Partials ? 0 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
@IAlibay this one should be ready for review. Please note that the |
|
@IAlibay I have found issues with some specific mappings. I think we need to postpone this one while I tackle these issues. I'm getting |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An initial quick review - I mostlly am not very convinced with the idea of doing the template registration / partial charge assignment in a separate place and then using offmols with possibly different partial charges everywhere else.
| solvent_comp_a = ( | ||
| solvent_comps.pop() if solvent_comps else None | ||
| ) # Get the first component if exists | ||
| protein_comps_a = get_typed_components(state_a, ProteinComponent) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Switch these to gufe.ChemicalSystem.get_components_of_type?
| ligand_a = ligand_mapping.componentA | ||
| ligand_b = ligand_mapping.componentB | ||
|
|
||
| solvent_comps = get_typed_components( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally it would be better to just asset a single solvent component in the protocol validator than to have to work around these type of things.
i.e. currently there zero cases where having more than 1 SolventComponent could make sense, indeed even with mixed solvent systems, we're likely to just create a single solvencomponent that would contain a list of smiles rather than stacking them.
| all_openff_mols = list( | ||
| chain(all_alchemical_mols.values(), common_small_mols.values()) | ||
| ) | ||
| self._assign_openff_partial_charges( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you don't assign partial charges at this point, you'll end up with bad things happening - i.e. variable partial charges everywhere being assigned with the wrong partial charge method.
| try: | ||
| # Minimize | ||
| openmm.LocalEnergyMinimizer.minimize(context) | ||
| # Optionally store minimized topology -- Mostly for debugging purposes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not always store it?
| # This can be populated however we want | ||
| return outputs | ||
|
|
||
| # TODO: Maybe this could be a utility function. Is this something protocol-specific? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's mostly specific to this Protocol - also this should go in the Protocol validate method.
| # print(f"Free energy = {fe_estimate} +/- {fe_error}") # DEBUG | ||
|
|
||
| @pytest.mark.skip( | ||
| reason="Ambertools failing to parameterize. Review when we have full nagl." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There should be no reason why ambertools fails to generate charges for toluene.. this seems to be a bug not something that should wait for replacement.
|
|
||
| # Generate and register FF parameters in the system generator template | ||
| all_openff_mols = [comp.to_openff() for comp in all_small_mols] | ||
| register_ff_parameters_template( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doing this and then using a different set of offmols elsewhere (with different partial charges) is a bad hack imho - because if either the templating behaviour changes or anything else ends up needing partial charges that doesn't go through templates you'll end up with a bunch of offmols downstream that have the wrrong thing.
I would strongly suggest generating the offmols & charge them ahead of them and keep them around.
| If the component does not support the necessary conversion methods. | ||
| """ | ||
|
|
||
| try: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because technically protein components can try to go via openff, I wouldn't rely on an AttributeErrorr check here - I would isinstance check and enforce the correct route.
| return topology | ||
|
|
||
|
|
||
| def get_positions_from_component( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Medium term, maybe we should just add to_positions methods in the Components themselves?
| ValueError | ||
| If the atom index is not found in the topology. | ||
| """ | ||
| for residue in topology.residues(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than looping over residues and then over atoms, why not just loop over atoms, and get the residue? It would avoid you doing an extra for loop.
These changes aim to add support to be able to perform protein mutations in the current
SetupUnitthat theNonEquilibriumCyclingProtocoluses.This mostly implies making sure that the alchemical component is now not assumed to be a
SmallMoleculeComponentbut also aProteinComponent.Utility functions for achieving such purpose were written or modified accordingly. Someof these changes were needed to be made on the
openfeside of things, since we have not yet decided on migrating them tofeflow(this code base).