rpy2/r-bio3d integration for testing/R interop; memory efficient implementation for DCC computation #14
Hi, feel free to ask, if you would like to have a review. Since you propose the |
Hi, sure; that would be great if you find the time.
Could you run a quick and dirty timing test? If it is not significantly slower and it passes our tests, but also provides a big memory improvement, I would say we enforce this new way: |
One additional annoyance to consider: |
From my perspective this is not a large problem, as this happens only for the tests, which most users wouldn't run anyway. Does this problem appear during the |
Functionally the code looks good to me, I would only suggest some small style changes. I really like the new tests, it is much cleaner now!
tests/test_anm.py
Outdated
reference_fluc_subset = np.array(
bio3d.fluct_nma(enm_nma_bio3d,
mode_inds=r_seq(12,33)
))
reference_dcc = np.array(bio3d.dccm(enm_nma_bio3d))
reference_dcc_subset = np.array(
bio3d.dccm(enm_nma_bio3d, nmodes=30)
)
Too much indentation
tests/test_anm.py
Outdated
reference_fluc = np.genfromtxt(
join(data_dir(), ref_fluc),
skip_header=1, delimiter=","
)
Too much indentation
src/springcraft/nma.py
Outdated
reshaped = cov.reshape(
cov.shape[0]//3, 3, -1, 3
).swapaxes(1,2)
Too much indentation. Furthermore, I would use cov.shape[0]//3, 3, cov.shape[0]//3, 3 just to reflect that both values are the same N.
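To illustrate the suggestion, here is a minimal NumPy sketch on a made-up symmetric matrix (a stand-in for the covariance, not springcraft data). It shows that writing out both N values gives the same result as using -1, and that after the axis swap each entry [i, j] is the 3x3 block coupling atoms i and j. The names n_atoms, blocks and blocks_inferred are illustrative only.

```python
import numpy as np

# Toy stand-in for a 3N x 3N covariance matrix (assumption: N = 4 atoms).
n_atoms = 4
rng = np.random.default_rng(0)
a = rng.normal(size=(3 * n_atoms, 3 * n_atoms))
cov = a @ a.T  # symmetric, like a covariance matrix

n = cov.shape[0] // 3

# Explicit form: both N values are spelled out.
blocks = cov.reshape(n, 3, n, 3).swapaxes(1, 2)

# Form with -1: NumPy infers the third axis length, which is also N.
blocks_inferred = cov.reshape(n, 3, -1, 3).swapaxes(1, 2)

# blocks[i, j] is the 3x3 sub-matrix for the atom pair (i, j).
print(blocks.shape)
```

Both forms are equivalent; the explicit one only documents the intent for the reader.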
src/springcraft/nma.py
Outdated
modes_reshaped = np.reshape(
eig_vectors, (len(mode_subset), -1, num_dim)
)
Too much indentation
src/springcraft/nma.py
Outdated
from springcraft import GNM
from springcraft import ANM
Do you have a reason not to use a relative import here?
Not really; changed that.
I would propose to do a |
Refinements; DCC
Updated environment.yml -> bio3d as Dev-Dependency
Updated environment.yml -> bio3d as Dev-Dependency
Revert "Updated environment.yml -> bio3d as Dev-Dependency" This reverts commit a4bb8de.
Revert "Updated environment.yml -> bio3d as Dev-Dependency" This reverts commit 8cc2017.
Add r2py/bio3d as dev dependencies
Adjusted tests; incorporated rpy2
Add rpy2 and r-bio3d to pyproject.toml
Added correct versions in .toml
Revert "Integrated rpy2 into tests -> fix" This reverts commit 8133498.
Added rpy2/r-bio3d -> fix
Revert "Added correct versions in .toml" This reverts commit 02562ee.
Revert "Add rpy2 and r-bio3d to pyproject.toml" This reverts commit ecc420f.
Update .gitignore
Added rpy2/r-bio3d to GitHub-Workflow
Removed mem_eff argument for DCC
Not sure if all indentations are correct now, but I think I'm ready.
tests/test_anm.py
Outdated
reference_fluc = np.genfromtxt(join(data_dir(), ref_fluc),
skip_header=1, delimiter=","
)
In other places in the code base and, as far as I have seen, in most Python projects, a line with a closing bracket has the same indentation as the line with the corresponding opening bracket.
- reference_fluc = np.genfromtxt(join(data_dir(), ref_fluc),
-     skip_header=1, delimiter=","
-     )
+ reference_fluc = np.genfromtxt(
+     join(data_dir(), ref_fluc),
+     skip_header=1, delimiter=","
+ )
Hi, sorry for the late response. I added an example change to the review to show my preferred indentation style (see above). If you still prefer yours, that would also be fine with me, and I would merge.
No worries, the indentation should be adjusted now.
rpy2/r-bio3d integration:
Currently, tests that use bio3d and BiophysConnectoR as references rely on .csv files generated separately in R.
This offers limited flexibility when updating tests:
R scripts have to be adjusted and .csv files regenerated manually for each change.
This would quickly become unsustainable and confusing if the tests are expanded, for example to a larger range of proteins.
With rpy2, bio3d can be called directly from within Python scripts, with (rudimentary) interconversion between bio3d PDB objects and AtomArrays.
Both packages can be readily installed via conda.
The additional dependencies for springcraft-dev are the only downside that comes to mind.
The obsolete .csv files were deleted for this pull request.
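As a rough sketch of what calling bio3d through rpy2 can look like: the dccm and fluct_nma calls mirror the test snippets in this pull request, while read_pdb and nma are bio3d's read.pdb and nma functions as exposed by rpy2's importr. The function name bio3d_reference and its parameters are hypothetical, and running it requires R with bio3d and rpy2 installed, so treat it as an outline rather than the code used in the tests.

```python
import numpy as np


def bio3d_reference(pdb_path, nmodes=None):
    """Compute reference fluctuations and DCCs with bio3d (hypothetical helper)."""
    # Imported lazily so this module can be loaded without an R installation.
    from rpy2.robjects.packages import importr

    # importr maps R names such as read.pdb to Python names such as read_pdb.
    bio3d = importr("bio3d")

    pdb = bio3d.read_pdb(pdb_path)   # bio3d PDB object
    enm = bio3d.nma(pdb)             # normal mode analysis in R

    # Convert the returned R vectors/matrices into NumPy arrays.
    fluct = np.array(bio3d.fluct_nma(enm))
    if nmodes is None:
        dcc = np.array(bio3d.dccm(enm))
    else:
        dcc = np.array(bio3d.dccm(enm, nmodes=nmodes))
    return fluct, dcc
```

A test could then compare these arrays against the springcraft results with np.allclose instead of reading pre-generated .csv files.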
Memory-efficient DCC computation variant
The previous approach to computing DCCs needs large amounts of memory for medium-sized and large proteins.
The memory-efficient variant was reasonably fast in tests and could therefore replace the older variant.
Both variants are still present for now (the older one can be selected with mem_eff=False).
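The difference between the two routes can be sketched as follows. This is an illustration under assumptions, not necessarily the implementation in this pull request: the function names are made up, the modes are taken as rows of eig_vectors, and the toy matrix uses all of its eigenpairs (a real ENM Hessian has zero-eigenvalue modes that must be excluded first). The first function builds the full 3N x 3N covariance and reduces it afterwards; the second accumulates the N x N correlation directly from the modes and never allocates the 3N x 3N intermediate.

```python
import numpy as np


def dcc_from_covariance(eig_values, eig_vectors):
    """Build the full 3N x 3N covariance first, then reduce to N x N."""
    cov = (eig_vectors.T / eig_values) @ eig_vectors      # (3N, 3N)
    n = cov.shape[0] // 3
    blocks = cov.reshape(n, 3, n, 3).swapaxes(1, 2)       # (N, N, 3, 3)
    corr = np.trace(blocks, axis1=2, axis2=3)             # (N, N)
    norm = np.sqrt(np.diag(corr))
    return corr / np.outer(norm, norm)


def dcc_from_modes(eig_values, eig_vectors):
    """Accumulate the N x N correlation directly from the modes."""
    modes = eig_vectors.reshape(len(eig_values), -1, 3)   # (K, N, 3)
    n = modes.shape[1]
    corr = np.zeros((n, n))
    for dim in range(3):
        comp = modes[:, :, dim]                           # (K, N)
        corr += (comp.T / eig_values) @ comp              # weight by 1/eigenvalue
    norm = np.sqrt(np.diag(corr))
    return corr / np.outer(norm, norm)


# Toy positive-definite matrix standing in for a Hessian (N = 4 atoms).
rng = np.random.default_rng(0)
a = rng.normal(size=(12, 12))
hessian_like = a @ a.T + 12 * np.eye(12)
vals, vecs = np.linalg.eigh(hessian_like)

dcc_full = dcc_from_covariance(vals, vecs.T)  # eigh returns modes as columns
dcc_lean = dcc_from_modes(vals, vecs.T)
print(np.allclose(dcc_full, dcc_lean))
```

Both functions return the same normalized matrix; they differ only in the size of the intermediates they allocate.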