-
Notifications
You must be signed in to change notification settings - Fork 319
WIP: chempy refactoring #433
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
491cd50
to
9233afc
Compare
d8efebe
to
5614259
Compare
5b1e43c
to
fa549d2
Compare
fa549d2
to
137f140
Compare
Won't comment on details at the moment, but currently I do have a couple of high-level comments:
|
137f140
to
be6eae8
Compare
hey @JarrettSJohnson !
This is the cost of poor code quality and lack of test coverage. Refactoring is a method to find and fix these issues.
I'm not sure I understand what you mean when you say "there now exists libraries (especially numpy) that have much higher usage". The entire cpv.py module was completely rewritten using numpy in this pull request. This new implementation is compatible with scripts in the pymol-scripts repo and other pymol modules.
done |
Even if the original code quality was poor and without sufficient test coverage, these specific issues were manifested from the refactoring process due to a couple of properties missed from the original code (which were also easily identifiable by using common functionality in PyMOL--and that's on me too for not testing the PR before merging it). I think we should emphasize code coverage a little bit more than class-level refactoring.
Might make sense to first open up an issue there to get insights/opinions from other developers/maintainers, but I'm generally in favor of removing the basic linear algebra functions from |
I'm truly sorry that my previous PR caused issues that you had to fix. However, in my opinion, this is common part of the software development process.
Adding commas to docstrings is worthless stuff, but I am trying to make the code more readable. Therefore, I don't see a problem with refactoring at the class level. |
I support Jarrett's assessment here.
Fully agree. I like the added tests and type hints from the first commit, but the In my own scripts I always used either only |
be6eae8
to
3431f50
Compare
No one argued with that @speleo3 as you wish, there are now only type annotations and tests |
308bae1
to
1373979
Compare
Hello @ye11owSub, I'm Thomas Stewart, a PyMOL developer and the current Product Manager for PyMOL at Schrödinger. I just wanted to add my thoughts to a few of your comments:
There's no need to apologize but I do hope it helps explain our general reluctance. I agree that fixing issues introduced by PR's is definitely part of the software development process and I don't want reject PR's simply on that basis. However, I would point out that finding and fixing these issues does take developer time and resources that could be spent on more productive tasks. This means that these PR's do come with a cost (reviewing, testing, maintaining, etc.), regardless of how simple they may appear. They need to be impactful enough to justify merging them into the codebase. You also mention your lack of experience using PyMOL as a user. Not to say that only heavy PyMOL users can contribute to the project, but I'm curious what your motivations are if you're not trying to address an issue with how the app currently functions. I certainly understand the benefits of clean and well-documented code, but I don't view this as being a beneficial use of your time and effort if these files should be replaced completely. If you (or anyone else reading this) are really interested in making a significant contribution to the project, I would encourage you to play around with PyMOL and try to identify some functionality/features that would benefit from your effort.
I certainly understand what you're saying here, but I think it's an oversimplification for the reasons I stated above. In addition to the review/maintenance costs, small changes impact All that being said, I do believe there is real value being added in this PR and the tests for PyMOL certainly should be improved. I just want to explain my thought process when evaluating PR's in general if you plan on submitting more in the future. |
Hey @TstewDev!
I understand, each PR has a cost (so let's reduce this cost through testing).
The shortest and at the same time the most complete answer is because I can. It's sad to see that, in 6 years, the project has had 80 PRs closed from the open-source community. Pymol is a popular tool for a specific group of people, I'm not one of them, but i have a CS degree and some free time
In general I agree, but in this case, I don't think that's the case. If you have a different opinion, that's fine. Let's fix/add/delete what you think is necessary or close this PR and move on. That's OK for me. I am also currently refactoring the chempy { |
m1[1][0]*m2[0][2] + m1[1][1]*m2[1][2] + m1[1][2]*m2[2][2]], | ||
[m1[2][0]*m2[0][0] + m1[2][1]*m2[1][0] + m1[2][2]*m2[2][0], | ||
m1[2][0]*m2[0][1] + m1[2][1]*m2[1][1] + m1[2][2]*m2[2][1], | ||
m1[2][0]*m2[0][2] + m1[2][1]*m2[1][2] + m1[2][2]*m2[2][2]]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Matrix multiplication was not implemented correctly. This and code duplication has also been fixed in this PR
Hello @ye11owSub!
Say no more, welcome the project! The effort you have already put into these PR's is really appreciated and it sounds like you really are serious about making a contribution. Please forgive my original tone of skepticism, I just know that open-source projects like this can fall victim to developers creating PR's when they have little intention of actually seeing these changes through. It definitely doesn't sound like that's the case here and we welcome all the help we can get.
I'm a big fan of adding tests like this and I think it's one of the obvious areas for improvement.
I don't actually think I have any issue with this review now that it has this refined scope. I will take another closer look and add any additional comments if necessary.
Happy to hear it! I'm normally in favor of splitting these into multiple smaller review but it sounds like these might be quite intertwined? I'll leave it up your judgement but if you feel like there's relevant context that these other changes would provide, feel free to combine them. |
d45d258
to
7932692
Compare
return sm | ||
|
||
#------------------------------------------------------------------------------ | ||
def get_nuclear_charges(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this and a bunch of other methods removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @JarrettSJohnson!
I wanted to discuss this after I finish, but it's probably better to do it as early as possible.
I couldn't find the use of these functions anywhere, so maybe it's a good idea to remove this code?
If you think this code might still be useful, then I'll restore it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a public API, so if any users out there are using them, their scripts would break. If this is intended, we should have least have a deprecation period and some alternative path to obtain a similar result.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copy that.
I'll restore everything that's working properly
modules/chempy/models.py
Outdated
if b.index[0] == c: | ||
indexed.bond.append(b) | ||
c = c + 1 | ||
self.reset() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JarrettSJohnson also some functions that are used in other places seems to me that they are not working correctly. For example:
for a in self.bond:
for b in a:
...
b in a:
- but bond is not iterable class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have examples of this not working properly? Connected::bond
is a nested list, which is a different form than Model::bond
.
What doesn't seem to be working though, for a different reason however, is Connected::insert_atom
Traceback (most recent call last):
File "D:\PyMOL\pymol\bond_test.py", line 6, in <module>
c.insert_atom(0, 0)
File "D:\mambaforge\envs\devenv\Lib\site-packages\chempy\models.py", line 612, in insert_atom
for a in self.bonds:
^^^^^^^^^^
AttributeError: 'Connected' object has no attribute 'bonds'. Did you mean: 'bond'?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I took the part mentioned above from the convert_to_indexed function.
def convert_to_indexed(self):
if chempy.feedback['verbose']:
print(" "+str(self.__class__)+": converting to indexed model...")
indexed = Indexed()
indexed.atom = self.atom
indexed.molecule = self.molecule
c = 0
for a in self.bond:
**for b in a:**
if b.index[0] == c:
indexed.bond.append(b)
c = c + 1
self.reset()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you show me the traceback?
cmd.fetch('1obyA')
model = cmd.get_model()
c = model.convert_to_connected()
i = c.convert_to_indexed()
doesn't give me a traceback
This looks correct for the reasons I mentioned. Connected::bond
is not a list of Bond
. It's a list of list of Bond
s.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pymol-open-source/modules/chempy/models.py
Lines 340 to 349 in c208c62
model = Connected() | |
model.molecule = self.molecule | |
model.atom = self.atom | |
model.bond = [] | |
model.index = None | |
for a in model.atom: | |
model.bond.append([]) | |
for b in self.bond: | |
model.bond[b.index[0]].append(b) # note two refs to same object | |
model.bond[b.index[1]].append(b) # note two refs to same object |
line 348/349
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
still, it doesn't work correctly. The Connected
class is inherited from the Base class, which is store bond as a list[Bond], this means that some of the methods of the base class will not work for the Connected class
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
two more
self = <chempy.models.Indexed object at 0x105cb8590>, index = 0
def remove_bond(self,index):
if chempy.feedback['bonds']:
print(" "+str(self.__class__)+": removing bond %d." % index)
> nBond=len(self.Bond)
E AttributeError: 'Indexed' object has no attribute 'Bond'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
self = <chempy.models.Connected object at 0x1063bff90>, index = 2
def delete_atom(self,index):
if chempy.feedback['atoms']:
print(" "+str(self.__class__)+": deleting atom %d." % index)
nAtom=self.nAtom
# update index if it exists
if self.index:
idx = self.index
for k in list(idx.keys()):
if idx[k] > index:
idx[k] = idx[k] - 1
del idx[id(self.atom[index])]
# delete atom
del self.atom[index]
# delete bonds associated with this atom
nBond = len(self.bond)
for a in self.bond:
i = 0
templist = []
for b in a:
if index in b.index:
templist.append(i)
i = i + 1
for i in range(len(templist)):
j = templist[i] - i
del a[j]
# re-index bond table
for b in self.bond:
> if b.index[0] > index:
E TypeError: 'builtin_function_or_method' object is not subscriptable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For now, perhaps just rename the Connected::bond
to another name to avoid clashing.
nBond=len(self.Bond)
line can probably just be removed since the results are unused.
You'll likely run into several issues here. This module was written (and a lot untouched) from decades ago and rarely used by folks (which is why I think effort is probably better spent elsewhere outside this module).
IMO, just fix what you can, and anything that's not fixable, rather than deleting, you can leave them be and we'll have a proper deprecation period for them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
roger that
54c88e7
to
6762ce7
Compare
I probably won't be able to deeply get into this for a while, but in any case, lets keep the classes in their original files for now to keep the diff more succinct and so that it's easier to compare the pertinent changes. |
869f19d
to
4f0a7dd
Compare
4f0a7dd
to
6ec0804
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally looks good. Review is a bit hard to follow when there are massive relocations/formatting changes mixed in with logic changes. I would propose in the future to keep those separate in future reviews so that it's easier to separate what's reorganization versus what I need to focus on for correctness. Otherwise, thanks for the time you spent on this!
Also whenever this is ready for review & merge, please remove the "WIP" from the title.
# HAVEN'T YET VERIFIED THAT THIS CONFORMS TO STANDARD DEFT | ||
# upd: no, it's not(fixed) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be removed.
if chempy.feedback['atoms']: | ||
print(" "+str(self.__class__)+": deleting atom %d." % index) | ||
def _handle_new_atom(self) -> None: | ||
"""In case of Connected class we need to add an empty list to slef.bond""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
self
cpv.py
and Numpy and shows their interchangeability