Skip observables contained in particle groups #4615

PythonFZ · 2024-06-15T07:32:37Z

This is not a fix for #4598 but enables reading files that contain data in

observables
 \-- <group1>
 |    \-- <observable2>
 |         \-- step: Integer[variable]
 |         \-- time: Float[variable]
 |         \-- value: <type>[variable][D][D]

Changes made in this Pull Request:

PR Checklist

Tests?
Docs?
CHANGELOG updated?
Issue raised/referenced?

Developers certificate of origin

I certify that this contribution is covered by the LGPLv2.1+ license as defined in our LICENSE and adheres to the Developer Certificate of Origin.

📚 Documentation preview 📚: https://mdanalysis--4615.org.readthedocs.build/en/4615/

This file hasn't been touched in 9 months and I don't think there are some open PRs so there would be a chance to format the entire file using ruff which I accidently did here faafe8b - is this of interest?

pep8speaks · 2024-06-15T07:32:43Z

Hello @PythonFZ! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2024-07-15 13:51:03 UTC

github-actions

Hello there first time contributor! Welcome to the MDAnalysis community! We ask that all contributors abide by our Code of Conduct and that first time contributors introduce themselves on GitHub Discussions so we can get to know you. You can learn more about participating here. Please also add yourself to package/AUTHORS as part of this PR.

github-actions · 2024-06-15T07:34:42Z

Linter Bot Results:

Hi @PythonFZ! Thanks for making this PR. We linted your code and found the following:

Some issues were found with the formatting of your code.

Code Location	Outcome
main package	⚠️ Possible failure
testsuite	⚠️ Possible failure

Please have a look at the darker-main-code and darker-test-code steps here for more details: https://github.com/MDAnalysis/mdanalysis/actions/runs/9940672284/job/27457930405

Please note: The black linter is purely informational, you can safely ignore these outcomes if there are no flake8 failures!

codecov · 2024-06-15T07:40:13Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 93.59%. Comparing base (cfda8b7) to head (66b9ecb).
Report is 29 commits behind head on develop.

Additional details and impacted files

@@             Coverage Diff             @@
##           develop    #4615      +/-   ##
===========================================
- Coverage    93.61%   93.59%   -0.02%     
===========================================
  Files          171      183      +12     
  Lines        21243    22316    +1073     
  Branches      3934     3936       +2     
===========================================
+ Hits         19886    20886    +1000     
- Misses         898      971      +73     
  Partials       459      459

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

orbeckst · 2024-06-15T20:38:01Z

@edisj and @ljwoods2 could you have a quick look at this PR and comment?

orbeckst

I like the idea of using the logger but logging at every step is potentially a lot of output and performance hit. Furthermore, I don’t think users care. Thus I suggest you remove the logging and just pass.

We also need at least one test case.

orbeckst · 2024-06-15T21:27:48Z

@hmacdope would you be able to shepherd this PR? If not switch it to me.

…com/PythonFZ/mdanalysis into h5md-skip-per-particles-observables

ljwoods2 · 2024-06-16T01:15:04Z

testsuite/MDAnalysisTests/coordinates/test_h5md.py

+    prec = 3
+    ext = 'h5md'
+
+    @pytest.fixture(scope='class')


I may be mistaken, but I think the test_n_frames method isn't testing the issue- if the reader was going to except out due to a key error, this would happen during the line in the universe fixture:

reader = H5MDReader(H5MD_energy, convert_units=True)

when the H5MDReader's __init__ is called and the first call to _read_next_timestep is made. The test method doesn't call any code that reads a frame, it just ends up calling the reader's n_frames method. I agree with your approach to fixing the issue, just maybe have the test initialize or initialize and iterate through the reader so that the potential issue will occur in the scope of the test

Also make sure to update the CHANGELOG!

Should be fixed now. The CI failing seems not related to this PR?

Small nitpick- can you change the name of the test to make it represent what it is testing? Also, the assert statement seems unrelated to what the test is testing. Maybe write an except and pytest.fail statement upon raising a KeyError but otherwise don't have an assert statement. Here's an example from testsuite/coordinates/base.py:

def test_frame_jump_issue1942(self, ref, reader): """Test for issue 1942 (especially XDR on macOS)""" reader.rewind() try: for ii in range(ref.n_frames + 2): reader[0] except StopIteration: pytest.fail("Frame-seeking wrongly iterated (#1942)")

Otherwise LGTM

I updated the test and used pytest.fail. I hope the test structure now is a bit cleaner. Thanks for the review.

hmacdope

Great work! This is a good start until we decide what to do with the other form of observables.

hmacdope · 2024-06-18T03:46:44Z

package/MDAnalysis/coordinates/H5MD.py

@@ -231,6 +232,7 @@ class MockH5pyFile:
    HAS_H5PY = True


+


undo blank line change.

hmacdope · 2024-06-18T03:48:46Z

testsuite/MDAnalysisTests/coordinates/test_h5md.py

+        try:
+            _ = H5MDReader(H5MD_energy, convert_units=True)
+        except KeyError:
+            pytest.fail("Could not read H5MD file with observables.")


You can just have

_ = H5MDReader(H5MD_energy, convert_units=True)
As the body of the test, no need to try, except. Pytest can catch all the errors.

See also comment about warnings, we should catch suggested warning.

package/MDAnalysis/coordinates/H5MD.py

hmacdope

FWIW @PythonFZ the following will successfully read into ts.data with the key form group/observable

    def _copy_to_data(self):
        """assigns values to keys in data dictionary"""

        if "observables" in self._file:
            for key in self._file["observables"].keys():
                try:
                    # if has value as subkey read directly into data
                    if "value" in self._file["observables"][key]:
                        self.ts.data[key] = self._file["observables"][key][
                            "value"
                        ][self._frame]
                    # if value is not a subkey, read dict of subkeys
                    else:
                        for subkey in self._file["observables"][key].keys():
                            self.ts.data[key + "/" + subkey] = self._file["observables"][key][
                                subkey
                            ]["value"][self._frame]
                except KeyError:
                    warnings.warn(f"Could not read {key} from observables group, not a legal H5MD observable specification or data not ")

PythonFZ · 2024-06-18T11:49:50Z

@hmacdope What is the expected way to auto-format the code? I'm used to running either black or ruff format but once I run that on the code base it will yield to changes in most of the files. I haven't found anything at https://userguide.mdanalysis.org/stable/contributing.html?

Maybe there are some pre-commit hooks I haven't found?

edisj · 2024-06-18T18:53:01Z

testsuite/MDAnalysisTests/datafiles.py

@@ -372,6 +373,7 @@
 TPR_xvf = (_data_ref / 'cobrotoxin.tpr').as_posix()
 TRR_xvf = (_data_ref / 'cobrotoxin.trr').as_posix()
 H5MD_xvf = (_data_ref / 'cobrotoxin.h5md').as_posix()
+H5MD_energy = (_data_ref / 'cu.h5').as_posix()


is it the convention for other h5md files out there to use the *.h5 extension rather than *.h5md?

I can rename the file to *.h5md. I've seen both and personally prefer *.h5 because I was lazy and did not configure the VS Code H5WEB extension to also read *.h5md. Afaik H5MD is the only format specification for H5 files, so there is no risk of confusion.

Edit: I haven't seen a recommendation for the *.h5md suffix in the H5MD definition so I would actually prefer keeping the more general *.h5.

hmacdope · 2024-06-23T22:05:15Z

@PythonFZ lets go with .h5md as I think the missing extension may be causing some test failures. Other than that this is very close.

…icles-observables

PythonFZ · 2024-06-25T01:40:46Z

@PythonFZ lets go with .h5md as I think the missing extension may be causing some test failures. Other than that this is very close.

I renamed the files and ensured everything is properly formatted. I hope everything is good to go?

hmacdope · 2024-06-28T00:45:27Z

There seems to be some failing test on the azure runner. Can your reproduce locally @PythonFZ ?

PythonFZ · 2024-06-28T14:20:36Z

There seems to be some failing test on the azure runner. Can your reproduce locally @PythonFZ ?

All tests are passing for me locally. From what I see in the logs, it is the following test (correct me if I am wrong)

/home/vsts/work/1/s/testsuite/MDAnalysisTests/parallelism/test_multiprocessing.py

All tests from that file run locally for me. Also, these tests should not have been affected by this PR and the tests I changed also all run locally. The same seems to be true for the GH hosted runners. I'm a bit lost here on how to fix this, because I don't know the differences between the Azure runner and the GH runners.

orbeckst · 2024-06-28T23:36:19Z

This failing test is flaky and is not related to your work. It will not block the PR. OliverAm 6/28/24 um 07:20 schrieb Fabian Zills ***@***.***>: There seems to be some failing test on the azure runner. Can your reproduce locally @PythonFZ ? All tests are passing for me locally. From what I see in the logs, it is the following test (correct me if I am wrong) /home/vsts/work/1/s/testsuite/MDAnalysisTests/parallelism/test_multiprocessing.py All tests from that file run locally for me. Also, these tests should not have been affected by this PR and the tests I changed also all run locally. The same seems to be true for the GH hosted runners. I'm a bit lost here on how to fix this, because I don't know the differences between the Azure runner and the GH runners. —Reply to this email directly, view it on GitHub, or unsubscribe.You are receiving this because you commented.Message ID: ***@***.***>

hmacdope · 2024-06-30T08:33:16Z

Let me kick CI IIRC there were some failing tests on mdanalysis-CI (Azure_Tests Linux-Python310-64bit-full-wheel)

hmacdope

Congrats on your first contribution @PythonFZ! Much appreciated. Will be useful with some exciting H5MD developments around the corner (CC @ljwoods2).

hmacdope · 2024-07-02T03:37:49Z

@orbeckst are you happy with changes here?

orbeckst

Looking good overall, just two docs/comments requests.

orbeckst · 2024-07-02T19:06:51Z

package/MDAnalysis/coordinates/H5MD.py

+                                "observables"
+                            ][key][subkey]["value"][self._frame]
+                except KeyError:
+                    warnings.warn(


If it's not legal, shouldn't we be failing? Or is the reasoning that we don't overly care about observables as opposed to x/v/f?

Perhaps add a comment explaining your rationale.

Also, given that this is not tested, add a #pragma: nocover (or whatever you have to do to show that this code block isn't covered). It will tell anyone else reading the code that it's un-tested.

Tested would be better, of course, but if we don't care about observables then I don't insist on a test.

I'd be in favor of raising an Error here as well. Shall I create an additional h5md file which contains wrong data to test against this?

That would be ideal!

package/MDAnalysis/coordinates/H5MD.py

orbeckst · 2024-07-09T19:02:51Z

@PythonFZ if you can have a quick look at my comments then we should be able to merge this PR soon.

PythonFZ · 2024-07-15T13:49:05Z

@orbeckst I've added a file that tests for AttributeError. I can add another file for testing against KeyError but this is an unlikely case. The code will account for this though. Let me know if you prefer the additional file or are good with this.

I'm not sure how to account for

Add this limitation to the documentation.

If you are referring to

We should issue a warning that observables of the form observables/group1/observable1 are being ignored if present.

This should not longer be the case and all valid, non-fixed time step, H5MD datasets should work.

orbeckst · 2024-07-15T23:15:46Z

lgtm, thanks for adding the test @PythonFZ !

orbeckst · 2024-07-17T02:23:33Z

Thank you for your contribution @PythonFZ ! 🎉

skip observables contained in particle groups

f9c6ca1

github-actions bot added the Component-Readers label Jun 15, 2024

github-actions bot reviewed Jun 15, 2024

View reviewed changes

PythonFZ mentioned this pull request Jun 15, 2024

H5MD Can not read observables/<group>/<observable>/... #4598

Open

format only relevant lines

078d168

PythonFZ force-pushed the h5md-skip-per-particles-observables branch from faafe8b to 078d168 Compare June 15, 2024 07:52

PythonFZ and others added 2 commits June 15, 2024 09:58

remove whitespace

c5d2db2

Update AUTHORS

bbf4f20

orbeckst requested changes Jun 15, 2024

View reviewed changes

orbeckst assigned hmacdope Jun 15, 2024

PythonFZ and others added 4 commits June 15, 2024 23:28

address comments

bccda1c

Merge branch 'h5md-skip-per-particles-observables' of https://github.…

5376223

…com/PythonFZ/mdanalysis into h5md-skip-per-particles-observables

add test: open file

b22869f

Return Universe

e9b4353

ljwoods2 reviewed Jun 16, 2024

View reviewed changes

PythonFZ added 5 commits June 16, 2024 10:19

move tested method into the scope of the test

e86e475

update CHANGELOG

7d5eb1c

fix order

19035f5

add GitHub username

87b23d7

address comments

1cfedf4

hmacdope requested changes Jun 18, 2024

View reviewed changes

hmacdope reviewed Jun 18, 2024

View reviewed changes

PythonFZ added 2 commits June 18, 2024 11:09

lint

de7e7ba

address comments

757f850

edisj reviewed Jun 18, 2024

View reviewed changes

rename file

1d1b3df

github-actions bot removed the Component-Readers label Jun 25, 2024

PythonFZ added 4 commits June 24, 2024 21:27

Merge remote-tracking branch 'origin/develop' into h5md-skip-per-part…

f52b644

…icles-observables

line length

3ae0f26

format

225a7aa

white space

6faa14d

PythonFZ requested a review from hmacdope June 27, 2024 18:34

Update AUTHORS

e074e6b

hmacdope approved these changes Jul 2, 2024

View reviewed changes

hmacdope enabled auto-merge (squash) July 2, 2024 03:35

hmacdope disabled auto-merge July 2, 2024 03:35

orbeckst requested changes Jul 2, 2024

View reviewed changes

Merge branch 'develop' into h5md-skip-per-particles-observables

86749e3

add and test with malformed.h5md

2c1a4e8

lint

66b9ecb

orbeckst approved these changes Jul 15, 2024

View reviewed changes

orbeckst merged commit 1bf58cc into MDAnalysis:develop Jul 17, 2024
19 of 23 checks passed

PythonFZ mentioned this pull request Jul 26, 2024

use store="time" apax-hub/apax#303

Merged

Skip observables contained in particle groups #4615

Skip observables contained in particle groups #4615

Conversation

PythonFZ commented Jun 15, 2024 • edited by orbeckst Loading

Changes made in this Pull Request:

PR Checklist

Developers certificate of origin

pep8speaks commented Jun 15, 2024 • edited Loading

Comment last updated at 2024-07-15 13:51:03 UTC

github-actions bot left a comment

Choose a reason for hiding this comment

github-actions bot commented Jun 15, 2024 • edited Loading

Linter Bot Results:

codecov bot commented Jun 15, 2024 • edited Loading

Codecov Report

orbeckst commented Jun 15, 2024

orbeckst left a comment

Choose a reason for hiding this comment

orbeckst commented Jun 15, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

hmacdope left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

hmacdope left a comment

Choose a reason for hiding this comment

PythonFZ commented Jun 18, 2024

Choose a reason for hiding this comment

PythonFZ Jun 19, 2024 • edited Loading

Choose a reason for hiding this comment

hmacdope commented Jun 23, 2024

PythonFZ commented Jun 25, 2024

hmacdope commented Jun 28, 2024

PythonFZ commented Jun 28, 2024

orbeckst commented Jun 28, 2024 via email

hmacdope commented Jun 30, 2024

hmacdope left a comment

Choose a reason for hiding this comment

hmacdope commented Jul 2, 2024

orbeckst left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

orbeckst commented Jul 9, 2024

PythonFZ commented Jul 15, 2024

orbeckst commented Jul 15, 2024

orbeckst commented Jul 17, 2024

PythonFZ commented Jun 15, 2024 •

edited by orbeckst

Loading

pep8speaks commented Jun 15, 2024 •

edited

Loading

github-actions bot commented Jun 15, 2024 •

edited

Loading

codecov bot commented Jun 15, 2024 •

edited

Loading

PythonFZ Jun 19, 2024 •

edited

Loading