Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Port GRO tests to new BaseReader/Writer Test classes #1196

Merged
merged 28 commits into from
Mar 23, 2017

Conversation

utkbansal
Copy link
Member

@utkbansal utkbansal commented Feb 3, 2017

Fixes part of #516

Changes made in this Pull Request:

  • Modified GROWriter to conform to the Reader API standard (works with Universe, Timestep & AtomGroup).
  • Breaks BaseReaderTest into BaseReaderTest and MultiframeReaderTest for single frame & multi frame readers.
  • All tests now inherit from Base reader/writer test classes.

PR Checklist

  • Tests?
  • Docs?
  • CHANGELOG updated?
  • Issue raised/referenced?

@utkbansal
Copy link
Member Author

@jbarnoud When I ran the create_data.py script, the old test files test.xyz and test.trr got changed. Is this okay? I wasn't expecting this.

@jbarnoud
Copy link
Contributor

jbarnoud commented Feb 3, 2017

I do not see why the frames were numbered from 0, and are now numbered from 1. After a quick glance at the code, I would have expected them to be numbered from 0; yet, I am not super up to date with the readers. Maybe @kain88-de has an idea?

Copy link
Member

@kain88-de kain88-de left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The start looks good. But GRO files are an special in that they only store a single snapshot. This means our base test will have to be adopted for such a format.

@@ -6,28 +6,28 @@ frame 0
CA 9.00000 10.00000 11.00000
CA 12.00000 13.00000 14.00000
5
frame 0
frame 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like we fixed something in the xyz writer since we created these files.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please also create a new zipped file for the xyz format based on this. The readme says how this is done.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this file should be removed too

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

import numpy as np


class GROReference(BaseReference):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

they should be included in the main gro test file.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I just wanted to keep it separate while I'm working on it so it don't get it mixed. Once I'm done I'll move it to its correct place.

3ILE CA 3 9.600 11.200 12.800 0.9600 1.1200 1.2800
4LYS CA 4 14.400 16.000 17.600 1.4400 1.6000 1.7600
5LEU CA 5 19.200 20.800 22.400 1.9200 2.0800 2.2400
8.51000 8.59223 8.35003 0.00000 0.00000 0.69131 0.00000 1.45589 2.09054
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like it's only the last frame. But according to the docs GRO can only write one frame per file. Multiple frames are stored in multiple files. So the trajectory in gro files can only be read with the chain-reader (that works and it tested separately no need to test the chain reader in this PR)

@@ -0,0 +1,8 @@
Written by MDAnalysis
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This header line should include a time t = as mentioned in the gromacs docs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will be an issue with the GROWriter then?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

t is optional according to docs

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes but we have that information so we should provide it.

self.ext = 'gro'
# self.volume = 0
# self.dimensions = np.zeros(6)
# self.container_format = True
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to deactivate all tests that skip frames or otherwise work on the assumption that a gro stores a trajectory instead of a single snapshot. For this you need to change the BaseReaderTest.

Copy link
Member Author

@utkbansal utkbansal Feb 3, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kain88-de I looked into how tests are skipped and I came across the skipif decorator that we import from numpy. But then to check the ext attribute I'd have to use self which I can't. I mean I can't do something like:

@dec.skipif(self.ext == 'gro')

Is there a workaround for this? Or should I fall back to vanilla if/else statements in methods?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

General comment. The general purpose BaseReaderTest shouldn't know anything about the specific extensions used. You should enable/disable tests based on values in the reference.

Why doesn't the skipif work?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kain88-de It doesn't work because there isn't any reference to self.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's probably best to make BaseReaderTest a subclass of a base class for single frame readers? The TRR readers etc do everything that GROReader does, with some extra features (traj iteration)

Copy link
Member Author

@utkbansal utkbansal Feb 3, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the base class has what a GROReader should have. BaseReaderTest should then inherit from it and add functionality for TRR etc. that have multiple frames? Or maybe we could have a multiple frame mixin 🤔

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@utkbansal yeah so something like

class BaseReaderTest(object):
    # define what ALL readers should do

class MultiframeReaderTest(BaseReaderTest):
   # define what extra a multiframe reader does

So the current existing BaseReaderTest is actually a mix of two sets of tests imo

@@ -0,0 +1,8 @@
Written by MDAnalysis
5
1MET CA 1 0.000 1.600 3.200 0.0000 0.1600 0.3200
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The last 3 values are the velocities. You should also check that they are activated in the tests

@utkbansal
Copy link
Member Author

@kain88-de Can you have a look at the travis build errors. I don't understand the errors. Though there are errors in the code, but the traceback is different.

@kain88-de
Copy link
Member

lets see what happens when we restart the build

@utkbansal
Copy link
Member Author

utkbansal commented Feb 4, 2017

@kain88-de Same error TypeError: sequence item 0: expected string or Unicode, exceptions.TypeError found Weird thing is that no part of the traceback has any mention of my code, its all from nose

The only reference to our code is at open_files.py which I didn't touch.

@jbarnoud
Copy link
Contributor

jbarnoud commented Feb 4, 2017

The traceback is because of my nose plugin that look at open files. My last "fix" on it obviously made it more fragile. I'll deactivate it from travis.

@utkbansal
Copy link
Member Author

utkbansal commented Feb 4, 2017

@jbarnoud @kain88-de Thanks! the flag --no-open-files almost fixes the bug and I get the correct tracebacks.
So back to the original question that I wanted to ask, 7 tests are currently failing which compare values with the desired values that are set in the BaseReference. How do I know the correct desired value to set in the BaseReference?

eg, MDAnalysisTests.coordinates.test_gro2.TestGROReader.test_first_dimensions fails with the following error -

Arrays are not almost equal to 6 decimals

(mismatch 100.0%)
 x: array([ 85.100006,  86.199959,  87.300041,  75.400002,  80.400002,
        85.400024], dtype=float32)
 y: array([ 81.099998,  82.199997,  83.300003,  75.      ,  80.      ,  85.      ], dtype=float32)

Here the value y was already set in the BaseReference, my guess is that I will have to change it to an appropriate value for the GRO format. How do I get the correct value?

@richardjgowers
Copy link
Member

Hey, this is good work. I think the problem here is that the Gro file is the last frame (it gets overwritten repeatedly until the last frame). So the reference is correct, just the write method which creates the test Gro file should only write the first frame.

@utkbansal
Copy link
Member Author

@richardjgowers Okay. So I will have to modify the create_data.py script to output only the first frame in case of gro, right?

@richardjgowers
Copy link
Member

richardjgowers commented Feb 4, 2017 via email

@utkbansal
Copy link
Member Author

utkbansal commented Feb 4, 2017

@kain88-de @richardjgowers I have updated the gro file. Can you confirm if this is the first frame?

@utkbansal
Copy link
Member Author

In the last commit I added velocities to the GROReference, but I have multiplied each velocity value in the gro file by 10 to compensate for the change in units.

@kain88-de
Copy link
Member

@utkbansal It looks like you need to patch the GroWriter since it isn't following the standard right now. You will also have to split the Writer tests into a BaseWriter and MultiFrameWriter tests.

@utkbansal
Copy link
Member Author

utkbansal commented Feb 6, 2017

@kain88-de You mean the GROWriter should also have a n_atoms attribute?
Also what about TestGROReader.test_iter_as_aux_lowf , will it also be a part of MultiframeReader?

@kain88-de
Copy link
Member

You mean the GROWriter should also have a n_atoms attribute?

yes

Also what about TestGROReader.test_iter_as_aux_lowf, will it also be a part of MultiframeReader

@jbarnoud you know most about the aux reader

@jbarnoud
Copy link
Contributor

jbarnoud commented Feb 7, 2017

TestGROReader.test_iter_as_aux_lowf tests if the aux reader manage to align data and frames when the data have a lower frequency as the trajectory. As it is about iterating over trajectory frames, it is only relevant when there are multiple frames.

@utkbansal
Copy link
Member Author

@jbarnoud @kain88-de Regarding MDAnalysisTests.coordinates.test_gro2.TestGROReader.test_total_time testcase, should the value of the total time come out to be 4 or will it be 0 because we have only one frame?

@richardjgowers
Copy link
Member

@utkbansal 0

@utkbansal
Copy link
Member Author

@kain88-de I don't see any frame skipping stuff in BaseWriterTest so I'm not sure what should be shifted to MultiFrameWriter.

@kain88-de
Copy link
Member

@utkbansal the writer tests often writes multi frame trajectories. The GRO format doesn't allow to write multiple frames into one file. So the Writer tests should only check that the one frame it can write is written correctly.

@utkbansal
Copy link
Member Author

@kain88-de @richardjgowers I have added all the tests required for the GROReader class, a couple of them are failing right now and I'll need help with them.

Regarding tests for GROWriter, I think we need to add some parts to the GROWriter class to make its working consistent with what the XYZWriter does. Specifically this part in the XYZWriter has no equivalent in the GROWriter class, due to which tests in TestGROWriter are failing.

@utkbansal
Copy link
Member Author

@kain88-de @richardjgowers @jbarnoud I need help with this please.

@@ -1747,7 +1747,7 @@ class SingleFrameReader(ProtoReader):
"""
_err = "{0} only contains a single frame"

def __init__(self, filename, convert_units=None, **kwargs):
def __init__(self, filename, convert_units=None, n_atoms=None, **kwargs):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why did you make this change?

@@ -6,28 +6,28 @@ frame 0
CA 9.00000 10.00000 11.00000
CA 12.00000 13.00000 14.00000
5
frame 0
frame 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please also create a new zipped file for the xyz format based on this. The readme says how this is done.

@kain88-de
Copy link
Member

Regarding tests for GROWriter, I think we need to add some parts to the GROWriter class to make its working consistent with what the XYZWriter does. Specifically this part in the XYZWriter has no equivalent in the GROWriter class, due to which tests in TestGROWriter are failing.

The GROWriter currently doesn't follow the standard API that we have. So you will need to change that one to accept and work with a Timestep, AtomGroup, or a Universe. The frame argument should be removed. Please also rename selection to obj and document that it can be any of the types specified above.

Writing a Universe or AtomGroup you should be able to use the current code. For the Timestep you have to make sure that valid names and resnames exist. The standard for unknown atom name is X and for the resname UNK. We currently already do this for AtomGroups that don't provide a names or resnames.

@utkbansal
Copy link
Member Author

@kain88-de @richardjgowers I have done a rebase and force push.

@richardjgowers
Copy link
Member

LGTM assuming tests pass

@kain88-de
Copy link
Member

@utkbansal thanks for doing this all the way to the end

@utkbansal
Copy link
Member Author

I should be the one thanking you all 😄 Without help from all of you I wouldn't be able to do this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants