-
Notifications
You must be signed in to change notification settings - Fork 63
fix: preparse_alignments_of3 and add tests #81
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
jnwei
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, some small nits regarding documentation on tets.
|
|
||
| # Check that npz files were created for both chains | ||
| npz_files = list(tmp_path.glob("*.npz")) | ||
| assert len(npz_files) == 6, ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we add a little more context for why 6 is the expected number of npz files? Would it make sense to check for the list of filenames instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I re-organized the tests a bit – by default the script processes the whole directory and that's not ideal (we were mixing loose alignment files with 2 PDB, which were the actual inputs)
| return CliRunner() | ||
|
|
||
| def test_preparse_databases(self, cli_runner, tmp_path): | ||
| """Test preparsing alignments with a single database (uniref90_hits).""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we add the the expected input file type (e.g. .sto)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It actually could be both a3m/sto – I'm hesitant to write documentation for the script in the test :P What's the intent here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When reviewing, it isn't obvious to me what are the inputs to this script since as you mention later, we have a combination of inputs and outputs in the testdata/alignments directory.
It would be helpful to leave a signpost, either in the test or in the test_dir, of the expected inputs to help future readers / maintainers.
Another option could be to rearrange the test directory to have the inputs / outputs separated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, makes sense now – so this is how this directory is organized
The two directories make sense. The loose file directly under alignments/ I have no idea what these are - they're not outputs. In fact, they don't seem to be called by anything. I'm going to do an exploratory 'rm'...
Added the docstring explaining what the script does to these inputs.
vinay-swamy
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
|
This should be good to merge after fixing a small spacing typo for directory paths. |
|
@jnwei thanks for the fix, let's merge this bad boy |
Summary
Looks like I introduced a bug during a refactor (what else is new!) in this PR #77: added some extra structure with the BaseModel but the underlying code expects a dictionary. It'd be cleaned to make everything consume the BaseModel but in the absence of tests, I just use the BaseModel for user-input validation and then dump to a dict.
Changes
Related Issues
Testing
Other Notes