-
Notifications
You must be signed in to change notification settings - Fork 54
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize tests following #167 slowdown #198
Conversation
#199 (comment) documents speed-ups in From our current
And from 076ed16
|
@rabernat, this PR shaves 15 mins off 3.8-build and 10 mins of the 3.9-build compared to We get there by refactoring the fixtures so we can use only sequential files/paths/patterns for the tests of moving files around, thereby cutting out a few hundred unnecessary tests. Here are the numbers of calls in each module, by PR:
And a visual look at test calls from the current PR (xref plot of test calls for
This seems to be good enough progress for now, without making this into a multi-day exercise, so I'm good to merge if you are. Also happy to dig a little deeper, though, if you see any other obvious places for improvement. If we merge, do you want to commit |
On further reflection, I thought of a way to possibly cut the file transfer testing time in half again. I'll push that update later tonight or first thing tomorrow. |
We're now basically back to where we were before #167 in terms of testing time and IMHO this is now ready to merge: Previously when I mentioned testing only the "sequential" files, both the In the last few commits, I refactored the fixtures again to make this Similarly, I did change
Moved this out of the PR to https://github.com/cisaacstern/pytest-param-counter. And just because I like that plot, here's were we stand: |
Noting that this also turns on duration profiling for github pytest runs (per Ryan's suggestion) and that all 10 of the slowest tests are now some version of
|
@rabernat, is there a reason why we'd want each one of these param combinations of pangeo-forge-recipes/tests/test_recipes.py Lines 188 to 205 in e748feb
to be run with each of the 5 executors pangeo-forge-recipes/tests/conftest.py Lines 331 to 332 in e748feb
I believe chunking/subsetting behavior should be independent of executor? If we run |
This
But it is an expensive test, and I agree that we should eliminate some parameterization. I recommend three steps.
|
I believe all the steps in #198 (comment) are now complete. We're now at the point where almost half the test time is setting up miniconda. Will look into https://github.com/marketplace/actions/setup-miniconda#caching now to reduce that. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome work Charles!
To close the loop, here's where we landed with this PR, in terms of calls per test (plot below). pangeo-forge-recipes/tests/test_recipes.py Lines 189 to 190 in c4a4ed8
is now the outlier. Just checked locally and if we run it with the new execute_recipe_python fixture only, we can save more than a minute, going from
to
any reason |
and the same could be asked of pangeo-forge-recipes/tests/test_recipes.py Lines 136 to 137 in c4a4ed8
|
Very true. Our default approach for testing is basically to run every test with every recipe and every executor. This obviously creates a lot of tests. But the good thing is that we really verify that all features work with all the different executors. We could make the test suite a lot slimmer by removing all this executor parameterization...if we are sure we can trust the executors. I would feel more comfortable doing that if we formalized the interface with the executors more. The pipelines model (#192) allows us to do that. If all of the recipes export pipelines, and then we trust the executors to execute the pipelines faithfully, then we can remove this parameterization. This would require the pipelines framework to be very well tested. Ideally, we would put pipelines into its own package, which both pangeo-forge-recipes and rechunker would depend on. We can do these things and they will help. But they are probably not your top priority right now. |
Makes a lot of sense. Thanks for the detailed reflections. If the dependencies caching looks like low-hanging fruit after a bit more reading, I may throw that in today, given just how long the miniconda setup takes. But I'll leave the parameter tweaking here for now, as I fully agree the path forward is through #192, not fiddling with the existing setup. |
To start I've just added a module
parametrization_counter.py
which returns diagnostic information on test runs.More to follow. Closes #199 when complete.