-
Notifications
You must be signed in to change notification settings - Fork 117
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add dev_tools/check_notebooks.py #244
Add dev_tools/check_notebooks.py #244
Conversation
Change-Id: I578c7a45d1946d89b8de1ca15d3353ccc8578f76
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool! I like the approach.
"""Helper function to actually do the work in `check_notebook`. | ||
|
||
`check_notebook` has all the context managers and exception handling, | ||
which would otherwise result in highly indented code. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add Args to docstring.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
'docs/hfvqe/molecular_data.ipynb', | ||
'docs/hfvqe/quickstart.ipynb', | ||
'docs/fermi_hubbard/publication_results.ipynb', | ||
'docs/fermi_hubbard/experiment_example.ipynb', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yikes. Can we instead search for *.ipynb and have exceptions if needed rather than explicitly list each notebook?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I almost preemptively wrote a justification (excuse) for this.
Since this is so expensive, I found it very useful to be able to have some control over which notebooks are being tested. A full solution would use e.g. command line arguments to choose either a list of specific notebooks or glob for all the notebooks. The latter case would need exceptions as noted and to be careful about .ipynb_checkpoints
and/or notebooks not added to git
.
I figured this all could be done in follow up work, especially if and when this script is modified to run automatically. Especially since automated tests would need lots of control over which notebooks are being tested (e.g. "test only modified ones")
Add a script to run full notebook checks locally.
Don't we already have notebook checks?
Short: not really. Most of the notebooks get "tested" during doc building, rendering, and pushing which happens outside of GitHub. For some reason I don't get the emails when these fail but usually @MichaelBroughton will ping me.
A small portion of notebooks are too long-running, so their outputs are saved and the notebooks are not executed during doc building and are untested.
Some notebooks were written with one data generation notebook and one data consumption (analysis) notebook. The original doc build system supported this. The current doc build system requires independent notebooks. Some domains have been ported to a regime where a fetch function will download the example dataset from figshare for the analysis notebooks. This hasn't been done for all domains, so there are output-saved notebooks that are not tested.
The original notebook tests for ReCirq were done as part of the CI but coupled to the sphinx build system, which was removed cc #198.
How good of a test is this?
Short: good, but expensive. Since we want to support the flow where a user has launched an individual notebook into a vanilla colab and have everything work, the checks in this script are very expensive and realistic. It will create a new venv virtual environment for each. As such, running the full thing is expensive.
Is this part of the CI?
Short: no. This is an expensive but fully automated and complete end-to-end tests of the notebooks to save me the trouble of going through and manually checking colabs when e.g. I do a Cirq bump. The script in its current form is designed for manual execution.
To have automated checks as part of the CI, we'd likely have to do something fancy like Cirq where we only do isolated env tests in particular circumstances and/or only check changed notebooks.
But this is a good starting point for a CI notebook check.