
TST: use sybil for doctests #178

Merged
10 commits merged into main from doctest on Aug 14, 2024
Conversation

hagenw
Member

@hagenw hagenw commented Jul 26, 2024

So far we have used the following setup to ensure the expected code output in the documentation:

  • Use the doctest-plus pytest plugin to test the docstring examples
  • Use the jupyter-sphinx sphinx extension to run the code blocks in the usage documentation and present their output

The disadvantage of jupyter-sphinx is that it requires a lot of dependencies, and we cannot check whether the output of the code looks as expected; we can only test that the code runs without error. In addition, we cannot directly use ELLIPSIS to elide parts of the output, e.g. to hide absolute paths or strings that are too long, but had to write extra formatting functions to handle such cases. Another disadvantage is that it bundles testing with building the documentation.

This pull request proposes to use sybil as a replacement for both doctest-plus and jupyter-sphinx.
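
A minimal conftest.py for this approach could look like the following sketch (the actual configuration in this pull request may differ; the ELLIPSIS flag and the selected parsers are assumptions for illustration):

from doctest import ELLIPSIS

from sybil import Sybil
from sybil.parsers.rest import DocTestParser, PythonCodeBlockParser

# Let pytest collect doctests from docstrings (*.py)
# and from code examples in the documentation (*.rst).
pytest_collect_file = Sybil(
    parsers=[
        DocTestParser(optionflags=ELLIPSIS),
        PythonCodeBlockParser(),
    ],
    patterns=["*.py", "*.rst"],
).pytest()

Running pytest then discovers the documentation examples alongside the regular test suite.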

doctest-plus would be a strong candidate to use instead of sybil, as it also provides the FLOAT_CMP flag, which allows comparing floats only up to a certain number of digits. But the disadvantage of doctest-plus is that you need to write all code blocks as >>> statements (even the hidden .. testsetup and .. testcleanup directives), which I find quite cumbersome.
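
For reference, FLOAT_CMP is used like this in a docstring (a generic example, not taken from this package):

>>> 0.1 + 0.2  # doctest: +FLOAT_CMP
0.3

Without the flag, the comparison would fail, as the actual output is 0.30000000000000004.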

sphinx.ext.doctest allows writing the code more conveniently than doctest-plus, but it needs to be executed by sphinx instead of pytest, and it is limited to testing the code inside the documentation, not the code inside the docstrings of the Python package.
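
With sybil, by contrast, hidden setup code in the usage documentation can stay a regular (invisible) code block instead of >>> statements; a sketch, assuming the PythonCodeBlockParser from the conftest.py above:

.. invisible-code-block: python

    import os

    os.makedirs("cache", exist_ok=True)

Such a block is executed when running the tests, but not rendered in the documentation.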

The resulting output of the usage documentation also changes slightly,
as we now present the results not as rendered tables,
but as classical doctest output, e.g.

[Side-by-side screenshots: rendered table output on main vs. plain doctest output in this pull request]
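
To illustrate the kind of change (a hypothetical snippet, not the actual audinterface output): where main rendered a result as an HTML table, the documentation now shows plain doctest output such as

>>> df  # hypothetical result table
   start  end
0    0.0  1.0
1    1.0  2.5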

@hagenw hagenw marked this pull request as draft July 26, 2024 10:27

codecov bot commented Jul 29, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 100.0%. Comparing base (5f4a293) to head (c8b1b1d).

Additional details and impacted files
Files                     Coverage Δ
audinterface/conftest.py  100.0% <100.0%> (ø)

@hagenw hagenw marked this pull request as ready for review July 29, 2024 12:25
@hagenw hagenw changed the title Use sybil for doctests TST: use sybil for doctests Jul 29, 2024
Member

@ChristianGeng ChristianGeng left a comment


As a review, I have simply tried to verify the (nice) feature that all doctests, i.e. those in docstrings AND those in the documentation, are counted.

This works, but I have one question. Before asking the question, here are the counts I got:

# Pull Request
pytest --collect-only tests/ 433
pytest --collect-only audinterface 72
pytest --collect-only 538
pytest --collect-only docs/ 33
# main branch
pytest --collect-only audinterface 8
pytest --collect-only tests/ 433
pytest --collect-only 441
pytest --collect-only docs/ no tests

This is excellent, as the doctests that are in the documentation get added to the discovered tests.

What I cannot explain yet is the different number of tests in the audinterface module. Is it possible that the old approach counts the number of files that have doctests, and the new approach counts each "result line"?

Is there an explanation for that?

@hagenw
Member Author

hagenw commented Aug 14, 2024

What I cannot explain yet is the different number of tests in the audinterface module. Is it possible that the old approach counts the number of files that have doctests, and the new approach counts each "result line"?

Interesting observation. I would also guess that the difference might be related to files vs. docstrings with tests.
But I'm not completely sure, as we have 9 files (not counting conftest.py), but only 5 of those contain doctests.

$ tree audinterface/
audinterface/
├── conftest.py
├── core
│   ├── feature.py
│   ├── __init__.py
│   ├── process.py
│   ├── process_with_context.py
│   ├── segment.py
│   ├── typing.py
│   └── utils.py
├── __init__.py
└── utils
    └── __init__.py

2 directories, 10 files

The number of doctest statements that produce output is 30, which is also not in line with 72. Maybe 72 corresponds instead to the number of output lines checked, though I count ~80 output lines in the docstrings.

Is there an explanation for that?

Neither pytest-doctestplus nor sybil has very extensive documentation, so I'm afraid you would need to look into their code to figure out how the number is calculated.
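
If it ever becomes relevant, one way to investigate would be to list the collected item IDs instead of only counting them, as each ID shows what a single "test" corresponds to under the respective plugin:

$ pytest --collect-only -q audinterface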

Member

@ChristianGeng ChristianGeng left a comment


The only concern I had was the question of how the new approach counts doctests in comparison to the old approach. This has been discussed here, and it turns out that answering it would require additional investigation. I think we should not try to get to the bottom of it right now, but rather wait and observe the situation once this is merged, in the hope that the fog will clear up while monitoring it.

Therefore I will approve this as is for now.

@hagenw hagenw merged commit a9bc048 into main Aug 14, 2024
13 checks passed
@hagenw hagenw deleted the doctest branch August 14, 2024 08:29