Automated tests ‐ overview
Automated tests are great, and we should write more of them. You may have seen similar statements when learning to code, and I think they are true. However, if you tried to search for "automated tests" or "how to write tests", you probably got very abstract, hard-to-understand answers. Here I try to give concrete reasons to motivate us to write more tests, and some hints on how to do it.
There are probably more reasons than those listed below, but here are the few that came to my mind.
While we develop a feature, we usually test it manually, for example by clicking through our new view or calling our new function from a Django shell. However, when a feature gets bigger, manually testing everything can take quite long. It becomes very tempting to skip a part because we're sure it works, or we simply forget to test it. Tests never get tired and never forget: once written, they will always run, without the possibility of a human mistake.
Similarly, changes that we think affect only one part of the code may affect other, unexpected parts. If those other parts are tested, we are safer.
If we have to update a part of the code that we don't know, we are often scared to do it. Even if the code we change is clearly written, it is difficult to be confident that our changes cover all the same bases as the original version did. Automated tests can give us that confidence and with it the courage to update, refactor, improve...
Even if code is written clearly, it can be hard to deduce from the code itself what the intended effects are. By writing test functions, we can give them names that make explicit what the code only implies. For example, TestFrozenStatusService.test_shouldFreezeMember_memberIsNotExpectedToDoShifts_returnsFalse
tells us that members that are exempted from shifts should not be frozen. Reading the names of the functions that touch our target code, we can get an understanding of what that code does.
This is also great against regressions: fixing a bug may require us to organize our code in a way that looks more complicated than necessary. If we don't make explicit why the code is written like it is, someone could be tempted to simplify the code, thus re-introducing the bug that we fixed. By writing a test that ensures that the bug is not present, we also ensure that it won't be re-introduced in the future.
There are many kinds of tests. If you're new to the concept overall, you may try to search for "unit vs integration vs system tests" or so. In my experience, the answers given are very abstract and hard to understand. Here I try to give a short definition with examples. They may not be formally correct, but will hopefully let you understand the overall concept.
Within a Django project, unit tests are tests that check one single function for one specific case. An example would be TestFrozenStatusService.test_shouldFreezeMember_memberAlreadyFrozen_returnsFalse
. An integration test instead will check an entire view, for example TestMemberSelfUnregisters.test_member_self_unregisters_threshold
When writing unit tests, we test a piece of code "in isolation": we want to test that piece of code and nothing else. For example, if in FrozenStatusService.should_freeze_member()
we remove the check for ShiftExpectationService.is_member_expected_to_do_shifts()
and then run the tests, the only test that should fail is test_shouldFreezeMember_memberIsNotExpectedToDoShifts_returnsFalse
. All other test_shouldFreezeMember_*
should pass, thus letting us identify quickly what the problem is.
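To make the idea of isolation concrete, here is a minimal sketch using only Python's standard library. The Member class and the rules inside should_freeze_member() are hypothetical and much simpler than Tapir's real service (the threshold of -4 is made up). The point is only that each test triggers exactly one check, so removing a check breaks exactly one test.

```python
import unittest
from dataclasses import dataclass


@dataclass
class Member:
    # Hypothetical stand-in for a member, not Tapir's actual model.
    is_frozen: bool = False
    is_expected_to_do_shifts: bool = True
    shift_balance: int = 0


def should_freeze_member(member: Member) -> bool:
    # Hypothetical, simplified rules: each check has its own test below.
    if member.is_frozen:
        return False
    if not member.is_expected_to_do_shifts:
        return False
    return member.shift_balance <= -4  # made-up threshold


class TestShouldFreezeMember(unittest.TestCase):
    def test_shouldFreezeMember_memberAlreadyFrozen_returnsFalse(self):
        # Only the "already frozen" check can make this return False.
        member = Member(is_frozen=True, shift_balance=-10)
        self.assertFalse(should_freeze_member(member))

    def test_shouldFreezeMember_memberIsNotExpectedToDoShifts_returnsFalse(self):
        # Only the "expected to do shifts" check can make this return False.
        member = Member(is_expected_to_do_shifts=False, shift_balance=-10)
        self.assertFalse(should_freeze_member(member))

    def test_shouldFreezeMember_balanceBelowThreshold_returnsTrue(self):
        member = Member(shift_balance=-10)
        self.assertTrue(should_freeze_member(member))
```

If you delete the is_expected_to_do_shifts check from this sketch, only the second test fails, which points directly at the missing check.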
Formally, unit tests should not interact with the database, since that would mean that we test the database connection on top of our code. When possible, you can ensure that by having your test class inherit directly from SimpleTestCase. This also has the advantage of leading to faster tests. It may however require you to write a lot of mocking code to fake the database accesses. This is why, while formally not correct, Tapir unit tests often do access the database and inherit from TestCase
(through TapirFactoryTestBase
)
Where unit tests intend to check that a single isolated piece of code works as intended, integration tests try to make sure that the pieces fit together well. This usually means testing a view. A simple example is TestTapirUserSelfUpdate
, which makes sure that a member can update their displayed pronouns but not the pronouns of other members.
In TestTapirUserSelfUpdate.try_update()
, a request is sent with self.client.post
. This produces the same effects as if someone had opened their browser, filled in the field as defined in the data
parameter, and submitted the form. We can then check what the response from the server looks like, and whether the expected changes have been applied correctly.
The responses given by the test client also contain the context that was sent from the view to the template. This allows us to check that context without looking at the HTML, which would be tricky. You can see an example in TestShareOwnerList.visit_view().
There is another kind of test that we rarely use: Selenium tests. The generic name would be system tests or E2E tests (for end-to-end). In these tests, an actual Django server is started, a browser is opened, and the browser is driven by Selenium. The test is thus as close to actual user behaviour as possible, which is nice. However, those tests take quite long to write and are quite fragile because they depend on the structure of our HTML pages. You can see, for example, how a user can log in with Selenium here: TapirSeleniumTestBase.login
.
You can run all tests using the following command:
docker compose run --rm web poetry run pytest
Since we have quite a few tests now, running them all can take several minutes. You can run the tests from a single folder or a single file by specifying the corresponding path. For example, this command will run only the tests from the shifts app or, if you include the optional part in brackets, only those from the test_FrozenStatusService.py file:
docker compose run --rm web poetry run pytest tapir/shifts/tests[/test_FrozenStatusService.py]
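You can also select tests by name, the test name being the name of the test function. pytest's -k option matches substrings, so a partial name selects every test whose name contains it: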
docker compose run --rm web poetry run pytest -k 'test_name'
Our test runner pytest is configured to check code coverage, and the coverage report is printed at the end of a test run. While developing, it may be annoying to scroll past that every time, so you can remove the addopts = --cov=tapir
line from pytest.ini
.
GitHub will automatically run all the tests when commits are pushed. This behaviour is among others defined in .github/workflows/tests.yml
.
This normally costs "workflow minutes", which are limited to 2000 per month on a free plan, but since our organization has the open source plan, we have unlimited minutes. You may run into that limit if you fork the repository into your own organization.
When writing tests, we often need to create instances of our models: a shift to register to, a user to check the permissions of... We could do it with Model.objects.create
, but this has several drawbacks:
- We may need to define field values that are required by the model but that are irrelevant to our test, making it less clear what the tested fields are
- If a required field is added to a model, we would need to update all the calls to
create()
that have been written until now.
That's why we use Factories instead. They are defined in tests/factories.py
for each app. Once a factory is defined, we can create valid objects with Factory.create()
. The created object has random but "life-like" field values. For example ShiftFactory.create()
will create a Shift object with a random name and date, with the duration being random between 1 and 4 hours.
You can set a field to an explicit value instead of letting it be randomly generated by calling Factory.create(field_name=value)
. This lets us define only the fields that are relevant for our test. For example, in TestExemptions.test_invalid_attendances_are_not_affected_by_exemptions
, we create several shifts, but we only care about the start_time; the name is irrelevant.
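The idea behind overriding only the relevant fields can be sketched without any factory library. The code below is a hand-rolled stand-in, not how Tapir's factories are written: the Shift dataclass and the build_shift() helper are hypothetical names used for illustration only.

```python
import datetime
import random
from dataclasses import dataclass


@dataclass
class Shift:
    # Hypothetical stand-in for a model with several required fields.
    name: str
    start_time: datetime.datetime
    end_time: datetime.datetime


def build_shift(**overrides) -> Shift:
    # Fill every required field with a plausible random value...
    start_time = overrides.pop(
        "start_time",
        datetime.datetime(2024, 1, 1, 8)
        + datetime.timedelta(days=random.randint(0, 30)),
    )
    defaults = {
        "name": f"Shift {random.randint(1, 999)}",
        "start_time": start_time,
        # Duration is random between 1 and 4 hours.
        "end_time": start_time + datetime.timedelta(hours=random.randint(1, 4)),
    }
    # ...then let the test override only the fields it cares about.
    defaults.update(overrides)
    return Shift(**defaults)


# This test setup only cares about start_time; name and end_time are filled in.
shift = build_shift(start_time=datetime.datetime(2024, 5, 1, 10))
```

If a new required field is later added to Shift, only build_shift() needs a new default, not every test that creates a shift.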
We want our fake field values to be random so that we don't have to define them, but we also want them to be the same every time we run the tests. Otherwise we could have tests that sometimes pass and sometimes fail, depending on which values got generated. To prevent that, we set the randomization seed in TapirFactoryTestBase.setUp()
.
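Here is a rough sketch of why seeding in setUp() makes random values reproducible. It uses Python's random module directly. SeededTestBase, the seed value and random_shift_duration_hours() are assumptions made for this example and are not taken from Tapir's code.

```python
import random
import unittest


def random_shift_duration_hours() -> int:
    # Hypothetical stand-in for a randomly generated factory field.
    return random.randint(1, 4)


class SeededTestBase(unittest.TestCase):
    SEED = 1234  # arbitrary value, it only has to stay the same between runs

    def setUp(self):
        # Reset the generator before every test, so each test
        # sees the same sequence of "random" values on every run.
        random.seed(self.SEED)


class TestShiftDuration(SeededTestBase):
    def test_randomShiftDuration_sameSeed_sameValues(self):
        first_run = [random_shift_duration_hours() for _ in range(5)]
        random.seed(self.SEED)  # what setUp() does before the next test
        second_run = [random_shift_duration_hours() for _ in range(5)]
        self.assertEqual(first_run, second_run)
        self.assertTrue(all(1 <= hours <= 4 for hours in first_run))
```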
Factories have a create()
and a build()
function. The create()
one creates the object in the database, while build()
doesn't. If you know that your test won't access the database, using build()
will make that explicit and make the test faster.
Mocking is an important part of testing. It is especially useful when writing unit tests. For example, in TestFrozenStatusService.test_freezeMemberAndSendMail()
, we want to make sure that the freeze_member_and_send_email()
function also calls the _update_attendance_mode_and_create_log_entry()
function. We are already testing _update_attendance_mode_and_create_log_entry()
in test_updateAttendanceModeAndCreateLogEntry
, so we know that it works. We also would like that, if _update_attendance_mode_and_create_log_entry()
breaks, only the test_updateAttendanceModeAndCreateLogEntry()
test fails. test_freezeMemberAndSendMail
should not fail, to make it clear where the error is.
That's why we mock _update_attendance_mode_and_create_log_entry
in test_freezeMemberAndSendMail
. This gives us a mock function on which we can check how many times and with which parameters it was called.
Mocking is a fairly deep topic; search for @patch
and @patch.object
to get examples.
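As a starting point, here is a small sketch of @patch.object from Python's unittest.mock. The FrozenStatusService below is a heavily simplified, hypothetical stand-in that only reuses the function names mentioned above. Its signatures and body are assumptions, not Tapir's real code.

```python
import unittest
from unittest.mock import patch


class FrozenStatusService:
    # Hypothetical, simplified stand-in for the service described above.
    @classmethod
    def freeze_member_and_send_email(cls, member, actor):
        cls._update_attendance_mode_and_create_log_entry(member, actor)
        # Sending the email is left out of this sketch.

    @staticmethod
    def _update_attendance_mode_and_create_log_entry(member, actor):
        # Stands for code that is tested elsewhere and that we do not
        # want to run in this test.
        raise RuntimeError("not supposed to run inside the mocked test")


class TestFrozenStatusService(unittest.TestCase):
    @patch.object(
        FrozenStatusService, "_update_attendance_mode_and_create_log_entry"
    )
    def test_freezeMemberAndSendMail_default_updatesAttendanceMode(self, mock_update):
        # While the test runs, the real function is replaced by mock_update.
        FrozenStatusService.freeze_member_and_send_email("member", "actor")

        # The mock records its calls, so we can check count and parameters.
        mock_update.assert_called_once_with("member", "actor")
```

Because the real function is replaced during the test, a bug inside it cannot make this test fail. Only the test dedicated to that function would.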
The naming convention for test functions is: test_[ELEMENT_BEING_TESTED]_[CASE_BEING_TESTED]_[EXPECTED_RESULT]
. For example: test_shouldFreezeMember_memberIsNotExpectedToDoShifts_returnsFalse
.
Since we use underscores to separate the three parts of the test name, and the Python naming convention for functions is snake case, we often can't use the exact function name in the test name. In that case, replace snake case with camel case: should_freeze_member becomes shouldFreezeMember.
Sometimes, we want to test things that depend on the current time or date, compared to some values in the database. For example in TestWelcomeDeskMessages.test_is_paused
, we create a MembershipPause with fixed start and end dates. The behaviour we want to test depends on whether today's date is inside the pause or outside. To make sure our test works regardless of when it is run, we use mock_timezone_now
.
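The general technique of fixing "today" in a test can be sketched with the standard library alone. This is not how mock_timezone_now is implemented: the Clock class and is_paused() function are hypothetical names, chosen only to show the idea of replacing the source of the current date.

```python
import datetime
import unittest
from unittest.mock import patch


class Clock:
    # Hypothetical single access point for "today", so tests can replace it.
    @staticmethod
    def today() -> datetime.date:
        return datetime.date.today()


def is_paused(pause_start: datetime.date, pause_end: datetime.date) -> bool:
    # True if today's date lies inside the pause (both ends included).
    return pause_start <= Clock.today() <= pause_end


class TestIsPaused(unittest.TestCase):
    PAUSE_START = datetime.date(2024, 6, 1)
    PAUSE_END = datetime.date(2024, 6, 30)

    def test_isPaused_todayInsidePause_returnsTrue(self):
        with patch.object(Clock, "today", return_value=datetime.date(2024, 6, 15)):
            self.assertTrue(is_paused(self.PAUSE_START, self.PAUSE_END))

    def test_isPaused_todayAfterPause_returnsFalse(self):
        with patch.object(Clock, "today", return_value=datetime.date(2024, 7, 1)):
            self.assertFalse(is_paused(self.PAUSE_START, self.PAUSE_END))
```

Both tests give the same result whether they run today or in ten years, because the date they compare against is fixed inside the test.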