Skip to content

Conversation

@winterqt
Copy link
Contributor

By default, systemd sets a limit of how many times a service can start, which means that if you have a healthcheck that runs more often than the limits, systemd will refuse to start it with a message like "Start request repeated too quickly." emitted to the journal.

I'm not familiar enough with the e2e test suite to confidently write a test for this -- hints appreciated! :)

Does this PR introduce a user-facing change?

Fixed a bug that prevented healthchecks from running consistently on time on Linux.

By default, systemd sets a limit of how many times a service can start,
which means that if you have a healthcheck that runs more often than the
limits, systemd will refuse to start it with a message like "Start request
repeated too quickly." emitted to the journal.

Signed-off-by: Winter M <winter@antithesis.com>
@mheon
Copy link
Member

mheon commented Oct 20, 2025

Good change, code LGTM
/approve

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Oct 20, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mheon, winterqt

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 20, 2025
@mheon mheon added 5.7 and removed approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Oct 20, 2025
@mheon
Copy link
Member

mheon commented Oct 20, 2025

For testing, I would recommend doing something like https://github.com/containers/podman/blob/main/test/e2e/healthcheck_run_test.go#L59-L76 with a shorter interval and a runtime long enough to convince systemd to fail (or, with your patch, not fail)

@Luap99
Copy link
Member

Luap99 commented Oct 20, 2025

For testing, I would recommend doing something like https://github.com/containers/podman/blob/main/test/e2e/healthcheck_run_test.go#L59-L76 with a shorter interval and a runtime long enough to convince systemd to fail (or, with your patch, not fail)

test/e2e does never run healtchecks via systemd timers, they get disabled there via env var.
This would need to go into test/system/220-healthcheck.bats. But keep in mind any timing related test will be flaky so it would need to be done in a way that it doesn't depend on timings, i.e. sleep or other waiting commands ideally.

@winterqt
Copy link
Contributor Author

Unfortunately having some trouble running the system tests on my work machine :(

Can try to write a test and run it through CI if y'all would be fine with that?

@packit-as-a-service
Copy link

Cockpit tests failed for commit 191bbff. @martinpitt, @jelly, @mvollmer please check.

@packit-as-a-service
Copy link

[NON-BLOCKING] Packit jobs failed. @containers/packit-build please check. Everyone else, feel free to ignore.

@mheon
Copy link
Member

mheon commented Oct 20, 2025

Sure, if you can't get the tests to run

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants