Checkable: Don't recalculate next_check while processing remotely generated check #10168

Merged
merged 1 commit into from
Sep 23, 2024

Conversation

Al2Klimov
Member

…enrated check

Currently, processing a `CheckResult` first triggers an
`OnNextCheckChanged` event, which is sent to all connected endpoints.
Then, when `Checkable::ProcessCheckResult()` returns, an `OnCheckResult`
event is fired, which is also sent to all connected endpoints.

Next, the other endpoints receive the `event::SetNextCheck` cluster
event followed by `event::CheckResult` and invoke
`Checkable#SetNextCheck()` and `Checkable#CheckResult()` with the newly
received check. In doing so, they also recalculate the next check
themselves and overwrite the next check timestamp previously received
from the source endpoint. Since each endpoint randomly initialises its
own scheduling offset, the recalculated next check will always differ by
a fraction of a second on each of them. As a consequence, two Icinga DB
HA instances generate two different checksums for the same state, which
causes the state histories to be fully resynchronised after a takeover
or an Icinga 2 reload.
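
The effect described above can be reproduced with a small toy model. This is only a sketch and not Icinga 2 code: the `Endpoint` class, the `replicate()` helper, the 60 second interval, and the formula `execution_end + interval + offset` are all assumptions made for illustration. The real scheduling arithmetic and the real Icinga DB checksum differ.

```python
import hashlib
import random

CHECK_INTERVAL = 60.0  # seconds; hypothetical value for this sketch


class Endpoint:
    """Toy stand-in for an Icinga 2 node (not the real Checkable class)."""

    def __init__(self, seed):
        # Each endpoint draws its own random scheduling offset at startup.
        self.offset = random.Random(seed).uniform(0.0, 1.0)
        self.next_check = None

    def set_next_check(self, timestamp):
        # Models handling of the SetNextCheck cluster event.
        self.next_check = timestamp

    def process_check_result(self, execution_end, recalculate):
        # Simplified scheduling formula, assumed for illustration only.
        if recalculate:
            self.next_check = execution_end + CHECK_INTERVAL + self.offset

    def state_checksum(self):
        # Stand-in for a checksum computed over the replicated state.
        return hashlib.sha1(repr(self.next_check).encode()).hexdigest()


def replicate(recalculate_on_remote):
    """Run one check on a source endpoint and replay its events on a peer."""
    source = Endpoint(seed=1)
    peer = Endpoint(seed=2)
    execution_end = 1_000_000.0

    # The source endpoint executes the check and schedules the next one.
    source.process_check_result(execution_end, recalculate=True)

    # The peer receives the next check timestamp first, then the result.
    peer.set_next_check(source.next_check)
    peer.process_check_result(execution_end, recalculate=recalculate_on_remote)
    return source.state_checksum(), peer.state_checksum()


if __name__ == "__main__":
    before = replicate(recalculate_on_remote=True)
    after = replicate(recalculate_on_remote=False)
    print("peer recalculates:", before[0] == before[1])
    print("peer keeps received value:", after[0] == after[1])
```

In this model, passing `recalculate_on_remote=False` corresponds to what the PR title describes: the peer keeps the timestamp it received from the source endpoint, so both endpoints derive the same checksum.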
@icinga-probot icinga-probot bot added this to the 2.13.10 milestone Sep 20, 2024
@cla-bot cla-bot bot added the cla/signed label Sep 20, 2024
@icinga-probot icinga-probot bot added area/checks Check execution and results area/distributed Distributed monitoring (master, satellites, clients) bug Something isn't working labels Sep 20, 2024
@yhabteab yhabteab merged commit 14d4dd6 into support/2.13 Sep 23, 2024
26 checks passed
@yhabteab yhabteab deleted the next-check-cluster-sync-issue-2.13 branch September 23, 2024 10:39