title | date | lastmod |
---|---|---|
Failure Detectors |
2023-01-31 |
2023-03-08 |
- Fail-stop: can reliably detect failure (achievable in synchronous model)
- Fail-noisy: can eventually detect failure (achievable in partially synchronous model)
- Fail-silent: cannot detect between a crash or omission failure (asynchronous model)
No false negatives: all failed processes are suspected Asynchrony: suspect every node to achieve completeness
No false positives: correct processes are not suspected Asynchrony: suspect 0 nodes to achieve accuracy
No correct process is ever suspected
There exists a correct process P which is never suspected by any process
After some time, strong accuracy achieved, prior to this, any behaviour possible.
- This does not satisfy weak accuracy, as before achieving strong accuracy, any behaviour allowed
After some time, weak accuracy achieved, prior to this, any behaviour possible.
Only implementable in the Synchronous system, else there will be some incorrectly suspected processes while figuring out the actual delay.
How to achieve strong accuracy? Each time p is inaccurately suspected by a correct q
- Timeout T is increased at q
- Eventually system becomes synchronous, and T becomes larger than the unknown bound δ (T>γ+δ)
- q will receive HB on time, and never suspect p again
We want all processes to detect a single and same correct process. To do so, we need to define the set of failed processes (using a FD)
- If strong accuracy, no one is ever inaccurate, reduction never spreads inaccurate Susp
- If weak accuracy, everyone is accurate about at least one process p, no one spreads inaccurate information about p
Implement S using
- Strong completeness
$\equiv$ weak completeness: if we suspect everyone (except the leader which we know is correct), every process suspects all incorrect processes - Eventual weak accuracy: if we only trust the leader, there exists 1 correct process not suspected