[Feature Request] Implement mechanism to fail stale search replicas #17032
Labels
enhancement (Enhancement or improvement to existing feature or request)
Indexing:Replication (Issues and PRs related to core replication framework, e.g. segrep)
Indexing (Indexing, Bulk Indexing and anything related to indexing)
Search:Performance
Is your feature request related to a problem? Please describe
Currently, there is no mechanism to automatically fail search replicas that have fallen significantly behind the primary shard and become stale. As a result, stale replicas can return outdated data, leading to inconsistent search results.
Describe the solution you'd like
With Issue #16801, we are proposing to redefine how lag is computed. Lag will be defined as the difference between the current time and the timestamp of the latest received checkpoint (cp). With this change, lag is no longer measured by comparing the replica directly against the primary shard, so we need a mechanism to monitor lag on the search replica shards themselves. If a search replica's lag exceeds a predefined threshold, the replica should be marked as stale and automatically failed.
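To make the proposal concrete, here is a minimal sketch of the lag check described above. It is not OpenSearch code: the class `SearchReplicaStalenessMonitor`, its methods, and the choice to treat a lag exactly equal to the threshold as not stale are all hypothetical and used only for illustration. It also assumes the "timestamp of the latest received checkpoint" is recorded by the replica when the checkpoint arrives; Issue #16801 may define it differently.

```java
import java.time.Duration;
import java.time.Instant;

/**
 * Hypothetical sketch of a per-replica staleness check.
 * None of these names exist in OpenSearch; they only illustrate the idea.
 */
final class SearchReplicaStalenessMonitor {

    // Assumed: a single, predefined lag threshold for this replica.
    private final Duration lagThreshold;

    // Timestamp associated with the latest checkpoint this replica received.
    private volatile Instant latestCheckpointTimestamp;

    SearchReplicaStalenessMonitor(Duration lagThreshold, Instant initialTimestamp) {
        this.lagThreshold = lagThreshold;
        this.latestCheckpointTimestamp = initialTimestamp;
    }

    /** Record the timestamp of a newly received checkpoint. */
    void onCheckpointReceived(Instant checkpointTimestamp) {
        this.latestCheckpointTimestamp = checkpointTimestamp;
    }

    /** Lag = current time minus the latest received checkpoint's timestamp. */
    Duration lag(Instant now) {
        return Duration.between(latestCheckpointTimestamp, now);
    }

    /** A replica is considered stale once its lag strictly exceeds the threshold. */
    boolean isStale(Instant now) {
        return lag(now).compareTo(lagThreshold) > 0;
    }
}
```

A periodic task could call `isStale(Instant.now())` for each search replica and trigger the shard-failure path when it returns `true`. How the replica is actually failed is left open here, since the issue does not specify it.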
Related component
Search:Performance
Describe alternatives you've considered
No response
Additional context
No response