-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Only run sidecardb change detection on serving primary tablets #17051
Conversation
Signed-off-by: Manan Gupta <manan@planetscale.com>
Review ChecklistHello reviewers! 👋 Please follow this checklist when reviewing this Pull Request. General
Tests
Documentation
New flags
If a workflow is added or modified:
Backward compatibility
|
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #17051 +/- ##
==========================================
+ Coverage 67.14% 67.15% +0.01%
==========================================
Files 1571 1571
Lines 252060 252061 +1
==========================================
+ Hits 169249 169282 +33
+ Misses 82811 82779 -32 ☔ View full report in Codecov by Sentry. |
Signed-off-by: Manan Gupta <manan@planetscale.com>
I agree this is a sensible change. 👍
Does this qualify this change for backporting to older releases? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This absolutely must only ever run on a Primary.
I don't think so. During ERS, we already have writes incoming from the user too, which can potentially block on semi-sync. So, I think that problem is inherently there. I think it's a good idea to prevent DDLs like these too from blocking too, but I don't think it's serious enough to warrant backports. |
Description
We noticed that the sidecardb logic to detect schema changes in vitess internal tables is running on both the transitions of going to primary serving and primary non-serving. We think it would be a good idea to only do this when a primary is transitioning to serving state. A couple of reasons for it -
DemotePrimary
is already quite a heavy operation that sometimes times out, so we shouldn't do more work on this call.EmergencyReparentShard
, we demote the primary in parallel with stopping replication on replicas. This means that if even if there were to happen a DDL, the query could just theoretically block on semi-sync (it is a race), and that would failDemotePrimary
too.This PR makes the change of passing in the desired serving state that we are transitioning to and makes the sidecardb code only run for serving primary transitions.
Related Issue(s)
Checklist
Deployment Notes