Only run sidecardb change detection on serving primary tablets #17051

GuptaManan100 · 2024-10-23T07:34:37Z

Description

We noticed that the sidecardb logic that detects schema changes in Vitess internal tables runs on both transitions: going to primary serving and going to primary non-serving. We think it should only run when a primary is transitioning to the serving state, for the following reasons:

1. DemotePrimary is already quite a heavy operation that sometimes times out, so we shouldn't do more work on this call.
2. If a primary tablet already applied the DDL changes when it went into the serving state, then there should be no DDLs pending when it demotes itself. The check for a schema diff is therefore wasted effort at that point.
3. In EmergencyReparentShard, we demote the primary in parallel with stopping replication on replicas. This means that even if a DDL were to run, the query could theoretically block on semi-sync (it is a race), and that would fail DemotePrimary too.

This PR passes in the serving state that we are transitioning to and makes the sidecardb code run only for transitions to serving primary.
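The gating described above can be sketched as follows. This is a minimal illustration, not the actual Vitess code: `TabletType`, `schemaEngine`, `shouldSyncSidecarDB`, and `transition` are hypothetical names invented for this example, and the counter stands in for the real schema diff and DDL work.

```go
package main

import "fmt"

// TabletType is a hypothetical stand-in for the tablet type enum.
type TabletType int

const (
	Replica TabletType = iota
	Primary
)

// schemaEngine is a toy model of a component that owns sidecar schema sync.
// It only counts how often the (expensive) sync would have run.
type schemaEngine struct {
	syncCalls int
}

// shouldSyncSidecarDB reports whether sidecar change detection should run
// for a transition to the given tablet type and serving state.
// Only a primary that is becoming serving qualifies.
func shouldSyncSidecarDB(tabletType TabletType, serving bool) bool {
	return tabletType == Primary && serving
}

// transition models a state change. The desired serving state is passed in
// explicitly so the engine can skip the sync on demotion.
func (e *schemaEngine) transition(tabletType TabletType, serving bool) {
	if shouldSyncSidecarDB(tabletType, serving) {
		e.syncCalls++ // placeholder for the schema diff and any pending DDLs
	}
}

func main() {
	e := &schemaEngine{}
	e.transition(Primary, true)  // promotion to serving primary: sync runs
	e.transition(Primary, false) // demotion (non-serving primary): skipped
	e.transition(Replica, true)  // serving replica: skipped
	fmt.Println("sidecar sync runs:", e.syncCalls) // prints "sidecar sync runs: 1"
}
```

Passing the target serving state as an argument, rather than reading it from shared state, keeps the decision local to the transition being made.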

Related Issue(s)

Fixes Bug Report: We run schema diff queries and changes even when a primary is going non-serving. #17060

Checklist

"Backport to:" labels have been added if this change should be back-ported to release branches
If this change is to be back-ported to previous releases, a justification is included in the PR description
Tests were added or are not required
Did the new or modified tests pass consistently locally and on CI?
Documentation was added or is not required

Deployment Notes

Signed-off-by: Manan Gupta <manan@planetscale.com>

vitess-bot · 2024-10-23T07:34:40Z

codecov · 2024-10-23T07:55:37Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 67.15%. Comparing base (17607fa) to head (2ad880e).
Report is 3 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main   #17051      +/-   ##
==========================================
+ Coverage   67.14%   67.15%   +0.01%     
==========================================
  Files        1571     1571              
  Lines      252060   252061       +1     
==========================================
+ Hits       169249   169282      +33     
+ Misses      82811    82779      -32


arthurschreiber · 2024-10-23T10:35:36Z

I agree this is a sensible change. 👍

3. In EmergencyReparentShard, we demote the primary in parallel with stopping replication on replicas. This means that if even if there were to happen a DDL, the query could just theoretically block on semi-sync (it is a race), and that would fail DemotePrimary too.

Does this qualify this change for backporting to older releases?

shlomi-noach

This absolutely must only ever run on a Primary.

go/vt/vttablet/tabletserver/schema/engine_test.go

GuptaManan100 · 2024-10-24T01:29:05Z

Does this qualify this change for backporting to older releases?

I don't think so. During ERS, we already have writes incoming from the user too, which can potentially block on semi-sync. So, I think that problem is inherently there. I think it's a good idea to prevent DDLs like these from blocking as well, but I don't think it's serious enough to warrant backports.

feat: only run sidecardb change detection on serving primary tablets

061d520

Signed-off-by: Manan Gupta <manan@planetscale.com>

GuptaManan100 added the Type: Enhancement and Component: schema management labels Oct 23, 2024

GuptaManan100 requested review from harshit-gangal, systay, shlomi-noach, rohit-nayak-ps and timvaillancourt as code owners October 23, 2024 07:34

github-actions bot added this to the v22.0.0 milestone Oct 23, 2024

test: fix test expectations

2ad880e

Signed-off-by: Manan Gupta <manan@planetscale.com>

GuptaManan100 requested a review from deepthi as a code owner October 23, 2024 10:14

arthurschreiber approved these changes Oct 23, 2024

shlomi-noach approved these changes Oct 23, 2024

rohit-nayak-ps reviewed Oct 23, 2024

go/vt/vttablet/tabletserver/schema/engine_test.go (resolved)

rohit-nayak-ps added the NeedsIssue label Oct 23, 2024

rohit-nayak-ps approved these changes Oct 23, 2024

GuptaManan100 added the Type: Bug label and removed the NeedsIssue label Oct 24, 2024

GuptaManan100 merged commit be0bca3 into vitessio:main Oct 24, 2024
100 of 101 checks passed

GuptaManan100 deleted the fix-sidecardb-call branch October 24, 2024 01:33
