-
Notifications
You must be signed in to change notification settings - Fork 21
CNDB-15487: Add queryability logic to View to prevent race to discover non-queryability #2029
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…r non-queryability
Checklist before you submit for review
|
|
This test got broken: |
| // TODO why? This change reverts back to behavior from before https://github.com/datastax/cassandra/pull/1491, | ||
| // but it seems invalid. | ||
| // track sstable again: expect no rows to be read by index | ||
| // Now pass the actual sstables: expect no rows to be read by base and for the index to be non-queryable because | ||
| // preparing an sstable for rebuild removes the index from the view, which would otherwise result in | ||
| // partial results. | ||
| ((StorageAttachedIndex) cfs.getIndexManager().getIndex(indexMetadata)).getIndexContext().prepareSSTablesForRebuild(sstables); | ||
| assertEmpty(execute("SELECT * FROM %s WHERE a=1")); | ||
| assertInvalid("SELECT * FROM %s WHERE c=1"); | ||
|
|
||
| // track sstable again: expect index to remain non-queryable because it is not rebuilt (the view hasn't been updated) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jasonstack - I see that you wrote these test assertions in #309. We've flipped this logic a couple of times this year. First we went towards non-queryable with https://github.com/datastax/cassandra/pull/1491/files#diff-98107b1903c60731ef493d2e86f26460e40d2b0e60862db3a49ea27ba4e282fa. Then, we reverted it back to the #309 state when I merged #1700. You also commented that this test could be removed here: #1700 (comment).
So, my key question: do you have any concern with the design change to make the view object store the index's queryability state?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you have any concern with the design change to make the view object store the index's queryability state?
With this change, we effectively have two queryability states: one in SecondaryIndexManager (SIM) and another in View. This introduces a potential race between the two, so we need be careful the update order.
Marking queryable:
- We should mark the View first, then update SIM.
- Reason: the coordinator (which can also act as a replica) checks SIM for queryability when routing. If SIM is marked queryable before View, queries can be sent to a replica (itself) whose view isn’t actually ready, causing failures.
Marking non-queryable. There are two possible approaches
- View first, then SIM (current PR behavior) – prevents the view from serving inconsistent results. However, in-flight queries might fail if SIM is still routing but View is already non-queryable.
- SIM first, then View – avoids both inconsistencies and in-flight query failures, but adds lot more complexity to the code (It requires the in-flight query that passes SIM acquires a queryable view..).
I am +1 on keeping the code simple with current PR approach. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You also commented that this test could be removed here: #1700 (comment).
I think we can remove it and rely on CNDB side tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With this change, we effectively have two queryability states: one in SecondaryIndexManager (SIM) and another in View. This introduces a potential race between the two, so we need be careful the update order.
We already have a race. I am just proposing one way to solve it. We could alternatively make SIM get a reference to the view when checking queryability.
In general, I tend to agree with you that two ways to track the same concept is a design smell. As the new tests show, there is a race already where it is possible to get an incomplete set of sstable indexes, which is all I am trying to fix.
|
Thanks for confirming that test fails @pkolaczk. This test seems a bit contrived, but I decided to modify it to get more granular test coverage. I also posted a comment to Zhao to get his perspective. |
|
❌ Build ds-cassandra-pr-gate/PR-2029 rejected by Butler1 regressions found Found 1 new test failures
Found 2 known test failures |
| this.viewRef.set(new View(context, indices)); | ||
| // A view starts our as non-queryable because C* subsequently creates this IndexViewManager before completing | ||
| // the index build. Once the build is done, it replaces the view with a queryable one. | ||
| this.viewRef.set(new View(context, indices, false)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if it works on CNDB which skips index build on writer.
If we create index on an empty table, existing CNDB directly marks the index as queryable. But with this change, will read request fail because view.isQueryable = false?
Can you create a CNDB PR to verify if tests are working?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this comment. I was wondering if there was different logic in CNDB. I would tend to think this logic is captured in a test though, right? Here is the test pr https://github.com/riptano/cndb/pull/15531. I haven't reviewed the test failures yet, but I will now. None of them appear directly related to SAI at first glance, though.



What is the issue
Fixes https://github.com/riptano/cndb/issues/15487
What does this PR fix and why was it fixed
This PR removes a data race in the propagation of index queryability.
Description
In reviewing the code to make an index non-queryable, I noticed that we make an index non-queryable in a problematic order. The code:
cassandra/src/java/org/apache/cassandra/index/sai/StorageAttachedIndexGroup.java
Lines 394 to 401 in 4fbf95a
We update the live reference for sstables/sstable indexes and only after successfully doing that, we mark the index as non-queryable. This leaves a race where a query could get a reference to the new view of the local table's index files but the table should have been non-queryable.
The design should instead set the index as non-queryable first to prevent processing queries with an incomplete view of the data.
The net effect of the current bug is that NOT queries could get extra results and all other queries could have missing results.
The solution is to put queryability into the view. We leave the view's state untouched because it accumulates state and it would be incorrect to remove the state just because an index is (temporarily) unqueryable.