-
Notifications
You must be signed in to change notification settings - Fork 55
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Enable secondary index for compound filter conditions #3417
feat: Enable secondary index for compound filter conditions #3417
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've only reviewed the first few files and need to go and eat :)
Looks good so far, just documentation requests.
…check-compound-filter-condition
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is looking good, I'm continuing my review but it is taking a while so I thought I'd submit the outstanding comments now.
var matcher valueMatcher | ||
// we have a separate branch for null matcher because default matching behavior | ||
// is what we need: for filter `_ne: null` it will match all non-null values | ||
if v.IsNull() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: This would be a lot more readable if you returned early instead of the current if-else-if-else nesting.
The current format forces the reader to read the entire function if they only care about a single if block, for example if I am debugging an issue with _, ok := v.Number()
, I am forced to read through and check that matcher
is not later overwritten or otherwise interacted with, whereas if it returned early I could instead leave this function and proceed further with my investigation.
for example:
if v.IsNull() {
return &jsonNullMatcher{matchNull: condition.op == opEq}
}
if jsonVal, ok := v.Number(); ok {
return &jsonComparingMatcher[float64]{
value: jsonVal,
getValueFunc: func(j client.JSON) (float64, bool) { return j.Number() },
evalFunc: getCompareValsFunc[float64](condition.op),
}
}
... etc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the reason why it can't be early-returned can be seen below: the matcher can be wrapped into another matcher.
The json null matcher can be early-returned, but it would be the only place the is returned this way in this function and will make it look inconsistent.
I don't see any problem here with debugging.
I could extract some of the blocks in this function into smaller functions which will make it more readable and this is what I usually prefer to do, but considering that majority of the team prefers to have larger functions I decided to write it like this.
Leaving it as is, unless you'd prefer to have 2 more smaller functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see any problem here with debugging.
When I debug an issue (not with a literal code-executing-debugger-tool - I almost never use them), I read the code, if I come across a variable of interest and want to track it over it's lifetime my job is made much easier by minimising the scope of variables.
Here, on any given execution, matcher is only ever interacted with twice I think - once on assignment, and once consumed by condition.op == opNe
.
The number and complexity of the if-else blocks in here makes following this path fairly error prone (e.g. I made a mistake originally as you pointed out), and it takes a lot more effort than were it returned early.
Splitting into multiple functions is not free, and can result in the reader having to bounce around through the file, forgetting bits of context etc. however in this case, given the perceived possible scope reduction I would likely try and see what it looks like were I the author.
Is just a suggestion though, the code is readable in it's current state, I think it could be made even easier though and less error prone.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: +1 for less nesting and returning early
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good Islam! I'm nearly done reviewing but really need to eat :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed! Overall it looks really good Islam, the provided documentation was very useful when reading the code. I think all my requests are/were fairly localised, hopefully they all make sense to you :)
Thanks for resolving the issue.
@@ -167,7 +167,7 @@ func TestArrayIndex_WithFilterOnIndexedArrayUsingNone_ShouldUseIndex(t *testing. | |||
}, | |||
testUtils.Request{ | |||
Request: makeExplainQuery(req), | |||
Asserter: testUtils.NewExplainAsserter().WithIndexFetches(9), | |||
Asserter: testUtils.NewExplainAsserter().WithIndexFetches(0), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question: This is surprising, why has this changed? It looks like it is no longer using the index.
todo: If the change is correct, please update the name and/or document the test, as it doesnt make any sense to me atm.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how can this be a surprise to you? We had quite a discussion just recently :)
But the name of the test was incorrect. Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hold very little information in my head at any given moment, and this and that are/were both quite large PRs :)
Re-reading that conversation, and the comment you added in that PR doesn't explain why this has changed in this PR - if anything it looks like it should have changed in the previous PR.
Why did this not change in the previous PR, and why is it changing here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The previous PR ignored index when dealing with _none
for json arrays. This time it's done for all arrays.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for explaining again islam, to be fair I had the same confusion as Andy (and went through that previous conversation you guys had about none
before as well).
nitpick: Perhaps add a comment here documenting again 1) that if it is _none it doesn't make sense to use index
, 2) Why that is?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks Islam! There are a couple of open questions, and a couple of suggestions that I would appreciate resolution before merge, but none are blocking.
Code looks really good, and I am glad we are able to close this issue :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work, and appreciate the tests added. Also thanks for fixing the fieldFetches
behavior in all the tests. Just left some suggestions and questions nothing blocking :)
@@ -167,7 +167,7 @@ func TestArrayIndex_WithFilterOnIndexedArrayUsingNone_ShouldUseIndex(t *testing. | |||
}, | |||
testUtils.Request{ | |||
Request: makeExplainQuery(req), | |||
Asserter: testUtils.NewExplainAsserter().WithIndexFetches(9), | |||
Asserter: testUtils.NewExplainAsserter().WithIndexFetches(0), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for explaining again islam, to be fair I had the same confusion as Andy (and went through that previous conversation you guys had about none
before as well).
nitpick: Perhaps add a comment here documenting again 1) that if it is _none it doesn't make sense to use index
, 2) Why that is?
Bug bash result:
|
Bug bash result: No bugs found |
Relevant issue(s)
Resolves #3299
Description
Utilize secondary indexes even when compound filter conditions are present.
For this to work new filter traversing utility function is introduced that can be configured to different needs.
And not that indexes are exposed to more complex conditions they started to produce more false positive docs that weren't checked by the filter because the index fetcher was not part of the new fetcher chain.
Make index fetcher implement new fetcher interface so that the documents it fetches can be checked against the scanner filter and permissions.
Change behavior of connor to recognize if a field exists. It's need to distinguish if _ne filter returns false because 2 values are different or becuase the document doesn't have the field.
Make fieldFetched explain metric count all fields fetched, not only fields that were requested.