feat: Enable secondary index for compound filter conditions #3417

islamaliev · 2025-01-30T16:25:34Z

Relevant issue(s)

Resolves #3299

Description

Utilize secondary indexes even when compound filter conditions are present.
For this to work, a new filter-traversal utility function is introduced that can be configured for different needs.
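
As a rough illustration of what a configurable filter traversal can look like, here is a minimal sketch. It assumes the filter is a tree of nested maps with `_and`, `_or` and `_not` as compound operators. The names `traverseFilter` and `visitFunc` are hypothetical and are not taken from `internal/planner/filter/traverse.go`.

```go
package main

import "fmt"

// visitFunc is a hypothetical callback invoked for every leaf condition.
// path holds the field names leading to the condition, op is the operator
// (for example "_eq") and val is its operand. Returning false stops the walk.
type visitFunc func(path []string, op string, val any) bool

// traverseFilter walks a filter expressed as nested maps and descends into
// the compound operators. It returns false if the visitor stopped early.
func traverseFilter(filter map[string]any, visit visitFunc) bool {
	return traverse(nil, filter, visit)
}

func traverse(path []string, node map[string]any, visit visitFunc) bool {
	for key, val := range node {
		switch key {
		case "_and", "_or":
			// Compound operators hold a list of sub-filters.
			subs, _ := val.([]any)
			for _, sub := range subs {
				if m, ok := sub.(map[string]any); ok && !traverse(path, m, visit) {
					return false
				}
			}
		case "_not":
			if m, ok := val.(map[string]any); ok && !traverse(path, m, visit) {
				return false
			}
		default:
			if len(key) > 0 && key[0] == '_' {
				// Any other operator is a leaf condition on the current path.
				if !visit(path, key, val) {
					return false
				}
				continue
			}
			// Otherwise the key is a field name: descend into its conditions.
			if m, ok := val.(map[string]any); ok {
				next := append(append([]string{}, path...), key)
				if !traverse(next, m, visit) {
					return false
				}
			}
		}
	}
	return true
}

func main() {
	filter := map[string]any{
		"_and": []any{
			map[string]any{"name": map[string]any{"_eq": "Islam"}},
			map[string]any{"age": map[string]any{"_gt": 30}},
		},
	}
	// Collect the top-level fields referenced anywhere in the filter,
	// which is the kind of question an index selector might ask.
	fields := map[string]bool{}
	traverseFilter(filter, func(path []string, op string, val any) bool {
		if len(path) > 0 {
			fields[path[0]] = true
		}
		return true
	})
	fmt.Println(fields)
}
```

The callback is what makes the walk reusable in this sketch: one caller can collect indexed fields while another stops at the first unsupported operator.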

Now that indexes are exposed to more complex conditions, they started to produce more false-positive docs that weren't checked by the filter, because the index fetcher was not part of the new fetcher chain.

Make the index fetcher implement the new fetcher interface so that the documents it fetches can be checked against the scanner filter and permissions.
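
To make the chaining idea concrete, here is a minimal sketch under stated assumptions. The `fetcher` interface, `indexFetcher` and `filteredFetcher` below are simplified, hypothetical stand-ins and do not reproduce the interfaces in `internal/db/fetcher`. The sketch only shows why an index fetcher that shares the chain's interface can have its false positives removed by a later link.

```go
package main

import "fmt"

// doc is a hypothetical, simplified document.
type doc map[string]any

// fetcher is a hypothetical stand-in for a chainable fetcher interface.
// next returns the next document, or false once the source is exhausted.
type fetcher interface {
	next() (doc, bool)
}

// indexFetcher yields candidate documents located through an index.
// The candidates may contain false positives.
type indexFetcher struct {
	candidates []doc
	pos        int
}

func (f *indexFetcher) next() (doc, bool) {
	if f.pos >= len(f.candidates) {
		return nil, false
	}
	d := f.candidates[f.pos]
	f.pos++
	return d, true
}

// filteredFetcher wraps any fetcher and drops documents that fail the
// predicate, which stands in for the scanner filter or a permission check.
type filteredFetcher struct {
	source fetcher
	passes func(doc) bool
}

func (f *filteredFetcher) next() (doc, bool) {
	for {
		d, ok := f.source.next()
		if !ok {
			return nil, false
		}
		if f.passes(d) {
			return d, true
		}
	}
}

// collect drains a fetcher into a slice.
func collect(f fetcher) []doc {
	var out []doc
	for d, ok := f.next(); ok; d, ok = f.next() {
		out = append(out, d)
	}
	return out
}

func main() {
	index := &indexFetcher{candidates: []doc{{"age": 30}, {"age": 40}, {"name": "Andy"}}}
	chain := &filteredFetcher{
		source: index,
		passes: func(d doc) bool { return d["age"] == 40 },
	}
	fmt.Println(len(collect(chain)))
}
```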

Change the behavior of connor to recognize whether a field exists. This is needed to distinguish whether the _ne filter returns false because the 2 values are different or because the document doesn't have the field.
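
The distinction can be sketched as follows. This is a hypothetical illustration, not connor's actual API: `matchResult` and `matchEq` are invented names, and the documents are plain maps. The point is only that a second result value lets the caller tell "values differ" apart from "field is missing". What a caller then does with a missing field is deliberately left open.

```go
package main

import (
	"fmt"
	"reflect"
)

// matchResult is a hypothetical result type carrying both the outcome of the
// comparison and whether the field was present in the document at all.
type matchResult struct {
	matched     bool
	fieldExists bool
}

// matchEq compares doc[field] with want and also reports field presence.
func matchEq(doc map[string]any, field string, want any) matchResult {
	got, ok := doc[field]
	if !ok {
		// The field is absent: no comparison took place.
		return matchResult{}
	}
	return matchResult{matched: reflect.DeepEqual(got, want), fieldExists: true}
}

func main() {
	withField := map[string]any{"age": 30}
	withoutField := map[string]any{"name": "Shahzad"}

	// Both comparisons report matched == false, but only the first one
	// actually compared two different values.
	fmt.Printf("%+v\n", matchEq(withField, "age", 31))
	fmt.Printf("%+v\n", matchEq(withoutField, "age", 31))
}
```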

Make the fieldFetched explain metric count all fields fetched, not only the fields that were requested.

AndrewSisley

I've only reviewed the first few files and need to go and eat :)

Looks good so far, just documentation requests.

internal/connor/key.go

internal/connor/eq.go

internal/connor/connor.go

internal/db/fetcher/indexer.go

internal/db/collection.go

internal/db/fetcher/wrapper.go

…check-compound-filter-condition

AndrewSisley

Is looking good, I'm continuing my review but it is taking a while so I thought I'd submit the outstanding comments now.

internal/connor/connor.go

internal/db/fetcher/indexer.go

internal/db/fetcher/indexer_iterators.go

AndrewSisley · 2025-02-04T18:45:23Z

internal/db/fetcher/indexer_matchers.go

+		var matcher valueMatcher
+		// we have a separate branch for null matcher because default matching behavior
+		// is what we need: for filter `_ne: null` it will match all non-null values
+		if v.IsNull() {


suggestion: This would be a lot more readable if you returned early instead of the current if-else-if-else nesting.

The current format forces the reader to read the entire function if they only care about a single if block, for example if I am debugging an issue with _, ok := v.Number(), I am forced to read through and check that matcher is not later overwritten or otherwise interacted with, whereas if it returned early I could instead leave this function and proceed further with my investigation.

for example:

if v.IsNull() {
	return &jsonNullMatcher{matchNull: condition.op == opEq}
}
if jsonVal, ok := v.Number(); ok {
	return &jsonComparingMatcher[float64]{
		value:        jsonVal,
		getValueFunc: func(j client.JSON) (float64, bool) { return j.Number() },
		evalFunc:     getCompareValsFunc[float64](condition.op),
	}
}
... etc

The reason why it can't be early-returned can be seen below: the matcher can be wrapped into another matcher.

The json null matcher can be early-returned, but it would be the only place where a matcher is returned this way in this function, and it would make the function look inconsistent.

I don't see any problem here with debugging.

I could extract some of the blocks in this function into smaller functions, which would make it more readable, and this is what I usually prefer to do, but considering that the majority of the team prefers to have larger functions I decided to write it like this.

Leaving it as is, unless you'd prefer to have 2 more smaller functions.

I don't see any problem here with debugging.

When I debug an issue (not with a literal code-executing-debugger-tool - I almost never use them), I read the code. If I come across a variable of interest and want to track it over its lifetime, my job is made much easier by minimising the scope of variables.

Here, on any given execution, matcher is only ever interacted with twice I think - once on assignment, and once consumed by condition.op == opNe.

The number and complexity of the if-else blocks in here makes following this path fairly error prone (e.g. I made a mistake originally as you pointed out), and it takes a lot more effort than were it returned early.

Splitting into multiple functions is not free, and can result in the reader having to bounce around through the file, forgetting bits of context etc. However, in this case, given the perceived possible scope reduction, I would likely try and see what it looks like were I the author.

It is just a suggestion though. The code is readable in its current state; I think it could be made even easier to follow and less error prone.

suggestion: +1 for less nesting and returning early
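
One way to reconcile early returns with the wrapping step discussed in this thread is to split construction in two: a helper that returns early from every branch, and a caller that applies the wrapper in exactly one place. The sketch below is hypothetical. `valueMatcher` appears in the diff above, but its method set, the concrete matcher types and the `invertMatcher` wrapper are all invented here for illustration and do not mirror `indexer_matchers.go`.

```go
package main

import "fmt"

// valueMatcher is assumed here to have a single match method.
type valueMatcher interface {
	match(v any) bool
}

// nullMatcher, numberMatcher and stringMatcher are hypothetical base matchers.
type nullMatcher struct{}

func (nullMatcher) match(v any) bool { return v == nil }

type numberMatcher struct{ value float64 }

func (m numberMatcher) match(v any) bool {
	n, ok := v.(float64)
	return ok && n == m.value
}

type stringMatcher struct{ value string }

func (m stringMatcher) match(v any) bool {
	s, ok := v.(string)
	return ok && s == m.value
}

// invertMatcher is a hypothetical wrapper that negates another matcher.
type invertMatcher struct{ inner valueMatcher }

func (m invertMatcher) match(v any) bool { return !m.inner.match(v) }

// newBaseMatcher returns early from every branch, so no variable has to be
// tracked across the whole function. It returns nil for unsupported values.
func newBaseMatcher(v any) valueMatcher {
	if v == nil {
		return nullMatcher{}
	}
	if n, ok := v.(float64); ok {
		return numberMatcher{value: n}
	}
	if s, ok := v.(string); ok {
		return stringMatcher{value: s}
	}
	return nil
}

// newMatcher applies the optional wrapping in a single place.
func newMatcher(op string, v any) valueMatcher {
	base := newBaseMatcher(v)
	if base != nil && op == "_ne" {
		return invertMatcher{inner: base}
	}
	return base
}

func main() {
	fmt.Println(newMatcher("_eq", 3.0).match(3.0))
	fmt.Println(newMatcher("_ne", 3.0).match(3.0))
	fmt.Println(newMatcher("_ne", nil).match("non-null"))
}
```

The trade-off raised in the thread still applies: the reader now has two functions to visit instead of one.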

internal/db/fetcher/indexer_matchers.go

AndrewSisley

Looks good Islam! I'm nearly done reviewing but really need to eat :)

internal/db/fetcher/wrapper.go

internal/planner/filter/traverse.go

internal/planner/scan.go

internal/planner/select.go

AndrewSisley

Reviewed! Overall it looks really good Islam, the provided documentation was very useful when reading the code. I think all my requests are/were fairly localised, hopefully they all make sense to you :)

Thanks for resolving the issue.

internal/planner/type_join.go

AndrewSisley · 2025-02-04T20:01:48Z

tests/integration/index/array_test.go

@@ -167,7 +167,7 @@ func TestArrayIndex_WithFilterOnIndexedArrayUsingNone_ShouldUseIndex(t *testing.
 			},
 			testUtils.Request{
 				Request:  makeExplainQuery(req),
-				Asserter: testUtils.NewExplainAsserter().WithIndexFetches(9),
+				Asserter: testUtils.NewExplainAsserter().WithIndexFetches(0),


question: This is surprising, why has this changed? It looks like it is no longer using the index.

todo: If the change is correct, please update the name and/or document the test, as it doesn't make any sense to me atm.

how can this be a surprise to you? We had quite a discussion just recently :)

But the name of the test was incorrect. Thanks.

I hold very little information in my head at any given moment, and this and that are/were both quite large PRs :)

Re-reading that conversation, and the comment you added in that PR doesn't explain why this has changed in this PR - if anything it looks like it should have changed in the previous PR.

Why did this not change in the previous PR, and why is it changing here?

The previous PR ignored index when dealing with _none for json arrays. This time it's done for all arrays.

Thanks for explaining again islam, to be fair I had the same confusion as Andy (and went through that previous conversation you guys had about none before as well).

nitpick: Perhaps add a comment here documenting again 1) that if it is _none it doesn't make sense to use index, 2) Why that is?
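
The thread does not spell out why an index is unhelpful for _none, so the following is only a sketch of one general reason, assuming an index that stores one entry per array element. Under that assumption, a document with an empty array has no index entries at all, yet it trivially satisfies a _none condition, so an index scan alone cannot find it. The names `buildElementIndex` and `noneEqual` are hypothetical.

```go
package main

import "fmt"

// buildElementIndex builds a toy index with one entry per array element,
// mapping each element value to the IDs of the documents containing it.
func buildElementIndex(docs map[string][]int) map[int][]string {
	index := map[int][]string{}
	for id, arr := range docs {
		for _, v := range arr {
			index[v] = append(index[v], id)
		}
	}
	return index
}

// noneEqual is the ground truth for a `_none: {_eq: target}` condition:
// it holds when no element of the array equals target.
func noneEqual(arr []int, target int) bool {
	for _, v := range arr {
		if v == target {
			return false
		}
	}
	return true
}

func main() {
	docs := map[string][]int{
		"a": {1, 2},
		"b": {}, // no elements, hence no index entries
		"c": {3},
	}
	index := buildElementIndex(docs)

	// Gather every document that is reachable through some index entry.
	reachable := map[string]bool{}
	for _, ids := range index {
		for _, id := range ids {
			reachable[id] = true
		}
	}

	// "b" satisfies the condition but cannot be reached through the index.
	fmt.Println(noneEqual(docs["b"], 3), reachable["b"])
}
```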

tests/integration/index/json_test.go

tests/integration/index/query_with_index_combined_filter_test.go

tests/integration/index/query_with_index_only_filter_test.go

tests/integration/query/json/with_ne_test.go

…check-compound-filter-condition

internal/db/fetcher/indexer_matchers.go

AndrewSisley

LGTM, thanks Islam! There are a couple of open questions, and a couple of suggestions that I would appreciate resolution before merge, but none are blocking.

Code looks really good, and I am glad we are able to close this issue :)

shahzadlone

Great work, and appreciate the tests added. Also thanks for fixing the fieldFetches behavior in all the tests. Just left some suggestions and questions nothing blocking :)

internal/planner/filter/traverse.go

internal/planner/filter/traverse_test.go

internal/planner/type_join.go

shahzadlone · 2025-02-06T08:06:02Z

tests/integration/index/array_test.go

tests/integration/index/query_with_composite_index_only_filter_test.go

AndrewSisley · 2025-02-19T21:45:39Z

Bug bash result:

Unrelated bug found during testing: client.getString panics if value is not a string #3469
No related bugs found

AndrewSisley · 2025-02-19T21:46:16Z

Bug bash result: No bugs found

islamaliev and others added 30 commits January 2, 2025 10:08

Add json traversal functions

32c2440

Add GetPath method to JSON

53cb4cb

Include array element index in path

f703e3a

Fix json traversal

403c587

Add JSON and Bool encoding

1c059fb

Correctly handle paths to json nodes

c784f61

Base JSON index implementation

9855f30

Move match-related code to a file

0faa02b

Make index work for bool and string

92f7958

Add filter by json null value

516d290

Add MD file for secondary indexes

af5eba2

Add note about indexing of related docs

3dcb838

Add note about json indexing

e27d5db

Enable filtering by json bool and string

7e00694

Add unique json index

40341a0

Filter by array elements

61a7b90

Fix _in/_nin filter for json docs

32ef7bc

Add filtering on arrays of json docs

fc0eb2b

Remove filtering without array elements

70f8651

Add tests for composite index with json

cdb9d34

Enable indexing of array within json docs

adb71d4

Enable json array traversal to only top level elements

bb67d2f

Fix lint

8f24c04

Update docs

279bb69

Fix test expectations

343f5fc

Add change detector note

a56d3bf

Polish

b31c6c0

Update documentation

69a429b

Update documentation

74605db

Rename

efef1b1

islamaliev added 3 commits January 31, 2025 14:46

Add note for change detector

e9ffb04

Adjust tests

ca30e90

Adjust tests

b33415c

AndrewSisley requested changes Feb 4, 2025

islamaliev added 2 commits February 4, 2025 11:52

PR fixup

a56180c

Merge remote-tracking branch 'upstream/develop' into feat/make-index-…

cdaf7c4

…check-compound-filter-condition

islamaliev requested a review from AndrewSisley February 4, 2025 10:53

Fix lint

91ac5a9

AndrewSisley requested changes Feb 4, 2025

islamaliev added 8 commits February 5, 2025 08:06

Break up tests

8b76f91

Break up tests

fee6114

Break up tests

1308d46

Polish

0ede7ed

Polish

5f3b478

Extract some functions

c9de170

Remove unused if-else branch

f4693b1

Merge remote-tracking branch 'upstream/develop' into feat/make-index-…

cfb7bc2

…check-compound-filter-condition

islamaliev requested a review from AndrewSisley February 5, 2025 15:20

AndrewSisley reviewed Feb 5, 2025

AndrewSisley approved these changes Feb 5, 2025

Polish

0f3550e

shahzadlone approved these changes Feb 6, 2025

islamaliev added 3 commits February 6, 2025 11:25

Add comments

4be111f

Extract some blocks to functions

550a974

Lint fix

208ba81

islamaliev merged commit d6b003b into sourcenetwork:develop Feb 6, 2025
42 of 43 checks passed
