mango: introduce strict index selection
Letting the Mango query planner assign an index to the selector does
not always benefit performance.  User-specified indexes may help, but
since they are only hints for the planner, automated overrides can
still happen.

Introduce the concept of "strict index selection", which lets the
user request the exclusive use of a specific index.  When that is
not possible, give up on planning and return an HTTP 400 response
right away.  This way the user has the chance to learn about the
missing index, request its creation and try again later.

The feature comes with a configuration toggle.  By default, the
feature is disabled to maintain backward compatibility, but the
user may ask for this behavior via the new `use_index_strict`
query parameter for a specific query.  Conversely, setting
`use_index_strict` to false disables strict index selection for a
single query when it is enabled on the server.

Fixes apache#4511
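
The precedence between the server setting and the per-query parameter described in the commit message can be sketched in Python as follows. This is only an illustration, not the Erlang implementation: `resolve_strict` and its arguments are hypothetical names, and `None` stands for a request that does not carry `use_index_strict`.

```python
from typing import Optional


def resolve_strict(per_query: Optional[bool], server_setting: bool = False) -> bool:
    """Return the effective strictness for one query (hypothetical helper).

    per_query is the value of `use_index_strict` in the request, or None
    when the request does not set it; server_setting stands in for the
    `strict_index_selection` configuration value.
    """
    if per_query is None:
        # Nothing requested for this query: fall back to the server setting.
        return server_setting
    # An explicit per-query value wins in both directions.
    return per_query


if __name__ == "__main__":
    for per_query in (None, True, False):
        for server_setting in (False, True):
            print(per_query, server_setting, resolve_strict(per_query, server_setting))
```

The four combinations exercised by the unit tests in this commit (disabled or enabled on the server, requested or not requested by the query) correspond to the cases this sketch distinguishes.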
pgj committed Jul 30, 2023
1 parent 1f2994e commit d6f915c
Showing 9 changed files with 189 additions and 30 deletions.
6 changes: 6 additions & 0 deletions rel/overlay/etc/default.ini
@@ -496,6 +496,12 @@ authentication_db = _users
; the warning.
;index_scan_warning_threshold = 10

; Set to true to make index selection strict. Indexes must always be
; specified by the use_index parameter; they are not selected automatically
; for the query. An error is returned if the user-specified index is not
; found. This can be overridden (inverted) ad hoc by the query parameters.
;strict_index_selection = false

[indexers]
couch_mrview = true

58 changes: 35 additions & 23 deletions src/docs/src/api/database/find.rst
@@ -45,6 +45,10 @@
:<json string|array use_index: Instruct a query to use a specific index.
Specified either as ``"<design_document>"`` or
``["<design_document>", "<index_name>"]``. *Optional*
:<json boolean use_index_strict: Do not perform the query unless
the index specified by ``use_index`` is found and suitable.
The use of this parameter implies the use of
``use_index``. *Optional*
:<json boolean conflicts: Include conflicted documents if ``true``.
Intended use is to easily find conflicted documents, without an
index or view. Default is ``false``. *Optional*
@@ -1441,13 +1445,12 @@ it easier to take advantage of future improvements to query planning
Transfer-Encoding: chunked
{
"covered": false,
"dbname": "movies",
"index": {
"ddoc": "_design/0d61d9177426b1e2aa8d0fe732ec6e506f5d443c",
"name": "0d61d9177426b1e2aa8d0fe732ec6e506f5d443c",
"partitioned": false,
"type": "json",
"partitioned": false,
"def": {
"fields": [
{
@@ -1456,12 +1459,15 @@ it easier to take advantage of future improvements to query planning
]
}
},
"partitioned": false,
"selector": {
"year": {
"$gt": 2010
}
},
"opts": {
"use_index": [],
"use_index_strict": false,
"bookmark": "nil",
"limit": 2,
"skip": 0,
@@ -1472,30 +1478,13 @@ it easier to take advantage of future improvements to query planning
"year",
"title"
],
"partition": "",
"r": 1,
"conflicts": false,
"execution_stats": false,
"partition": "",
"stable": false,
"stale": false,
"update": true,
"use_index": []
},
"mrargs": {
"conflicts": "undefined",
"direction": "fwd",
"end_key": [
"<MAX>"
],
"include_docs": true,
"partition": null,
"reduce": false,
"stable": false,
"start_key": [
2010
],
"update": true,
"view_type": "map"
"execution_stats": false
},
"limit": 2,
"skip": 0,
@@ -1505,14 +1494,32 @@ it easier to take advantage of future improvements to query planning
"year",
"title"
],
"partitioned": false
"mrargs": {
"include_docs": true,
"view_type": "map",
"reduce": false,
"partition": null,
"start_key": [
2010
],
"end_key": [
"<MAX>"
],
"direction": "fwd",
"stable": false,
"update": true,
"conflicts": "undefined"
},
"covered": false
}
Index selection
===============

:ref:`_find <api/db/_find>` chooses which index to use for responding
to a query, unless you specify an index at query time.
to a query, unless you specify an index at query time. Note that this
is just a hint for the query planner and as such it might be ignored
if the named index is not suitable for performing the query.

The query planner looks at the selector section and finds the index with the
closest match to operators and fields used in the query. If there are two
@@ -1525,3 +1532,8 @@ the index with the first alphabetical name is chosen.
It is good practice to specify indexes explicitly in your queries. This
prevents existing queries being affected by new indexes that might get added
in a production environment.

Setting the ``use_index_strict`` parameter to ``true`` disables the
automated query planning. In that case, one must specify an index,
which will be checked for usability, and an error is returned when it
is not specified or not usable.
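
As a rough illustration of the request described above, a strict query body could be assembled like this. The helper `build_find_body` and the index names are made up for the example; only the `selector`, `use_index`, and `use_index_strict` fields come from the documentation.

```python
import json
from typing import Any, Dict, List, Optional, Union


def build_find_body(
    selector: Dict[str, Any],
    use_index: Optional[Union[str, List[str]]] = None,
    use_index_strict: Optional[bool] = None,
) -> str:
    """Serialize a `_find` request body (hypothetical helper).

    Optional fields are left out entirely when they are not given.
    """
    body: Dict[str, Any] = {"selector": selector}
    if use_index is not None:
        body["use_index"] = use_index
    if use_index_strict is not None:
        body["use_index_strict"] = use_index_strict
    return json.dumps(body)


# "ddoc-by-year" and "year-index" are placeholder names, not real indexes.
payload = build_find_body(
    {"year": {"$gt": 2010}},
    use_index=["ddoc-by-year", "year-index"],
    use_index_strict=True,
)
print(payload)
```

According to this change, posting such a body when the named index is missing or unsuitable should yield an HTTP 400 response with the `invalid_index` error instead of a fallback to another index.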
17 changes: 17 additions & 0 deletions src/docs/src/config/query-servers.rst
@@ -309,3 +309,20 @@ Mango is the Query Engine that services the :ref:`_find <api/db/_find>`, endpoin

[mango]
index_scan_warning_threshold = 10

.. config:option:: strict_index_selection :: Strict index selection.
.. versionadded:: 3.4

Make the index selection strict. The query planner will not
try to find a suitable index automatically but use the
user-specified index and give up immediately in case of
failure. This implicitly makes the use of the ``use_index``
query parameter mandatory. Using the ``use_index_strict``
Boolean parameter, this behavior may be adjusted per query.

Defaults to ``false``.
::

[mango]
strict_index_selection = false
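
To make the default concrete, the following sketch reads the option from an ini fragment with Python's standard `configparser`. It only illustrates the "absent means false" behavior; it is not how CouchDB loads its configuration, and `strict_from_ini` and the sample text are hypothetical.

```python
import configparser

# Hypothetical ini fragment that turns the feature on.
SAMPLE = """
[mango]
; enable strict index selection server-wide
strict_index_selection = true
"""


def strict_from_ini(text: str) -> bool:
    """Return the strict_index_selection value, or False when it is absent."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    # fallback covers both a missing [mango] section and a missing option.
    return parser.getboolean("mango", "strict_index_selection", fallback=False)


if __name__ == "__main__":
    print(strict_from_ini(SAMPLE))
    print(strict_from_ini("[mango]\n"))
```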
87 changes: 80 additions & 7 deletions src/mango/src/mango_cursor.erl
@@ -47,10 +47,26 @@
create(Db, Selector0, Opts) ->
Selector = mango_selector:normalize(Selector0),
UsableIndexes = mango_idx:get_usable_indexes(Db, Selector, Opts),
case mango_cursor:maybe_filter_indexes_by_ddoc(UsableIndexes, Opts) of
case maybe_filter_indexes_by_ddoc(UsableIndexes, Opts) of
[] ->
% use_index doesn't match a valid index - fall back to a valid one
create_cursor(Db, UsableIndexes, Selector, Opts);
% use_index doesn't match a valid index - determine how
% this shall be handled by the settings
case (use_index_strict(Opts)) of
true ->
% return an error
Details =
case use_index(Opts) of
[] ->
[];
UseIndex ->
[DesignId | Rest] = UseIndex,
[ddoc_name(DesignId) | Rest]
end,
?MANGO_ERROR({invalid_index, Details});
false ->
% fall back to a valid one
create_cursor(Db, UsableIndexes, Selector, Opts)
end;
UserSpecifiedIndex ->
create_cursor(Db, UserSpecifiedIndex, Selector, Opts)
end.
@@ -93,13 +109,25 @@ execute(#cursor{index = Idx} = Cursor, UserFun, UserAcc) ->
Mod = mango_idx:cursor_mod(Idx),
Mod:execute(Cursor, UserFun, UserAcc).

use_index(Opts) ->
{use_index, UseIndex} = lists:keyfind(use_index, 1, Opts),
UseIndex.

use_index_strict(Opts) ->
case lists:keyfind(use_index_strict, 1, Opts) of
{use_index_strict, ByQuery} ->
ByQuery;
false ->
config:get_boolean("mango", "strict_index_selection", false)
end.

maybe_filter_indexes_by_ddoc(Indexes, Opts) ->
case lists:keyfind(use_index, 1, Opts) of
{use_index, []} ->
case use_index(Opts) of
[] ->
[];
{use_index, [DesignId]} ->
[DesignId] ->
filter_indexes(Indexes, DesignId);
{use_index, [DesignId, ViewName]} ->
[DesignId, ViewName] ->
filter_indexes(Indexes, DesignId, ViewName)
end.

@@ -263,3 +291,48 @@ ddoc_name(<<"_design/", Name/binary>>) ->
Name;
ddoc_name(Name) ->
Name.

-ifdef(TEST).
-include_lib("couch/include/couch_eunit.hrl").

use_index_strict_test_() ->
{
foreach,
fun() ->
meck:new(config)
end,
fun(_) ->
meck:unload(config)
end,
[
?TDEF_FE(t_use_index_strict_disabled_not_requested),
?TDEF_FE(t_use_index_strict_disabled_requested),
?TDEF_FE(t_use_index_strict_enabled_not_requested),
?TDEF_FE(t_use_index_strict_enabled_requested)
]
}.

t_use_index_strict_disabled_not_requested(_) ->
meck:expect(config, get_boolean, ["mango", "strict_index_selection", '_'], meck:val(false)),
Options1 = [{use_index_strict, false}],
?assertEqual(false, use_index_strict(Options1)),
Options2 = [],
?assertEqual(false, use_index_strict(Options2)).

t_use_index_strict_disabled_requested(_) ->
meck:expect(config, get_boolean, ["mango", "strict_index_selection", '_'], meck:val(false)),
Options = [{use_index_strict, true}],
?assertEqual(true, use_index_strict(Options)).

t_use_index_strict_enabled_not_requested(_) ->
meck:expect(config, get_boolean, ["mango", "strict_index_selection", '_'], meck:val(true)),
Options1 = [{use_index_strict, false}],
?assertEqual(false, use_index_strict(Options1)),
Options2 = [],
?assertEqual(true, use_index_strict(Options2)).

t_use_index_strict_enabled_requested(_) ->
meck:expect(config, get_boolean, ["mango", "strict_index_selection", '_'], meck:val(true)),
Options = [{use_index_strict, true}],
?assertEqual(true, use_index_strict(Options)).
-endif.
22 changes: 22 additions & 0 deletions src/mango/src/mango_error.erl
@@ -48,6 +48,28 @@ info(mango_json_bookmark, {invalid_bookmark, BadBookmark}) ->
<<"invalid_bookmark">>,
fmt("Invalid bookmark value: ~s", [?JSON_ENCODE(BadBookmark)])
};
info(mango_cursor, {invalid_index, []}) ->
{
400,
<<"invalid_index">>,
<<"You must specify an index with the `use_index` parameter.">>
};
info(mango_cursor, {invalid_index, [DDocName]}) ->
{
400,
<<"invalid_index">>,
fmt("_design/~s specified by `use_index` could not be found or it is not suitable.", [
DDocName
])
};
info(mango_cursor, {invalid_index, [DDocName, ViewName]}) ->
{
400,
<<"invalid_index">>,
fmt("_design/~s, ~s specified by `use_index` could not be found or it is not suitable.", [
DDocName, ViewName
])
};
info(mango_cursor_text, {invalid_bookmark, BadBookmark}) ->
{
400,
6 changes: 6 additions & 0 deletions src/mango/src/mango_opts.erl
@@ -91,6 +91,12 @@ validate_find({Props}) ->
{default, []},
{validator, fun validate_use_index/1}
]},
{<<"use_index_strict">>, [
{tag, use_index_strict},
{optional, true},
{default, false},
{validator, fun mango_opts:is_boolean/1}
]},
{<<"bookmark">>, [
{tag, bookmark},
{optional, true},
1 change: 1 addition & 0 deletions src/mango/test/02-basic-find-test.py
@@ -310,6 +310,7 @@ def test_explain_options(self):
assert opts["stale"] == False
assert opts["update"] == True
assert opts["use_index"] == []
assert opts["use_index_strict"] == False

def test_sort_with_all_docs(self):
explain = self.db.find(
19 changes: 19 additions & 0 deletions src/mango/test/05-index-selection-test.py
@@ -211,6 +211,25 @@ def test_explain_sort_reverse(self):
)
self.assertEqual(resp_explain["index"]["type"], "json")

def test_strict_index_selection(self):
with self.subTest(with_use_index=True):
try:
self.db.find(
{"manager": True}, use_index="invalid", use_index_strict=True
)
except Exception as e:
self.assertEqual(e.response.status_code, 400)
else:
raise AssertionError("did not fail on invalid index")

with self.subTest(with_use_index=False):
try:
self.db.find({"manager": True}, use_index_strict=True)
except Exception as e:
self.assertEqual(e.response.status_code, 400)
else:
raise AssertionError("did not fail due to missing use_index")


class JSONIndexSelectionTests(mango.UserDocsTests, IndexSelectionTests):
@classmethod
3 changes: 3 additions & 0 deletions src/mango/test/mango.py
@@ -248,6 +248,7 @@ def find(
update=True,
executionStats=False,
partition=None,
use_index_strict=None,
):
body = {
"selector": selector,
@@ -267,6 +268,8 @@ def find(
body["update"] = False
if executionStats == True:
body["execution_stats"] = True
if use_index_strict is not None:
body["use_index_strict"] = use_index_strict
body = json.dumps(body)
if partition:
ppath = "_partition/{}/".format(partition)