-
Hello, I am confused by the following behavior and I need help because I think I am doing something wrong. Let's say I have 100k documents in the database. The structure of a document is: {
"_id": "...",
"_rev": "...",
...
"data": {
"device":"..."
...
}
} I want to find all documents where the data.device is ARCH. So I used query: {
"execution_stats": true,
"limit": 100000,
"fields": [
"_id"
],
"selector": {
"data.device": {
"$eq": "ARCH"
}
}
} If there is no index created, it takes 14 seconds to complete the query with the following execution stats: "execution_stats": {
"total_keys_examined": 0,
"total_docs_examined": 100005,
"total_quorum_docs_examined": 0,
"results_returned": 88880,
"execution_time_ms": 14044.458
},
"warning": "No matching index found, create an index to optimize query time." So I follow the warning and create index of the device.data field: {
"type": "json",
"partitioned": false,
"def": {
"fields": [
{
"data.device": "asc"
}
]
}
} Now when I run the query, it takes about 1 minute to complete with the following execution stats: "execution_stats": {
"total_keys_examined": 0,
"total_docs_examined": 88880,
"total_quorum_docs_examined": 0,
"results_returned": 88880,
"execution_time_ms": 58923.922
} At least but not last I add _explain results: {
"dbname": "xxx",
"index": {
"ddoc": null,
"name": "_all_docs",
"type": "special",
"def": {
"fields": [
{
"_id": "asc"
}
]
}
},
"partitioned": "undefined",
"selector": {
"data.device": {
"$eq": "ARCH"
}
},
"opts": {
"use_index": [],
"bookmark": "nil",
"limit": 100000,
"skip": 0,
"sort": {},
"fields": [
"_id"
],
"partition": "",
"r": [
49
],
"conflicts": false,
"stale": false,
"update": true,
"stable": false,
"execution_stats": true
},
"limit": 100000,
"skip": 0,
"fields": [
"_id"
],
"mrargs": {
"include_docs": true,
"view_type": "map",
"reduce": false,
"partition": null,
"start_key": null,
"end_key": "<MAX>",
"direction": "fwd",
"stable": false,
"update": true,
"conflicts": "undefined"
}
} {
"dbname": "asset",
"index": {
"ddoc": "_design/f65b3aa85a587fa4d9c3dacd4bb89208de348776",
"name": "data-device",
"type": "json",
"partitioned": false,
"def": {
"fields": [
{
"data.device": "asc"
}
]
}
},
"partitioned": false,
"selector": {
"data.device": {
"$eq": "ARCH"
}
},
"opts": {
"use_index": [],
"bookmark": "nil",
"limit": 100000,
"skip": 0,
"sort": {},
"fields": [
"_id"
],
"partition": "",
"r": [
49
],
"conflicts": false,
"stale": false,
"update": true,
"stable": false,
"execution_stats": true
},
"limit": 100000,
"skip": 0,
"fields": [
"_id"
],
"mrargs": {
"include_docs": true,
"view_type": "map",
"reduce": false,
"partition": null,
"start_key": [
"ARCH"
],
"end_key": [
"ARCH",
"<MAX>"
],
"direction": "fwd",
"stable": false,
"update": true,
"conflicts": "undefined"
}
} |
Beta Was this translation helpful? Give feedback.
Replies: 1 comment 2 replies
-
Hi, this is a case where the warning is not worth heeding. An important detail: the index you create in Mango is not a covering index. It will only return the _ids of the documents matching your selector; the server still needs to go and retrieve those document bodies for you. In contrast, when Mango operates without a secondary index it does have direct access to the bodies of the matching documents. In this case, your selector is matching 88% of the documents that it examined in the database without any index. That's a very good hit rate. You should consider if this was just due to a particular choice of document ID that might change in the future, or if this query is actually never going to be very selective. If it's not selective, don't add an index, as the extra level of indirection to retrieve all those documents overwhelms the modest advantage of an index matching 100% of the docs of interest. CouchDB could improve the user experience by sending a warning only when the ratio of documents selected to documents scanned falls below a certain threshold. We already do this when a query uses a secondary index with a large number of rows that don't match the selector. |
Beta Was this translation helpful? Give feedback.
Hi, this is a case where the warning is not worth heeding.
An important detail: the index you create in Mango is not a covering index. It will only return the _ids of the documents matching your selector; the server still needs to go and retrieve those document bodies for you. In contrast, when Mango operates without a secondary index it does have direct access to the bodies of the matching documents.
In this case, your selector is matching 88% of the documents that it examined in the database without any index. That's a very good hit rate. You should consider if this was just due to a particular choice of document ID that might change in the future, or if this query is actually never goi…