The wildcard
query is a low-level term-based query similar in nature to the
prefix
query, but it allows you to specify a pattern instead of just a prefix.
It uses the standard shell wildcards where ?
matches any character, and *
matches zero or more characters.
This query would match the documents containing W1F 7HW
and W2F 8HW
:
GET /my_index/address/_search
{
"query": {
"wildcard": {
"postcode": "W?F*HW" (1)
}
}
}
-
The
?
will match the1
and the2
, while the*
matches the space and the7
and8
.
Imagine now that you wanted to match all postcodes just in the W
area. A
prefix match would also include postcodes starting with WC
, and you would
have a similar problem with a wildcard match. We only want to match postcodes
that begin with a W
, followed by a number. The regexp
query allows you to
write these more complicated patterns:
GET /my_index/address/_search
{
"query": {
"regexp": {
"postcode": "W[0-9].+" (1)
}
}
}
-
The regular expression says that the term must begin with a
W
, followed by any number from 0 to 9, followed by 1 or more other characters.
The wildcard
and regexp
queries work in exactly the same way as the
prefix
query. They also have to scan the list of terms in the inverted
index to find all matching terms, and gather document IDs term-by-term. The
only difference between them and the prefix
query is that they support more
complex patterns.
This means that the same caveats apply. Running these queries on a field with
many unique terms can be very resource intensive indeed. Avoid using a
pattern that starts with a wildcard, eg "*foo"
or, as a regexp, ".*foo"
.
While prefix matching can be made more efficient by preparing your data at index time, wildcard and regular expression matching can only really be done at query time. These queries have their place, but should be used sparingly.
Caution
|
The For instance, let’s say that our This query would match: { "regexp": { "title": "br.*" }} But neither of these queries would match: { "regexp": { "title": "Qu.*" }} (1)
{ "regexp": { "title": "quick br*" }} (2)
|