A comparison of ranking performance using different index configurations and sorting approaches.
Even with caching turned off (both the request cache and the query cache), much of the performance comes from the filesystem cache that Elasticsearch relies on. This makes it difficult to reliably test absolute performance differences offline. These benchmarks are indicative of relative performance between configurations and suggest directions to explore in a production setting.
These benchmarks run on a single-node cluster on a laptop with no other traffic. The shape of the curves and the relative differences between configurations are more important than the absolute timing values. The values on the y-axis, query + fetch time in milliseconds, give some idea of the scale.
See configuration.py for the benchmark configurations. Each benchmark needs to be run at least twice so that the first run warms up the filesystem cache.
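As a rough illustration of what "caching off" can mean, the sketch below builds the index settings and search parameters that disable the query cache and the request cache. The setting keys are standard Elasticsearch settings, but the helper names are hypothetical and the benchmarks in this repository may configure caching differently.

```python
# Hypothetical helpers: only the setting keys come from Elasticsearch itself.

def uncached_index_settings() -> dict:
    """Index-creation body that switches off the query cache and request cache."""
    return {
        "settings": {
            # Static setting: must be supplied when the index is created.
            "index.queries.cache.enabled": False,
            # Default for the shard request cache on this index.
            "index.requests.cache.enable": False,
        }
    }


def uncached_search_params() -> dict:
    """URL parameters that disable the request cache for a single search."""
    return {"request_cache": "false"}
```

Neither setting affects the operating system's filesystem cache, which is why index files read during one run can still be served from memory in the next.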
docker-compose run --rm benchmarks bash
> python run.py --benchmark-name opendistro-versions
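The warm-up requirement can also be handled inside a timing loop. The following is a minimal sketch, not the code in run.py: it assumes a caller-supplied `run_query` function (hypothetical) and measures wall-clock time on the client, which is only an approximation of the query + fetch time reported by the benchmarks.

```python
import statistics
import time
from typing import Callable, List


def timed_runs(run_query: Callable[[], object], warmup: int = 1, repeats: int = 5) -> List[float]:
    """Return wall-clock times in milliseconds, discarding the warm-up runs."""
    for _ in range(warmup):
        run_query()  # warms the filesystem cache; timing is ignored
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        timings.append((time.perf_counter() - start) * 1000.0)
    return timings


def summarize(timings: List[float]) -> float:
    """Use the median, which is less sensitive to an occasional slow run."""
    return statistics.median(timings)
```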
Comparing ranking performance across different versions of Open Distro.
Sorting by a float field.
Changing the fetch_size.
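For reference, a search that sorts by a float field might be built as in the sketch below. The field name `popularity` is a placeholder, and mapping fetch_size onto the search `size` parameter is an assumption about how the benchmarks use it.

```python
def float_sort_body(field: str = "popularity", fetch_size: int = 10) -> dict:
    """Search body that matches every document and sorts by one float field.

    `field` is a placeholder name; `fetch_size` is assumed to correspond to
    the number of hits returned (the `size` parameter).
    """
    return {
        "query": {"match_all": {}},
        "sort": [{field: {"order": "desc"}}],
        "size": fetch_size,
    }
```

Varying `fetch_size` while holding the sort constant changes how many hits are fetched in each request.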
Comparing rank feature queries.
Rank feature fields populated with random data from a lognormal distribution.
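A minimal sketch of how such test data could be generated, assuming hypothetical field names, a typeless (Elasticsearch 7.x style) mapping, and arbitrary distribution parameters; the repository's actual generator may differ. Lognormal samples are strictly positive, which suits rank_feature fields because they only accept positive values.

```python
import random
from typing import Dict, List


def rank_feature_mapping(fields: List[str]) -> dict:
    """Index mapping that declares each named field as a rank_feature."""
    return {
        "mappings": {
            "properties": {name: {"type": "rank_feature"} for name in fields}
        }
    }


def random_document(
    fields: List[str], rng: random.Random, mu: float = 0.0, sigma: float = 1.0
) -> Dict[str, float]:
    """One document whose rank feature values are drawn from a lognormal."""
    return {name: rng.lognormvariate(mu, sigma) for name in fields}
```

Passing a seeded `random.Random` instance keeps the generated corpus reproducible between benchmark runs.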
Changing the fetch_size.
_doc sort is the fastest (baseline).
Sorting by one float field is approximately the same as using a single rank feature.
Each additional rank_feature query added to the should clause adds further query time.
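The queries being compared can be sketched as follows. The field names and the use of `size` for fetch_size are assumptions; the query shapes themselves follow the standard Elasticsearch query DSL.

```python
from typing import List


def doc_order_body(fetch_size: int = 10) -> dict:
    """Baseline: return hits in index order, with no scoring or field sort."""
    return {"query": {"match_all": {}}, "sort": ["_doc"], "size": fetch_size}


def rank_feature_body(fields: List[str], fetch_size: int = 10) -> dict:
    """Score documents with one rank_feature query per field in a should clause."""
    return {
        "query": {
            "bool": {
                "should": [{"rank_feature": {"field": name}} for name in fields]
            }
        },
        "size": fetch_size,
    }
```

Calling `rank_feature_body` with one, two, or three field names produces the progressively larger should clauses whose timings are compared here.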