-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Prevent unnecessary heap sort when buckets needs to be ordered by key… #17252
Conversation
@msfroh Please let me know if this change looks legit to you, thank you |
❌ Gradle check result for b007591: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
… in NumericTermsAggregation Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com>
b007591
to
6141759
Compare
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #17252 +/- ##
============================================
- Coverage 72.43% 72.29% -0.14%
+ Complexity 65725 65710 -15
============================================
Files 5318 5318
Lines 305675 305680 +5
Branches 44350 44352 +2
============================================
- Hits 221408 220999 -409
- Misses 66055 66597 +542
+ Partials 18212 18084 -128 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow... That's subtle. Replacing the O(log n) pop
with the O(1) next
is a good find (for sufficiently large n)!
@msfroh looks like you already approved, if you do not have any concerns, we can merge it. |
… in NumericTermsAggregation (opensearch-project#17252) Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com> (cherry picked from commit 3f793b6)
… in NumericTermsAggregation (#17252) (#17290) (cherry picked from commit 3f793b6) Signed-off-by: Rishabh Maurya <rishabhmaurya05@gmail.com> Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Signed-off-by: Ankit Jain <akjain@amazon.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Ankit Jain <akjain@amazon.com>
We can slightly improve performance of Numeric term Aggs by avoiding double sorting - first by count and then by key before final reduce
Description
We don't need to sort the buckets by
_count
. PriorityQueue will ensure topN buckets by_count
, as final reduce order is always by key in NumericTermsAggregation andResultStrategy#buildResult()
is anyways sorting the buckets by key before final reduce, thus sorting the buckets by count or whatever the order clause was, is redundant.Related Issues
Resolves #[Issue number to be closed when this PR is merged]
Check List
- [ ] API changes companion pull request created, if applicable.- [ ] Public documentation issue/PR created, if applicable.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.