-
Notifications
You must be signed in to change notification settings - Fork 524
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
LSM-based aggregation #11117
LSM-based aggregation #11117
Conversation
This pull request does not have a backport label. Could you fix it @carsonip? 🙏
NOTE: |
This pull request is now in conflicts. Could you fix it @carsonip? 🙏
|
Progress updateI've been working on understanding and optimizing the performance of LSM PoC. Performance improvements are WIP in apm-aggregation repo. But here's a glimpse of how LSM PoC fares against main:
The numbers are quite unstable since I ran the benchmarks on my laptop. But other than 1000TransactionsSerial (similar to 1000Transactions but runs in serial instead of in parallel), all the agent benchmarks seem to be fairly similar. |
This pull request is now in conflicts. Could you fix it @carsonip? 🙏
|
📚 Go benchmark reportDiff with the
report generated with https://pkg.go.dev/golang.org/x/perf/cmd/benchstat |
This pull request is now in conflicts. Could you fix it @carsonip? 🙏
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Just a few minor things.
changelogs/head.asciidoc
Outdated
==== Aggregation improvements | ||
- Replace aggregation implementation {pull}11117[11117] | ||
- Add `service.language.name` to service destination metrics {pull}11117[11117] | ||
- Modify per-service transaction groups limit to consider more than service.name; Add per-service service destination groups limit and per-service service transaction groups limit {pull}11117[11117] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@simitt WDYT about the changelog?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
my 2c: I think we should improve the first message to briefly mention what it means for the user. At the very least we should mention the significantly reduced memory cost for aggregation by moving from in-memory to the on-disk LSM based approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted to do so, but struggled to word it properly 🤦
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated
Tested and everything seems to be working. I spotted some issue with the way we are using the vt pool by not returning objects to the pool but it shouldn't be a regression since we were not even using the pool in older versions. It is something that we can iterate on. |
@kruskall do you mind leaving a note about where we did not return vt objects, or alternatively create an issue to do so? I'm very interested to know where we missed it. |
Yes, I plan to create followup issues an open PRs after making sure that they are correct |
Replace existing aggregation implementation with apm-aggregation, the LSM-based aggregator.
Motivation/summary
Replace existing aggregation with apm-aggregation, the LSM-based aggregator.
Tasks:
Storage configurationIn-memory FS used by pebble- [ ] aggregator Close should log if there's an error (x-pack processor stop error is not logged #11355)Checklist
(no user facing changes)- [ ] Update CHANGELOG.asciidoc- [ ] Update package changelog.yml (only if changes toapmpackage
have been made)- [ ] Documentation has been updatedFor functional changes, consider:
How to test these changes
Related issues