Overflows should emit logs and metrics #75

Closed
carsonip opened this issue Aug 8, 2023 · 3 comments

@carsonip
Member

carsonip commented Aug 8, 2023

Related to elastic/apm-server#11362 and elastic/apm-server#11117

  • Overflows should emit logs and metrics such that overflows are observable.
  • Create a dashboard to observe overflows.
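
A minimal sketch of what "observable" could mean here, assuming a simple limit on the number of tracked groups. The `groupLimiter` type, its fields, and the log message are hypothetical and are not taken from apm-server or from #86; the idea is only that each overflow bumps a counter (the metric) and that the first overflow is logged once.

```go
package main

import "log"

// groupLimiter is a hypothetical type used only for illustration.
// It tracks up to max distinct groups and counts every observation
// of a new group beyond that limit as an overflow.
// It is not safe for concurrent use.
type groupLimiter struct {
	max       int
	groups    map[string]struct{}
	overflows int64 // would be exported as a metric
	logged    bool  // ensures the overflow is logged only once
}

func newGroupLimiter(max int) *groupLimiter {
	return &groupLimiter{max: max, groups: make(map[string]struct{})}
}

// observe reports whether key is tracked as its own group.
// It returns false when the limit has been reached and the
// observation is counted as an overflow instead.
func (l *groupLimiter) observe(key string) bool {
	if _, ok := l.groups[key]; ok {
		return true
	}
	if len(l.groups) < l.max {
		l.groups[key] = struct{}{}
		return true
	}
	l.overflows++
	if !l.logged {
		l.logged = true
		log.Printf("group limit %d reached; new groups now count as overflow", l.max)
	}
	return false
}

func main() {
	l := newGroupLimiter(2)
	for _, k := range []string{"a", "b", "c", "a", "d"} {
		l.observe(k)
	}
	log.Printf("overflowed=%d", l.overflows)
}
```

Logging only once keeps a sustained overflow from flooding the logs, while the counter still records how often it happens.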

@carsonip
Member Author

In apm-server for txmetrics we had:

	totalOverflow := metrics.servicesOverflow + metrics.perSvcTxnGroupsOverflow + metrics.txnGroupsOverflow
	monitoring.ReportInt(V, "active_groups", metrics.activeGroups)
	monitoring.ReportNamespace(V, "overflowed", func() {
		monitoring.ReportInt(V, "services", metrics.servicesOverflow)
		monitoring.ReportInt(V, "per_service_txn_groups", metrics.perSvcTxnGroupsOverflow)
		monitoring.ReportInt(V, "txn_groups", metrics.txnGroupsOverflow)
		monitoring.ReportInt(V, "total", totalOverflow)
	})

@axw do you think #86 provides sufficient insight into overflows? It seems to lack the granularity we had in the past, but I'm not sure that granularity is useful. Since the next step is to build dashboards on top of the new metrics in #86, I just want to confirm that we are happy with what we have after #86.

@axw
Member

axw commented Aug 15, 2023

I think we should go with what we have, and add to it as needed. It may not be complete, but I think what's there is correct.

Regarding what we used to have:

  • I think it should be possible to measure active_groups from the indexed metric documents
  • overflowed.total doesn't seem useful if we know the number of metric-type overflows
  • overflowed.per_service_txn_groups doesn't seem useful either, since it isn't per-service (and I think we probably shouldn't make it per-service, as it could lead to way too many metrics)
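
To illustrate the point about overflowed.total: a rough sketch, assuming "metric-type overflows" means one overflow counter per metric type. The `overflowCounts` type and the metric type names below are made up for the example and are not the names used in #86. With per-type counters available, the total is just a sum that a dashboard can compute, so it does not need to be reported separately.

```go
package main

import "fmt"

// overflowCounts is a hypothetical type: one overflow counter per
// metric type. The keys used in main are illustrative only.
type overflowCounts map[string]int64

// total derives the overall overflow count from the per-type counters.
func (c overflowCounts) total() int64 {
	var sum int64
	for _, n := range c {
		sum += n
	}
	return sum
}

func main() {
	c := overflowCounts{
		"service":             1,
		"transaction":         4,
		"service_transaction": 2,
	}
	fmt.Println("total overflows:", c.total())
}
```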

@carsonip
Member Author

Closing as all work on the topic has been completed.
