Update docs on limits
carsonip committed Aug 8, 2023
1 parent 6f740c6 commit e454ed5
Showing 2 changed files with 19 additions and 18 deletions.
20 changes: 7 additions & 13 deletions dev_docs/trace_metrics.md
@@ -15,26 +15,19 @@
As transactions are observed by APM Server, it groups them according to various
attributes such as `service.name`, `transaction.name`, and `kubernetes.pod.name`.
The latency is then recorded in an [HDRHistogram](http://hdrhistogram.org/) for
that group. Transaction group latency histograms are periodically indexed (every
-minute by default), with configurable precision (defaults to 2 significant figures).
+minute by default), with a fixed precision of 2 significant figures.
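
As an illustration of the scheme described above, here is a minimal Go sketch using the open-source `github.com/HdrHistogram/hdrhistogram-go` package. The key fields, value range, and function names are assumptions for the example, not APM Server's actual code.

```go
package main

import (
	"fmt"
	"time"

	hdrhistogram "github.com/HdrHistogram/hdrhistogram-go"
)

// groupKey is a hypothetical aggregation key; the real key includes
// further attributes.
type groupKey struct {
	serviceName     string
	transactionName string
	podName         string
}

// groups holds one latency histogram per transaction group.
var groups = map[groupKey]*hdrhistogram.Histogram{}

func recordTransaction(key groupKey, latency time.Duration) {
	h, ok := groups[key]
	if !ok {
		// Track latencies from 1us to 1h at a fixed 2 significant figures.
		h = hdrhistogram.New(1, time.Hour.Microseconds(), 2)
		groups[key] = h
	}
	_ = h.RecordValue(latency.Microseconds())
}

func main() {
	key := groupKey{"opbeans-go", "GET /orders", "pod-1"}
	recordTransaction(key, 42*time.Millisecond)
	recordTransaction(key, 58*time.Millisecond)
	fmt.Println("p95 latency (us):", groups[key].ValueAtQuantile(95))
}
```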

To protect against memory exhaustion due to high-cardinality transaction names
(or other attributes), at any given time, APM Server places a limit on the number
of services tracked, the number of transaction groups tracked, as well as the number
-of groups tracked per service.
-
-By default, the limits are 1,000 services per GB of memory, 5,000 transaction groups
-per GB of memory. When transaction group latency histograms are indexed, the groups
-are reset, enabling a different set of groups to be recorded.
-The per-service limit is 10% of the global limit. For example, for a 2GB APM Server,
-the limits are 2,000 services, 10,000 transaction groups, and for each service,
-there can be a maximum of 1,000 unique transaction groups.
+of groups tracked per service. See the docs for limits.
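
Continuing the sketch above, here is a minimal illustration of one way such a cardinality cap could work, assuming (as the Overflows section of the data-model docs suggests) that events beyond the limit are funneled into a shared catch-all group. The `_other` key and the cap-checking logic are illustrative only, not APM Server's code.

```go
// trackGroup returns the histogram for key, enforcing a cap on the
// number of distinct groups tracked at any given time.
func trackGroup(key groupKey, maxGroups int) *hdrhistogram.Histogram {
	if h, ok := groups[key]; ok {
		return h
	}
	if len(groups) >= maxGroups {
		// Cardinality limit reached: attribute the event to a shared
		// overflow group instead of creating a new one.
		key = groupKey{serviceName: "_other"}
	}
	h, ok := groups[key]
	if !ok {
		h = hdrhistogram.New(1, time.Hour.Microseconds(), 2)
		groups[key] = h
	}
	return h
}
```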

## Service transaction metrics

Service transaction metrics are similar to transaction metrics, but with fewer
dimensions. For example, `transaction.name` is not considered during aggregation.

-A limit of 1,000 unique service transaction groups per GB of memory is enforced.
+See the docs for limits.
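
Continuing the sketch above, the reduced aggregation key for service transaction metrics might look like the following; the field names are illustrative assumptions, not APM Server's actual struct.

```go
// serviceTxKey groups service transaction metrics: unlike groupKey,
// it carries no transaction name, so all of a service's transactions
// of the same type fall into a single group.
type serviceTxKey struct {
	serviceName     string
	transactionType string
}
```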

## Service destination metrics

@@ -43,15 +36,16 @@
from one service to another. This works much the same as transaction metrics
aggregation: span events describing an operation that involves another service
are grouped by the originating and target services, and the span latency is
accumulated. For these metrics we record only a count and sum, enabling calculation
-of throughput and average latency. A default limit of 10,000 groups is
-imposed.
+of throughput and average latency.
+
+See the docs for limits.
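
As a sketch of how a stored count and sum yield those derived values, continuing the Go examples above; the struct and the choice of interval are assumptions for illustration, not APM Server's actual types.

```go
// destGroup accumulates span events for one (origin, target) service pair.
type destGroup struct {
	count      int64         // number of span events observed
	sumLatency time.Duration // total latency across those events
}

// derive computes throughput and average latency over one aggregation
// interval, e.g. the default one-minute indexing period.
func derive(g destGroup, interval time.Duration) (throughputPerSec, avgLatencyMs float64) {
	if g.count == 0 {
		return 0, 0
	}
	throughputPerSec = float64(g.count) / interval.Seconds()
	avgLatencyMs = g.sumLatency.Seconds() * 1000 / float64(g.count)
	return throughputPerSec, avgLatencyMs
}
```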

## Service summary metrics

Service summary metrics consider transaction, error, log, and metric events, and
produce a summary of all services sending events.

-A limit of 1,000 unique service summary groups per GB of memory is enforced.
+See the docs for limits.

## Dealing with sampling

17 changes: 12 additions & 5 deletions docs/data-model.asciidoc
@@ -541,12 +541,19 @@
there are limits on the number of unique groups tracked at any given time.

Note that all of the limits below may change in the future with further improvements.

-* For transaction metrics, the limits are 1000 services per GB of APM Server, and 5000 transaction
-groups per GB of APM Server. Additionally, each service may only consume up to 10% of the transaction groups,
+* All of the following metrics share a limit of 1000 services per GB of APM Server.
+** For transaction metrics, there is an additional limit of 5000 total transaction groups per GB of APM Server,
+and each service may only consume up to 10% of the transaction groups,
which is 500 transaction groups per service per GB of APM Server.
-* For service-transaction metrics, the limit is 1000 service transaction groups per GB of APM Server.
-* For service-destination metrics, the limit is a constant of 10000 service destination groups.
-* For service-summary metrics, the limit is 1000 service summary groups per GB of APM Server.
+** For service-transaction metrics, there is an additional limit of 1000 total service transaction groups per GB of APM Server,
+and each service may only consume up to 10% of the service transaction groups,
+which is 100 service transaction groups per service per GB of APM Server.
+** For service-destination metrics, there is an additional constant limit of 10000 total service destination groups,
+and each service may only consume up to 10% of the service destination groups,
+which is 1000 service destination groups per service.
+** For service-summary metrics, there is no additional limit.

In the above, a service is defined as a combination of `service.name`, `service.environment`, `service.language.name` and `agent.name`.
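
As a worked example of the arithmetic in the list above, the following sketch computes the effective limits for a given memory size. The numbers come from this list; the function itself is illustrative, not an APM Server API.

[source,go]
----
package main

import "fmt"

// limitsFor prints the effective aggregation limits for an APM Server
// with the given memory, following the rules in the list above.
func limitsFor(memGB int) {
	fmt.Println("services:", 1000*memGB)

	txGroups := 5000 * memGB
	fmt.Println("transaction groups:", txGroups, "- per service:", txGroups/10)

	svcTxGroups := 1000 * memGB
	fmt.Println("service transaction groups:", svcTxGroups, "- per service:", svcTxGroups/10)

	destGroups := 10000 // constant, regardless of memory
	fmt.Println("service destination groups:", destGroups, "- per service:", destGroups/10)
}

func main() {
	// For a 2GB APM Server: 2000 services, 10000 transaction groups
	// (1000 per service), 2000 service transaction groups (200 per
	// service), and 10000 service destination groups (1000 per service).
	limitsFor(2)
}
----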

[float]
===== Overflows
