Add single writer principle note to deployment documentation (#5166)
Co-authored-by: Juraci Paixão Kröhling <juraci.github@kroehling.de>
Co-authored-by: Tiffany Hrabusa <30397949+tiffany76@users.noreply.github.com>
Co-authored-by: opentelemetrybot <107717825+opentelemetrybot@users.noreply.github.com>
Co-authored-by: Phillip Carter <pcarter@fastmail.com>
5 people authored Jan 7, 2025
1 parent 333d350 commit 790dd33
Showing 1 changed file with 42 additions and 0 deletions.
42 changes: 42 additions & 0 deletions content/en/docs/collector/deployment/gateway/index.md
@@ -251,3 +251,45 @@ Cons:
https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/tailsamplingprocessor
[spanmetrics-connector]:
https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/connector/spanmetricsconnector

## Multiple collectors and the single-writer principle

All metric data streams within OTLP must have a
[single writer](/docs/specs/otel/metrics/data-model/#single-writer). When
deploying multiple collectors in a gateway configuration, it's important to
ensure that all metric data streams have a single writer and a globally unique
identity.
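
For orientation, a metric stream's identity is, roughly, the combination of its
resource attributes, instrumentation scope, metric name, and point attributes.
The sketch below is illustrative only and is not a Collector configuration; all
names and values are hypothetical. Two collectors emitting points that share
this identity would be two writers for the same stream.

```yaml
# Hypothetical identity of a single metric data stream (illustrative values):
resource:
  service.name: checkout
  service.namespace: shop
  service.instance.id: gateway-replica-1 # must differ between writers
scope:
  name: my.instrumentation.scope
metric:
  name: http.server.request.duration
  attributes:
    http.request.method: GET
```

If `service.instance.id`, or another distinguishing attribute, is missing, two
gateway replicas can emit points that collapse into the same stream.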

### Potential problems

Concurrent access by multiple applications that modify or report on the same
data can lead to data loss or degraded data quality. For example, you might see
inconsistent data from multiple sources for the same resource, where the
sources can overwrite each other's data because the resource is not uniquely
identified.

There are patterns in the data that can indicate whether this is happening. For
example, upon visual inspection, unexplained gaps or jumps in a series may be a
clue that multiple collectors are sending the same samples. You might also see
errors in your backend. For example, with a Prometheus backend:

`Error on ingesting out-of-order samples`

This error could indicate that identical targets exist in two jobs, and the
order of the timestamps is incorrect. For example:

- Metric `M1` received at `T1` with timestamp 13:56:04 and value `100`
- Metric `M1` received at `T2` with timestamp 13:56:24 and value `120`
- Metric `M1` received at `T3` with timestamp 13:56:04 and value `110`

Here the sample received at `T3` carries an earlier timestamp than the sample
already ingested at `T2`, so the backend rejects it as out of order. A
configuration sketch that can produce this situation follows.
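
As an illustration, here is a minimal, hypothetical configuration sketch (not
part of this change): if two gateway Collector replicas both run this
configuration and write to the same backend, they scrape the same target under
the same job and become two writers for the same series, which can trigger the
out-of-order error above. The job name, target, and endpoint are placeholders.

```yaml
receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: my-service # identical job on both replicas
          static_configs:
            - targets: ['my-service:8080'] # identical target on both replicas

exporters:
  prometheusremotewrite:
    endpoint: https://backend.example.com/api/v1/write

service:
  pipelines:
    metrics:
      receivers: [prometheus]
      exporters: [prometheusremotewrite]
```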

### Best practices

- Use the
[Kubernetes attributes processor](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/k8sattributesprocessor)
to add labels to different Kubernetes resources.
- Use the
[resource detector processor](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/processor/resourcedetectionprocessor/README.md)
  to detect resource information from the host and collect resource metadata,
  as shown in the configuration sketch below.
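
The following is a minimal configuration sketch, not part of the original
change, showing both processors wired into a metrics pipeline; the OTLP
receiver, exporter endpoint, and detector list are assumptions to adapt to your
own deployment:

```yaml
receivers:
  otlp:
    protocols:
      grpc:

processors:
  # Adds Kubernetes metadata such as k8s.pod.name and k8s.namespace.name so
  # that streams from different pods keep distinct identities.
  k8sattributes:
  # Detects host and environment resource attributes.
  resourcedetection:
    detectors: [env, system]

exporters:
  otlp:
    endpoint: backend.example.com:4317

service:
  pipelines:
    metrics:
      receivers: [otlp]
      processors: [k8sattributes, resourcedetection]
      exporters: [otlp]
```

With these processors in place, each replica's output carries resource
attributes that identify its source, helping preserve a single writer per
stream.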
