Datadog spans explosion after v1.53.x #6355
Comments
We are experiencing this issue too. We have had to disable Datadog trace export entirely.
Hi. For background: Datadog has an unusual way of handling traces. It requires them to be sent to the agent even if they are not sampled. The PR should allow users to keep this behaviour while also passing the sampling priority downstream correctly.
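To make the distinction concrete, here is a minimal sketch of the idea, not the router's actual implementation. It separates "export the span to the agent" (always) from "mark the trace as kept" (the sampling priority). The helper names `resolve_priority` and `handle_span` are hypothetical, and the threshold check on the trace ID is an assumed stand-in for whatever sampling algorithm is really used. The priority values and the `x-datadog-sampling-priority` header are Datadog's standard ones.

```python
from typing import Optional

# Datadog sampling priority values.
USER_REJECT = -1
AUTO_REJECT = 0
AUTO_KEEP = 1
USER_KEEP = 2

MAX_TRACE_ID = 2**64


def resolve_priority(parent_priority: Optional[int], trace_id: int, sample_rate: float) -> int:
    """Hypothetical helper: respect the parent's decision, else decide locally."""
    if parent_priority is not None:
        # An upstream service already decided; do not override it.
        return parent_priority
    # Assumed deterministic sampler: keep the trace when its ID falls
    # inside the sample_rate fraction of the 64-bit ID space.
    threshold = int(sample_rate * MAX_TRACE_ID)
    return AUTO_KEEP if (trace_id % MAX_TRACE_ID) < threshold else AUTO_REJECT


def handle_span(parent_priority: Optional[int], trace_id: int, sample_rate: float) -> dict:
    """Hypothetical helper: always export, but carry the priority along."""
    priority = resolve_priority(parent_priority, trace_id, sample_rate)
    return {
        # The span goes to the agent even when the trace is not sampled.
        "export_to_agent": True,
        "sampling_priority": priority,
        # The same decision is passed to downstream services.
        "propagated_headers": {"x-datadog-sampling-priority": str(priority)},
    }
```

Under these assumptions, a request whose parent rejected the trace still reaches the agent, but it is tagged with priority `0` and subgraphs receive the same value, so the upstream sampling rate is preserved rather than overridden.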
@BrynCooke Thank you, that's great to know. We won't be able to test this until January due to holidays and PTO, but we can take a look then and report back. Would it be helpful to try to put you in contact with our Datadog rep?
@davidegreenwald Thank you for the offer, but I'm not sure it would help. Longer term, it would be great if Datadog promoted the OTel standard and made it their preferred ingestion mechanism, but I'm not sure that would be in their commercial interests. The alternative has been for us to reverse engineer the Datadog protocols and behaviour to try to make everything play nicely. We're going to try to upstream some of the work that was done in this PR, in particular the way that sampling is handled and PSR is propagated.
@BrynCooke Thank you!
Describe the bug
We use the Inigo build of the router and recently upgraded from v0.30.11 to v0.30.15 (bringing us from router 1.53.x to 1.57.1).
The Datadog changes across these versions appear to have exploded our span count in Datadog. The router appears to be ignoring the sampling rates of its upstream parent services, which it previously respected, and is now ingesting all spans. This has increased our ingest by 100x and will have a cost impact on our Datadog contract.
We're using the Datadog trace exporter: https://www.apollographql.com/docs/graphos/reference/router/telemetry/trace-exporters/datadog
Expected behavior
There should be no change in trace levels from these upgrades.
Additional context
Happy to give you any further information we can here.