Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for tagging profiles with opentelemetry trace identifiers #1568

Closed
wants to merge 10 commits into from

Conversation

ivoanjo
Copy link
Member

@ivoanjo ivoanjo commented Jun 25, 2021

This PR adds support for gathering the trace identifiers from ongoing traces being done with the opentelemetry gems.

Priority is still given to ddtrace traces, if available. I've refactored this code out of the stack collector, so now it's easier to control the dependency between profiler and tracer, as well as to have the additional opentelemetry support.

As I was about to open this PR, the interface used to access the current trace was refactored and part of it was marked private in the opentelemetry-api 1.0.0.rc2 release. I plan to follow-up with them and ask if it's possible to make this public again (open-telemetry/opentelemetry-ruby#842). For now, we still work on that version, but use the private interface.

@ivoanjo ivoanjo requested a review from a team June 25, 2021 10:22
ivoanjo added a commit to DataDog/opentelemetry-ruby that referenced this pull request Jun 25, 2021
I've been working on integrating Datadog's continuous profiler with
opentelemetry traces (see DataDog/dd-trace-rb#1568).

The profiler runs on a background thread, and needs to be able to
access the current context for a thread, to be able to get the
current span's trace id and span ids (if any active).

To do so, I was originally using

```ruby
thread = ...
::OpenTelemetry::Trace.current_span(thread[::OpenTelemetry::Context::KEY])
```

Unfortunately, after open-telemetry#807, this interface was changed, and more
importantly, `Context::KEY` was removed and replaced with
`Context::STACK_KEY`. `STACK_KEY` was marked as `private_constant`.

With 1.0.0.rc2, the only way of getting this information is by
relying on private implementation details, which isn't great.

Thus, I would like to ask if it'd be reasonable to add an optional
`thread` parameter to `Context.current`. This would make it easy to
access the needed information, and it would even be more future-proof
as the outside code doesn't need to care anymore where and how
the context is stored.
ivoanjo added a commit to DataDog/opentelemetry-ruby that referenced this pull request Jun 25, 2021
I've been working on integrating Datadog's continuous profiler with
opentelemetry traces (see DataDog/dd-trace-rb#1568).

The profiler runs on a background thread, and needs to be able to
access the current context for a thread, to be able to get the
current span's trace id and span ids (if any active).

To do so, I was originally using

```ruby
thread = ...
::OpenTelemetry::Trace.current_span(thread[::OpenTelemetry::Context::KEY])
```

Unfortunately, after open-telemetry#807, this interface was changed, and more
importantly, `Context::KEY` was removed and replaced with
`Context::STACK_KEY`. `STACK_KEY` was marked as `private_constant`.

With 1.0.0.rc2, the only way of getting this information is by
relying on private implementation details, which isn't great.

Thus, I would like to ask if it'd be reasonable to add an optional
`thread` parameter to `Context.current`. This would make it easy to
access the needed information, and it would even be more future-proof
as the outside code doesn't need to care anymore where and how
the context is stored.
tracer = Datadog.tracer
return unless tracer.respond_to?(:active_correlation)

@tracer = tracer
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Storing the tracer can possibly hold the wrong instance of Datadog.tracer when Datadog.configure is called, as we rebuild components at that time, creating a brand new tracer instance at Datadog.tracer.

If this class is recreated alongside the profiler on Datadog.configure calls, then all is likely well if #trace_identifiers_for cannot be invoked before the new tracer is created during Datadog.configure.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call. This class is indeed re-created with the profiler (in helper.rb), but with the whole thing about the components lock doesn't guarantee that the new tracer will be available in Datadog.tracer before the profiler starts sampling. I'll tweak this a bit to avoid this issue.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in f6711e2 . Let me know your thoughts, as I know this is usually not the way things are done in ddtrace ;)

This refactor will enable us to support getting the trace identifiers
from more than just datadog traces without adding yet more logic to the
stack collector.
We are now able to gather the trace identifiers from ongoing traces
being done with the opentelemetry gems.

Priority is still given to ddtrace traces, and we also take care to
impact users that do not use opentelemetry.

NOTE: I'm somewhat unsure about adding `opentelemetry-api` to the
      `Gemfile` vs adding it as an appraisal.
      All our current appraisals seem to be for tracing integrations,
      which is not the case, and we do already have an "opentelemetry"
      rake task that existed so it may be confusing to have it changed
      into an appraisal (or to have two) so I took the simplest route.
      Suggestions welcome!
This avoids surprises if customers try to use an older version.
In particular open-telemetry/opentelemetry-ruby#807
changed some of the APIs we were using to get the current span for a
thread.

To test both 0.17.0 and this new version I moved
opentelemetry to appraisals.
As discussed during PR review of #1568, it's problematic to just cache
whatever's in `Datadog.tracer` because we can observe the old value
during component initialization.

To break the loop, let's instead directly initialize the
`TraceIdentifiers::Ddtrace` class with the correct tracer during
component initialization.

IMHO this has a further advantage: it makes it really explict where
there is a tracer-to-profiler dependency whereas previously there was
just a call to `Datadog.tracer` deep in the bowels of the profiler that
could be called at any point.
@ivoanjo ivoanjo force-pushed the ivoanjo/profiler-opentelemetry-support branch from d185617 to f6711e2 Compare June 28, 2021 11:11
@ivoanjo
Copy link
Member Author

ivoanjo commented Jun 28, 2021

P.s.: I did a rebase/force push to get the effects of #1569 in this PR (hiding generated gemfiles).

Copy link
Member

@marcotc marcotc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's great to have a concrete tracer instance passes around, instead of referencing Datadog.tracer when possible, and given the profiler+tracer are both initialized by us, it makes a lot of sense to have this direct reference.

@ivoanjo
Copy link
Member Author

ivoanjo commented Jun 30, 2021

I've marked this as draft, as there's some ongoing discussion on open-telemetry/opentelemetry-ruby#842 that may end up factoring in how we support opentelemetry, so I'd like to hold on merging this until there's a clearer path forward.

@ivoanjo ivoanjo marked this pull request as draft June 30, 2021 07:44
ivoanjo added a commit that referenced this pull request Jul 13, 2021
As discussed during PR review of #1568, it's problematic to just cache
whatever's in `Datadog.tracer` because we can observe the old value
during component initialization.

To break the loop, let's instead directly initialize the
`TraceIdentifiers::Ddtrace` class with the correct tracer during
component initialization.

IMHO this has a further advantage: it makes it really explict where
there is a tracer-to-profiler dependency whereas previously there was
just a call to `Datadog.tracer` deep in the bowels of the profiler that
could be called at any point.
@ivoanjo ivoanjo self-assigned this Jul 13, 2021
@ivoanjo
Copy link
Member Author

ivoanjo commented Jul 14, 2021

TODO: This needs to be rebased on top of #1591 (which is now on master).

@ivoanjo ivoanjo marked this pull request as ready for review November 1, 2021 08:18
@ivoanjo
Copy link
Member Author

ivoanjo commented Nov 1, 2021

Unfortunately there hasn't been a lot of movement by upstream opentelemetry, so the team decided to close this. Since we merged #1591 anyway (which enables pluggable trace identifiers) we should be in a good shape to revisit this at any point in the future.

@ivoanjo ivoanjo closed this Nov 1, 2021
@GustavoCaso GustavoCaso deleted the ivoanjo/profiler-opentelemetry-support branch October 11, 2023 11:39
@ivoanjo
Copy link
Member Author

ivoanjo commented Jul 9, 2024

See also #3510

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants