Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DEBUG-2334 dynamic instrumentation probe notification builder #4011

Merged
merged 11 commits into from
Oct 29, 2024

Conversation

p-datadog
Copy link
Contributor

@p-datadog p-datadog commented Oct 17, 2024

Change log entry

None, DI is not customer-visible at this time.

What does this PR do?

This PR adds a component to create status and snapshot payloads for probes.

Motivation:

Initial DI implementation in Ruby.

Additional Notes:

Most of the test coverage is in integration tests for DI overall. This PR has minimal test coverage.

Correctness of generated payloads is meant to ultimately be verified by system tests.

How to test the change?

Unsure? Have a question? Request a review!

@p-datadog p-datadog requested a review from a team as a code owner October 17, 2024 16:31
@p-datadog p-datadog force-pushed the di-probe-notification-builder branch from 3d5055d to 6f847de Compare October 17, 2024 16:47
@codecov-commenter
Copy link

codecov-commenter commented Oct 17, 2024

Codecov Report

Attention: Patch coverage is 96.13260% with 7 lines in your changes missing coverage. Please review.

Project coverage is 97.84%. Comparing base (321f513) to head (0080237).
Report is 5 commits behind head on master.

Files with missing lines Patch % Lines
lib/datadog/di/probe_notification_builder.rb 92.10% 6 Missing ⚠️
.../di/integration/probe_notification_builder_spec.rb 98.07% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #4011      +/-   ##
==========================================
- Coverage   97.86%   97.84%   -0.02%     
==========================================
  Files        1321     1324       +3     
  Lines       79330    79511     +181     
  Branches     3934     3956      +22     
==========================================
+ Hits        77635    77797     +162     
- Misses       1695     1714      +19     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@p-datadog p-datadog force-pushed the di-probe-notification-builder branch from 6f847de to 1173c5f Compare October 17, 2024 17:13
@pr-commenter
Copy link

pr-commenter bot commented Oct 17, 2024

Benchmarks

Benchmark execution time: 2024-10-25 18:02:49

Comparing candidate commit 0080237 in PR branch di-probe-notification-builder with baseline commit 321f513 in branch master.

Found 1 performance improvements and 0 performance regressions! Performance is the same for 23 metrics, 2 unstable metrics.

scenario:profiler - sample timeline=false

  • 🟩 throughput [+0.525op/s; +0.537op/s] or [+8.413%; +8.599%]

@p-datadog p-datadog force-pushed the di-probe-notification-builder branch from 1173c5f to 659bec1 Compare October 17, 2024 17:44
@p-datadog p-datadog force-pushed the di-probe-notification-builder branch 2 times, most recently from 7e9ced8 to 382ad93 Compare October 17, 2024 18:06
This component creates status and snapshot payloads for probes.
@p-datadog p-datadog force-pushed the di-probe-notification-builder branch from 382ad93 to b33be73 Compare October 17, 2024 18:08
Comment on lines 205 to 209
def thread_id
thread = Thread.current
if thread.respond_to?(:native_thread_id)
# Ruby 3.1+
thread.native_thread_id
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Be careful with using native_thread_id here -- in Ruby 3.3+ ruby threads can move across OS threads (see M:N announcement). This behavior change is off by default in the main ractor but that may change...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't use it myself for anything and I made a note to inquire with my team what it is supposed to be.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed thread_id sending because currently DI does not need it.

Comment on lines 137 to 138
"dd.trace_id": 136035165280417366521542318182735500431,
"dd.span_id": 17576285113343575026,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These look like placeholders -- should there be a FIXME or something here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, I forgot about these completely - they probably need to be filled out somehow.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But, I can do that in a separate PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Repaired.

@p-datadog p-datadog requested a review from a team as a code owner October 18, 2024 16:14
@p-datadog p-datadog added the dev/internal Other internal work that does not need to be included in the changelog label Oct 18, 2024
Comment on lines 166 to 171
callers.map do |caller|
if caller =~ /\A([^:]+):(\d+):in `([^']+)'\z/
{
fileName: $1, function: $3, lineNumber: Integer($2),
}
else
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's possible to avoid having to parse stack trace strings -- Ruby has two sets of APIs for stacktraces (at the Ruby level...): You can use backtrace and backtrace_locations (or caller/caller_locations). The _locations version of it gives you an object that you can query for the individual parts, no need for parsing strings to split it back out ;)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, thank you for the pointer.

Copy link
Member

@ivoanjo ivoanjo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did a quick pass and left a few notes :)

Comment on lines +88 to +95
# Normally caller_locations should always be filled for a line probe
# but in the test suite we don't always provide all arguments.
actual_file_basename = File.basename(probe.file)
caller_locations&.detect do |loc|
# TODO record actual path that probe was installed into,
# perform exact match here against that path.
File.basename(loc.path) == actual_file_basename
end&.path || probe.file
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a slightly expensive thing to do for just the tests 🤔. Or can there be situations in user code where this gets triggered?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need the path to the file in the payload, but the path should be determined during instrumentation as per the comment just above. I've hacked path matching and processing by doing basename comparisons but this isn't sufficient, for v1 I will have the actual path stored in the probe after instrumentation and nothing should be calculated here.

Comment on lines 131 to 145
duration: duration ? (duration * 10**9).to_i : nil,
host: nil,
logger: {
name: probe.file,
method: probe.method_name || 'no_method',
thread_name: Thread.current.name,
thread_id: thread_id,
version: 2,
},
# TODO add tests that the trace/span id is correctly propagated
"dd.trace_id": Datadog::Tracing.active_trace&.id,
"dd.span_id": Datadog::Tracing.active_span&.id,
ddsource: 'dd_debugger',
message: probe.template && evaluate_template(probe.template,
duration: duration ? duration * 1000 : nil),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: It may be worth encoding the units of duration somewhere in the variable names (duration_ms?) or using an explicit type, since units are getting mixed and converted a bit and it seems easy to make a mistake...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The convention I use is when units are not specified, time is measured in seconds. In this particular case, DI specifies that duration in the template should be measured in milliseconds, but the variable is called duration. Thus I am not sure how to add a suffix here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even if the key itself in the data is duration, maybe our variable can be duration_s or duration_secs?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a comment stating the units of duration for the method parameter. I personally have the convention that time units without an explicitly stated unit are in seconds, which I believe is how everything in Ruby standard library works, and to me adding a suffix for seconds is redundant. But if you want a suffix let's go with whichever one you prefer.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I totally agree with @ivoanjo regarding naming with postfix values with different scale. Comments are nice, but code is better. Proper name will prevent from storing a wrong value scale in it, comment - not.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned in #4028 I would like to leave the time units alone until there is team consensus on how they should be indicated and all of the other pending DI code is merged.

Comment on lines +175 to +178
message = template.dup
vars.each do |key, value|
message.gsub!("{@#{key}}", value.to_s)
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: Not sure if worth it, but occurred to me that we could turn the template into an input for sprintf or String#% and thus we'd be able to template all vars at one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the message, there is actually an AST that is supposed to be evaluated which contains complex expressions. This evaluation will not be implemented for initial DI implementation. I have the above substitution as a "quick demo" of variable evaluation, but this entire implementation is temporary.

I made a note to revisit this part of the code but I think there won't be a large gain realized here with % because it'll still require some parsing or massaging and a straightforward string replacement should be very fast (and users should only have a couple of variables at most in their log message, normally).


def build_executed: (Probe probe, ?trace_point: untyped?, ?rv: untyped?, ?duration: untyped?, ?callers: untyped?, ?args: untyped?, ?kwargs: untyped?, ?serialized_entry_args: untyped?) -> untyped

def build_snapshot: (Probe probe, ?rv: untyped?, ?snapshot: untyped?, ?duration: untyped?, ?callers: untyped?, ?args: untyped?, ?kwargs: untyped?, ?serialized_entry_args: untyped?) -> { service: untyped, :"debugger.snapshot" => { id: untyped, timestamp: untyped, evaluationErrors: ::Array[untyped], probe: { id: untyped, version: 0, location: untyped }, language: "ruby", stack: untyped, captures: untyped }, duration: untyped, host: nil, logger: { name: untyped, method: untyped, thread_name: untyped, thread_id: untyped, version: 2 }, :"dd.trace_id" => 136035165280417366521542318182735500431, :"dd.span_id" => 17576285113343575026, ddsource: "dd_debugger", message: untyped, timestamp: untyped }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: This return looks quite hard to maintain as-is. Perhaps worth either generalizing (into just Hash) or introducing a class for this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Repaired.

@p-datadog p-datadog merged commit 5ed1f0f into master Oct 29, 2024
270 checks passed
@p-datadog p-datadog deleted the di-probe-notification-builder branch October 29, 2024 12:52
@github-actions github-actions bot added this to the 2.5.0 milestone Oct 29, 2024
p-datadog pushed a commit to p-datadog/dd-trace-rb that referenced this pull request Oct 29, 2024
* master:
  DEBUG-2334 Probe Notifier Worker component (DataDog#4028)
  DEBUG-2334 dynamic instrumentation probe notification builder (DataDog#4011)
  Handle low-level libddwaf exception in Context
  [NO-TICKET] Minor: Fix typos in safe_dup_spec.rb
  Remove libdatadog musl
  Remove ffi's  after installation
  Remove cached gems
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dev/internal Other internal work that does not need to be included in the changelog
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants