@imran-at-datadog imran-at-datadog commented Sep 26, 2025

Description

MLOB-3969 Add user metadata to root span tags
MLOB-3980 Tag the entry point on the root span
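
As a rough illustration of the first item, the sketch below flattens a job's user metadata into flat string tags of the kind that could be set on a root span. The `ray.job.metadata` prefix and the `metadata_to_tags` helper are assumptions made for this example; they are not names confirmed by this PR.

```python
import json
from typing import Dict

# Hypothetical tag prefix, chosen only for this illustration.
METADATA_TAG_PREFIX = "ray.job.metadata"


def metadata_to_tags(metadata_json: str, prefix: str = METADATA_TAG_PREFIX) -> Dict[str, str]:
    """Turn a --metadata-json payload into flat string tags."""
    metadata = json.loads(metadata_json)
    if not isinstance(metadata, dict):
        raise ValueError("metadata must be a JSON object")
    # One tag per metadata key; values are stringified so every tag is a string.
    return {f"{prefix}.{key}": str(value) for key, value in metadata.items()}


tags = metadata_to_tags('{"job_name": "train_my_model", "test": "1"}')
print(tags)
```

The payload used here is the same one passed to `ray job submit --metadata-json` in the Testing section.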

Testing

pip install .
export DD_SERVICE="imran-ray-metadata-test-014"
RAY_LOGGING_CONFIG_ENCODING=JSON DD_ENV=dev ray start --head --dashboard-host=127.0.0.1 --tracing-startup-hook=ddtrace.contrib.ray:setup_tracing
ray job submit --metadata-json='{"job_name": "train_my_model", "test":"1"}' --submission-id="imran-ray-metadata-test-014" -- python /Users/imran.hendley/go/src/github.com/DataDog/dd-trace-py/tests/contrib/ray/jobs/simple_task.py arg1 --arg2=value2
[Image: Screenshot 2025-10-01 at 4 45 38 PM]

And running again with DD_RAY_REDACT_ENTRYPOINT_PATHS set to false results in an unredacted path in the entrypoint:

[Image: Screenshot 2025-10-01 at 4 50 23 PM]
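
To make the effect of `DD_RAY_REDACT_ENTRYPOINT_PATHS` more concrete, here is a minimal sketch of one way path redaction could work. It assumes that redaction means replacing each path-like argument with its base name, and that the setting defaults to enabled; both are assumptions for illustration, and `redact_entrypoint` and `redaction_enabled` are hypothetical helpers, not functions from the integration.

```python
import os
import posixpath
import shlex


def redaction_enabled(env=os.environ) -> bool:
    """Read the setting; the default and accepted values are assumptions."""
    value = env.get("DD_RAY_REDACT_ENTRYPOINT_PATHS", "true")
    return value.strip().lower() not in ("false", "0")


def _basename(path: str) -> str:
    # Fall back to the original text if stripping leaves nothing (e.g. "/").
    return posixpath.basename(path.rstrip("/")) or path


def _redact_token(token: str) -> str:
    # Handle --key=/some/path style arguments by redacting only the value.
    key, sep, value = token.partition("=")
    if sep and "/" in value:
        return f"{key}={_basename(value)}"
    if "/" in token:
        return _basename(token)
    return token


def redact_entrypoint(entrypoint: str, redact: bool = True) -> str:
    """Replace path-like arguments with their base name (assumed behavior)."""
    if not redact:
        return entrypoint
    return shlex.join(_redact_token(t) for t in shlex.split(entrypoint))


print(redact_entrypoint("python /home/user/jobs/simple_task.py arg1 --arg2=value2"))
# prints: python simple_task.py arg1 --arg2=value2
```

Passing `redact=False` returns the entrypoint unchanged, which mirrors the unredacted result described above.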

Risks

None

Additional Notes

Do we need to support recreating these tags in RaySpanManager._recreate_job_span?

  • ANSWER: No, the span with tags is copied already.

dubloom and others added 30 commits August 25, 2025 18:51
Right now [the DJM intake expects Ray
spans](https://github.com/DataDog/logs-backend/blob/79793e12095e033e3998ff6318416c5db0507907/domains/apm/apps/apm-processing/src/main/java/com/dd/logs/processing/processors/track/spans/JobSpansProcessor.java#L28)
to have span type `producer` or `consumer`. It used to be `ray.producer`
or `ray.consumer`, but after discussing last week we agreed to remove
the `ray.` prefix to more closely match the spans produced by Ray's
OpenTelemetry instrumentation. Our Ray integration [currently produces
spans of three
types](https://dd.datad0g.com/internal/events-ui/queries?group_by=type&index_name=djm-search&query_string=%40component%3Aray&query_type=aggregate&timerange=1755708134662-1756312934662l&track=trace):
`serving`, `worker`, and `ml`. In this PR I am making it replace
`serving` with `producer`, and `worker` and `ml` with `consumer` for
now, just so the DJM intake recognizes that it needs to pick them up.
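
The remapping described above can be sketched as a simple lookup. This is a standalone illustration, not the integration's code: `SPAN_TYPE_REMAP`, `remap_span_type`, and `intake_accepts` are hypothetical names.

```python
# Span types the DJM intake is described as expecting.
INTAKE_SPAN_TYPES = {"producer", "consumer"}

# Old integration span types mapped to the types the intake recognizes.
SPAN_TYPE_REMAP = {
    "serving": "producer",
    "worker": "consumer",
    "ml": "consumer",
}


def remap_span_type(span_type: str) -> str:
    """Return the remapped span type, leaving unknown types untouched."""
    return SPAN_TYPE_REMAP.get(span_type, span_type)


def intake_accepts(span_type: str) -> bool:
    """Check whether a span type would be picked up by the intake."""
    return span_type in INTAKE_SPAN_TYPES


for old_type in ("serving", "worker", "ml"):
    new_type = remap_span_type(old_type)
    print(old_type, "->", new_type, intake_accepts(new_type))
```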

For testing, I [opened this file in my local
dd-source](https://github.com/DataDog/dd-source/blob/d67d0dd42507de7ab369761afa1b15e4652bed20/domains/data_science/apps/ray-cluster/image/aip-practice/aip-tracing/Dockerfile#L17)
and replaced `dubloom/ray-integration` with
`yakov.shapiro/MLOB-3768/update-span-type`, the name of this branch. I
then followed [the steps from this comment on
MLOB-3676](https://datadoghq.atlassian.net/browse/MLOB-3676?focusedCommentId=2568529).
I verified that the type on the resulting spans [is now set to
ray](https://dd.datad0g.com/internal/events-ui/queries?group_by=job_name&index_name=djm-search&query_string=%40component%3Aray&query_type=list&timerange=1756404208851-1756418608851&track=trace).

## Checklist
- [x] PR author has checked that all the criteria below are met
- The PR description includes an overview of the change
- The PR description articulates the motivation for the change
- The change includes tests OR the PR description describes a testing
strategy
- The PR description notes risks associated with the change, if any
- Newly-added code is easy to change
- The change follows the [library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
- The change includes or references documentation updates if necessary
- Backport labels are set (if
[applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))

## Reviewer Checklist
- [x] Reviewer has checked that all the criteria below are met 
- Title is accurate
- All changes are related to the pull request's stated goal
- Avoids breaking
[API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces)
changes
- Testing strategy adequately addresses listed risks
- Newly-added code is easy to change
- Release note makes sense to a user of the library
- If necessary, author has acknowledged and discussed the performance
implications of this PR as reported in the benchmarks PR comment
- Backport labels are set in a manner that is consistent with the
[release branch maintenance
policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)

## Overview
This change captures the host name, which, in conjunction with the
process ID, will provide GPU utilization information.


Base automatically changed from dubloom/ray-v0 to main October 2, 2025 07:29
@dubloom dubloom requested review from a team as code owners October 2, 2025 07:29
@dubloom dubloom left a comment

small nits but we are almost good to go.

@dubloom dubloom left a comment

LGTM, thanks for addressing all my comments!

@dubloom dubloom enabled auto-merge (squash) October 3, 2025 09:27
@dubloom dubloom merged commit d807ee8 into main Oct 3, 2025
437 checks passed
@dubloom dubloom deleted the imran-hendley/ray-root-span-metadata branch October 3, 2025 10:50
Labels

changelog/no-changelog A changelog entry is not required for this PR.

5 participants