[Feature] Tracing: Fine-Grained Tracing for Request Latency Part1 #5417

xiaolei373 · 2025-12-07T15:47:10Z

Motivation

💡 If this PR is a Cherry Pick, the PR title needs to follow the format by adding the [Cherry-Pick] label at the very beginning and appending the original PR ID at the end. For example, [Cherry-Pick][CI] Add check trigger and logic(#5191)

Modifications

Usage or Command

Accuracy Tests

Checklist

- Add at least one tag in the PR title.
  - Tag list: [FDConfig], [APIServer], [Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]
  - You can add new tags based on the PR content, but the semantics must be clear.
- Format your code and run pre-commit before committing.
- Add unit tests. If there are none, state the reason in this PR.
- Provide accuracy results.
- If the current PR targets the release branch, make sure it has first been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.

paddle-bot · 2025-12-07T15:47:16Z

Thanks for your contribution!

codecov-commenter · 2025-12-07T17:02:36Z

Codecov Report

❌ Patch coverage is 66.15679% with 177 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (develop@c3a8a16). Learn more about missing BASE report.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| fastdeploy/metrics/trace.py | 71.39% | 87 Missing and 32 partials ⚠️ |
| fastdeploy/engine/common_engine.py | 6.25% | 15 Missing ⚠️ |
| fastdeploy/entrypoints/openai/api_server.py | 53.84% | 12 Missing ⚠️ |
| fastdeploy/entrypoints/openai/serving_chat.py | 40.00% | 10 Missing and 2 partials ⚠️ |
| ...astdeploy/entrypoints/openai/serving_completion.py | 40.00% | 10 Missing and 2 partials ⚠️ |
| fastdeploy/output/token_processor.py | 63.63% | 4 Missing ⚠️ |
| fastdeploy/engine/request.py | 33.33% | 2 Missing ⚠️ |
| fastdeploy/entrypoints/cli/tokenizer.py | 0.00% | 1 Missing ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##             develop    #5417   +/-   ##
==========================================
  Coverage           ?   59.25%           
==========================================
  Files              ?      327           
  Lines              ?    40915           
  Branches           ?     6225           
==========================================
  Hits               ?    24243           
  Misses             ?    14788           
  Partials           ?     1884

Flag	Coverage Δ
GPU	`59.25% <66.15%> (?)`

Copilot

Pull request overview

This PR introduces fine-grained distributed tracing for FastDeploy using OpenTelemetry to track request latency across different stages (preprocessing, scheduling, prefill, decode, postprocessing). This is Part 1 of the tracing implementation.
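As a rough illustration of what per-stage latency tracking means, the sketch below times each stage named in the overview with a standard-library context manager. It does not use OpenTelemetry and is not the PR's implementation; `StageRecorder`, `stage`, and `run_request` are hypothetical names.

```python
import time
from contextlib import contextmanager


class StageRecorder:
    """Hypothetical helper: collects (stage, duration_seconds) pairs for one request."""

    def __init__(self, request_id):
        self.request_id = request_id
        self.spans = []

    @contextmanager
    def stage(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record the duration even if the stage raises.
            self.spans.append((name, time.perf_counter() - start))


def run_request(request_id):
    rec = StageRecorder(request_id)
    for name in ("preprocessing", "scheduling", "prefill", "decode", "postprocessing"):
        with rec.stage(name):
            pass  # the real work for this stage would run here
    return rec
```

A real tracer additionally links these records into a parent/child tree and exports them, which is the part OpenTelemetry provides.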

Key Changes:

- Implemented comprehensive OpenTelemetry-based tracing infrastructure with span context propagation
- Added tracing integration points across API server, scheduler, and token processor
- Provided documentation and example configurations for Jaeger/OTel Collector setup
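Span context propagation generally means serializing the active trace and span IDs into a small dict (a "carrier") that travels with the request, then rebuilding the context on the receiving side. The sketch below shows the idea with the W3C Trace Context `traceparent` format using only the standard library. The helper names `inject` and `extract` here are hypothetical stand-ins; the PR itself imports `extract` from `opentelemetry.propagate` rather than parsing headers by hand.

```python
import re
import secrets

# Version "00", 32-hex trace id, 16-hex parent span id, 2-hex flags.
_TRACEPARENT = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def inject(trace_id, span_id, sampled=True):
    """Serialize the IDs into a carrier dict."""
    flags = "01" if sampled else "00"
    return {"traceparent": f"00-{trace_id}-{span_id}-{flags}"}


def extract(carrier):
    """Parse a carrier dict; return None when the header is missing or invalid."""
    m = _TRACEPARENT.match(carrier.get("traceparent", ""))
    if m is None:
        return None
    trace_id, span_id, flags = m.groups()
    # All-zero IDs are invalid in W3C Trace Context.
    if set(trace_id) == {"0"} or set(span_id) == {"0"}:
        return None
    return {
        "trace_id": trace_id,
        "parent_span_id": span_id,
        "sampled": bool(int(flags, 16) & 1),
    }


trace_id = secrets.token_hex(16)  # 32 hex characters
span_id = secrets.token_hex(8)    # 16 hex characters
carrier = inject(trace_id, span_id)
ctx = extract(carrier)
```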

Reviewed changes

Copilot reviewed 31 out of 32 changed files in this pull request and generated 14 comments.

| File | Description |
|---|---|
| `fastdeploy/metrics/trace.py` | New comprehensive tracing implementation with span management and context propagation |
| `tests/metrics/test_trace.py` | Extensive test coverage for tracing functionality |
| `fastdeploy/entrypoints/openai/api_server.py` | Integrated tracing initialization and span decorators |
| `fastdeploy/entrypoints/openai/serving_chat.py` | Added request tracing start/finish and postprocessing spans |
| `fastdeploy/entrypoints/openai/serving_completion.py` | Added request tracing start/finish and postprocessing spans |
| `fastdeploy/entrypoints/engine_client.py` | Added preprocessing span tracking |
| `fastdeploy/engine/common_engine.py` | Added scheduler span tracking and context propagation |
| `fastdeploy/output/token_processor.py` | Added prefill/decode span tracking |
| `fastdeploy/engine/request.py` | Added trace_carrier field to RequestOutput |
| `fastdeploy/envs.py` | Added OTLP exporter configuration variables |
| `docs/zh/observability/trace.md` | Chinese documentation for tracing features |
| `docs/observability/trace.md` | English documentation for tracing features |
| `examples/observability/` | Docker Compose examples for Prometheus, Grafana, Jaeger, and OTel Collector |
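The summary mentions OTLP exporter configuration variables in `fastdeploy/envs.py` but does not name them. As a hedged sketch, the snippet below reads the standard OpenTelemetry variables `OTEL_EXPORTER_OTLP_ENDPOINT`, `OTEL_EXPORTER_OTLP_PROTOCOL`, and `OTEL_SERVICE_NAME`. Whether FastDeploy uses these names or its own is an assumption, and `load_otlp_config` and the `"fastdeploy"` default service name are hypothetical.

```python
import os


def load_otlp_config(environ=None):
    """Hypothetical loader: read OTLP exporter settings from environment variables."""
    env = os.environ if environ is None else environ
    # "grpc" is this sketch's default; actual SDK defaults vary by language.
    protocol = env.get("OTEL_EXPORTER_OTLP_PROTOCOL", "grpc")
    # Conventional collector ports: 4317 for gRPC, 4318 for HTTP.
    default_endpoint = "http://localhost:4317" if protocol == "grpc" else "http://localhost:4318"
    return {
        "endpoint": env.get("OTEL_EXPORTER_OTLP_ENDPOINT", default_endpoint),
        "protocol": protocol,
        "service_name": env.get("OTEL_SERVICE_NAME", "fastdeploy"),  # hypothetical default
    }
```

Passing a plain dict instead of `os.environ` keeps the loader easy to unit-test without mutating the process environment.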

Copilot · 2025-12-08T03:44:06Z

fastdeploy/metrics/trace.py

+        }
+    )
+
+    # 统一填充 reqs_context 的 Root Span 信息


There's a Chinese comment in the code: `# 初始化用于存储 Upstream Context的变量` ("initialize the variable used to store the upstream context"). All comments in the codebase should be in English for consistency and maintainability. Please translate this and the other Chinese comments (lines 437, 480, 483, 501, 509, 513, 530) to English.

Suggested change

-    # 统一填充 reqs_context 的 Root Span 信息
+    # Consistently populate the Root Span information in reqs_context
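To locate the remaining non-English comments without relying on line numbers that shift between pushes, a small scan along these lines may help. It is a sketch that assumes the comments use characters from the CJK Unified Ideographs block; `find_cjk_comments` is a hypothetical helper, not part of the repository.

```python
import io
import re
import tokenize

CJK = re.compile(r"[\u4e00-\u9fff]")  # CJK Unified Ideographs block


def find_cjk_comments(source):
    """Return (line_number, comment_text) for comments containing CJK characters."""
    hits = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        # Only COMMENT tokens are checked, so string literals are left alone.
        if tok.type == tokenize.COMMENT and CJK.search(tok.string):
            hits.append((tok.start[0], tok.string))
    return hits


sample = (
    "x = 1  # plain comment\n"
    "# 统一填充 reqs_context 的 Root Span 信息\n"
    "y = '中文 in a string is ignored'\n"
)
```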

Copilot · 2025-12-08T03:44:06Z

tests/ci_use/EB_VL_Lite/test_EB_VL_Lite_serving.py

+    # with open(log_path, "w") as logfile:
+    with open(log_path, "w"):
        process = subprocess.Popen(
            cmd,
-            stdout=logfile,
-            stderr=subprocess.STDOUT,
+            # stdout=logfile,
+            # stderr=subprocess.STDOUT,


The stdout and stderr redirects have been commented out without explanation. This means the subprocess output is not being captured to the log file, which could make debugging failures difficult. Either restore the redirects or add a comment explaining why they were removed. If this change is intentional for debugging, it should not be in the final PR.
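For reference, a minimal sketch of the redirect pattern the comment asks to restore, with the child's stdout and stderr both written to one log file. The command and log path below are placeholders, not the test's real values, and this sketch waits for the child to exit, whereas the actual test launches a long-running server.

```python
import os
import subprocess
import sys
import tempfile


def run_with_log(cmd, log_path):
    # Keep the file handle open while the child runs so that both
    # stdout and stderr end up in the log.
    with open(log_path, "w") as logfile:
        process = subprocess.Popen(cmd, stdout=logfile, stderr=subprocess.STDOUT)
        return process.wait()


# Placeholder command: a child process that writes to both streams.
log_path = os.path.join(tempfile.mkdtemp(), "server.log")
cmd = [sys.executable, "-c", "import sys; print('out'); print('err', file=sys.stderr)"]
exit_code = run_with_log(cmd, log_path)
```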

Copilot · 2025-12-08T03:44:07Z

fastdeploy/output/token_processor.py

+                # trace_carrier = tracing.trace_get_proc_propagate_context(rid=rid)
+


Commented-out code should be removed. The line # trace_carrier = tracing.trace_get_proc_propagate_context(rid=rid) suggests incomplete implementation or debugging code that should either be removed or uncommented if it's needed.

Suggested change

-                # trace_carrier = tracing.trace_get_proc_propagate_context(rid=rid)

Copilot · 2025-12-08T03:44:07Z

fastdeploy/metrics/trace.py

@@ -0,0 +1,785 @@
+"""


The PR description is missing key information. According to the custom guidelines, the PR description should explain why these modifications are being made and what problem is being solved. The current description only lists checklist items. Please provide:

- The motivation for adding fine-grained tracing
- What problems this solves (e.g., debugging performance bottlenecks, request flow analysis)
- An overview of the implementation approach

Copilot · 2025-12-08T03:44:07Z

fastdeploy/metrics/trace.py

+        self._processor.force_flush(timeout_millis)
+
+
+def lable_span(request):


Function name has a spelling error: 'lable_span' should be 'label_span'.

Suggested change

-def lable_span(request):
+def label_span(request):

Copilot · 2025-12-08T03:44:08Z

fastdeploy/engine/common_engine.py

-        for task in tasks:
-            start_span_request("DEQUEUE", task, trace.SpanKind.CONSUMER)
+        # for task in tasks:
+        #     start_span_request("DEQUEUE", task, trace.SpanKind.CONSUMER)


This comment appears to contain commented-out code.

Suggested change

-        #     start_span_request("DEQUEUE", task, trace.SpanKind.CONSUMER)

Copilot · 2025-12-08T03:44:08Z

fastdeploy/entrypoints/openai/api_server.py

 from opentelemetry import trace
+from opentelemetry.propagate import extract

+import fastdeploy.metrics.trace as tracing


Module 'fastdeploy.metrics.trace' is imported with both 'import' and 'import from'.

Copilot · 2025-12-08T03:44:09Z

tests/metrics/test_trace.py

+import threading
+import time
+import unittest
+from unittest import mock


Module 'unittest' is imported with both 'import' and 'import from'.

Suggested change

-from unittest import mock
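If the `from unittest import mock` line is dropped as suggested, call sites would have to switch from `mock.patch` to `unittest.mock.patch`, and the submodule needs an explicit `import unittest.mock`, since `import unittest` alone is not guaranteed to load it. A minimal sketch of the single-style import, using `os.getcwd` as an arbitrary patch target:

```python
import io
import os
import unittest
import unittest.mock


class ExampleTest(unittest.TestCase):
    def test_patch(self):
        # Fully qualified access; no separate "from unittest import mock".
        with unittest.mock.patch("os.getcwd", return_value="/tmp/fake"):
            self.assertEqual(os.getcwd(), "/tmp/fake")


suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTest)
result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
```

Keeping `from unittest import mock` and dropping the plain `import unittest` is not an option in that file, because the test classes still need `unittest.TestCase`.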

Copilot · 2025-12-08T03:44:09Z

tests/metrics/test_trace.py

+        with mock.patch("fastdeploy.metrics.trace.logger"):
+            trace.process_tracing_init()
+            # Should log error but not crash
+            # Check if error was called (may not always be called depending on implementation)
+            pass


Unnecessary 'pass' statement.

Suggested change

-        with mock.patch("fastdeploy.metrics.trace.logger"):
-            trace.process_tracing_init()
-            # Should log error but not crash
-            # Check if error was called (may not always be called depending on implementation)
-            pass
+        with mock.patch("fastdeploy.metrics.trace.logger") as mock_logger:
+            trace.process_tracing_init()
+            # Should log error but not crash
+            # Check if error was called (may not always be called depending on implementation)
+            assert mock_logger.error.called
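The pattern in this suggestion, capturing the patched logger and asserting on it, can be tried in isolation with the sketch below. `init_tracing` is a hypothetical stand-in for the function under test, not FastDeploy's `process_tracing_init`, and it is written so that the error path is always taken for an empty endpoint.

```python
import logging
from unittest import mock

logger = logging.getLogger("trace_sketch")


def init_tracing(endpoint):
    """Hypothetical initializer: logs an error instead of raising on bad config."""
    if not endpoint:
        logger.error("tracing disabled: no OTLP endpoint configured")
        return False
    return True


def check_error_logged():
    # Patch only the logger's error method and keep a handle on the mock.
    with mock.patch.object(logger, "error") as mock_error:
        ok = init_tracing("")
    return ok, mock_error.called
```

If the real function only logs an error under some conditions, as the inline comment in the test hints, the assertion should be paired with a setup that forces that path, otherwise the test becomes flaky.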

Copilot · 2025-12-08T03:44:09Z

tests/metrics/test_trace.py

+
+            # Should log warnings but not crash
+            # Check if warning was called (may not always be called depending on implementation)
+            pass


Unnecessary 'pass' statement.

Suggested change

-            pass

Jiang-Jia-Jun requested a review from Copilot · December 8, 2025 03:40

Copilot started reviewing on behalf of Jiang-Jia-Jun · December 8, 2025 03:40

xiaolei373 force-pushed the support_tracing_feature_part1 branch 3 times, most recently from 16220c9 to 34ad14f · December 8, 2025 09:52

Commit bc1decd: [Feature]support trace part1

xiaolei373 force-pushed the support_tracing_feature_part1 branch from 34ad14f to bc1decd · December 8, 2025 10:50

2 participants