Skip to content

Conversation

poshinchen
Copy link
Contributor

@poshinchen poshinchen commented Oct 7, 2025

Description

  • Added timeToFirstByteMs into the spans and metrics
  • Updated semantic conventions for tool requests and response

Related Issues

#986

Documentation PR

N/A

Type of Change

New feature

Testing

How have you tested the change? Verify that the changes do not break functionality or introduce warnings in consuming repositories: agents-docs, agents-tools, agents-cli

  • I ran hatch run prepare

Checklist

  • I have read the CONTRIBUTING document
  • I have added any necessary tests that prove my fix is effective or my feature works
  • I have updated the documentation accordingly
  • I have added an appropriate example to the documentation to outline the feature, or no new docs are needed
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@poshinchen poshinchen marked this pull request as ready for review October 7, 2025 16:38
@poshinchen poshinchen requested a review from a team October 7, 2025 16:38
@dbschmigelski dbschmigelski added this to the p milestone Oct 7, 2025
@poshinchen poshinchen changed the title feat(telemetry): added timeToFirstByteMs into spans and metrics feat(telemetry): updated semantic conventions, added timeToFirstByteMs into spans and metrics Oct 8, 2025
Copy link

codecov bot commented Oct 8, 2025

Codecov Report

❌ Patch coverage is 80.48780% with 8 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
src/strands/telemetry/tracer.py 73.68% 3 Missing and 2 partials ⚠️
src/strands/event_loop/streaming.py 84.61% 1 Missing and 1 partial ⚠️
src/strands/telemetry/metrics.py 75.00% 0 Missing and 1 partial ⚠️

📢 Thoughts on this report? Let us know!


latencyMs: int
latencyMs: Required[int]
timeToFirstByteMs: int
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you edit to be as precise as possible in the description what timeToFirstByteMs actually means here It looks like the definition is actually ("contentBlockDelta" in chunk or "contentBlockStart" in chunk)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good callout. Should the checking logic look for the first chunk returned from the model provider, or for the first "contentBlockDelta"/"contentBlockStrat" from the model provider?

chunks = model.stream(messages, tool_specs if tool_specs else None, system_prompt)

async for event in process_stream(chunks):
async for event in process_stream(chunks, start_time):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok so we need start_time passed in rather than just starting it in process_stream because we are actually calling mode.stream in stream_messages?

I guess this comes down to clearly knowing what customers want, and is related to the comment about the Metrics type. Meaning should this go before BeforeModelCallEvent or not

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok so we need start_time passed in rather than just starting it in process_stream because we are actually calling mode.stream in stream_messages?

Yes

I guess this comes down to clearly knowing what customers want, and is related to the comment about the Metrics type. Meaning should this go before BeforeModelCallEvent or not

correct, the current implementation is to count the latency of sending / getting the first chunk from the model.


latencyMs: int
latencyMs: Required[int]
timeToFirstByteMs: int
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good callout. Should the checking logic look for the first chunk returned from the model provider, or for the first "contentBlockDelta"/"contentBlockStrat" from the model provider?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants