@satyadevai satyadevai commented Oct 30, 2025

In the existing Haystack instrumentation, tracing was applied only within the Pipeline execution path. The instrumentation wrapped Pipeline methods, so component methods such as run() or run_async() were wrapped dynamically only when they executed as part of a pipeline. This design traced components inside a pipeline properly, but a component invoked directly (outside a pipeline) was never traced, because its execution did not go through the pipeline wrapping mechanism. As a result, invoking Agent.run() directly created no spans at all, since the tracing logic was never attached to the Agent class itself.

Additionally, even when Agent.run() internally invoked Pipeline._run_component, the instrumentation still treated each sub-component call (LLM, tools, embeddings, or other components) as an independent span. Instead of being grouped under a parent Agent.run span, every internal component run appeared as a separate trace segment. This happened because no root span was created for Agent.run to serve as the parent of those internal operations: the pipeline instrumentation captured only the individual component spans, without establishing any hierarchical context.

Haystack registers every component in component.registry, so we can now wrap the run and run_async methods of all registered components at instrumentation time. This ensures that every component, including Agent, is traced consistently whether it runs independently or within a pipeline. It also enables a proper span hierarchy: Agent.run is created as the root span that encapsulates the nested component executions.
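For reviewers who want the idea in isolation, here is a minimal sketch of the registry-wrapping approach. It is not the code in this PR. The `registry` dict, the `component` decorator, `FakeGenerator`, `FakeAgent`, and the span bookkeeping are all hypothetical stand-ins, so the snippet runs without Haystack or a tracer installed. It only handles the synchronous `run` method.

```python
import contextvars
import functools

# Hypothetical stand-in for Haystack's component registry:
# maps a dotted class path to the component class.
registry = {}


def component(cls):
    """Toy decorator that registers a class, mimicking component registration."""
    registry[f"{cls.__module__}.{cls.__name__}"] = cls
    return cls


# Finished spans as (name, parent_name); a real tracer would export these.
finished_spans = []
_current_span = contextvars.ContextVar("current_span", default=None)


def traced(method):
    """Wrap a sync method so each call records a span under the active parent."""

    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        name = f"{type(self).__name__}.{method.__name__}"
        parent = _current_span.get()
        token = _current_span.set(name)
        try:
            return method(self, *args, **kwargs)
        finally:
            _current_span.reset(token)
            finished_spans.append((name, parent))

    wrapper._is_traced = True  # marker so we never wrap twice
    return wrapper


def instrument_registry():
    """Wrap `run` on every registered class, once, at instrumentation time."""
    for cls in registry.values():
        run = cls.__dict__.get("run")
        if callable(run) and not getattr(run, "_is_traced", False):
            setattr(cls, "run", traced(run))


@component
class FakeGenerator:
    def run(self, prompt):
        return {"reply": prompt.upper()}


@component
class FakeAgent:
    def __init__(self):
        self.llm = FakeGenerator()

    def run(self, prompt):
        # Called directly, outside any pipeline.
        return self.llm.run(prompt)


instrument_registry()
result = FakeAgent().run("hello")
```

Because the wrapper is attached to the class rather than applied during pipeline execution, the direct `FakeAgent().run(...)` call produces a parent span, and the nested generator call is recorded with that span as its parent.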

closes #2328

[5 images]

**No Changes detected in Pipeline.run**

[image]

Note

Wrap all registered Haystack components (including Agent) for tracing, update span names to reflect actual method (run/run_async), and add examples plus tests (with VCR cassettes).

  • Instrumentation:
    • Wrap all registered components from haystack.core.component.component.registry to trace run and run_async outside pipelines (includes Agent).
    • Change component span naming to ClassName.<method> using wrapped.__name__ (e.g., OpenAIChatGenerator.run_async, InMemoryBM25Retriever.run).
    • Ensure async/sync component methods are wrapped during pipeline execution as well.
  • Tests:
    • Add test_agent_run_component_spans (sync/async) validating parent Agent.<method> span and nested LLM/tool spans.
    • Add tests for individual components outside pipelines (InMemoryBM25Retriever.run/run_async).
    • Update expectations where spans now end with .run_async for async paths.
    • Add corresponding VCR cassettes for agent runs.
  • Examples:
    • New examples in examples/: agent_run.py, agent_with_pipeline.py, retriever_component_run.py demonstrating tracing setup and component/agent runs.

Written by Cursor Bugbot for commit 1cd56df. This will update automatically on new commits.
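The span naming change in the summary above (ClassName.<method>, taken from the wrapped callable's `__name__`) can be sketched as follows. This is an illustration under assumptions, not the PR's implementation: `wrap_component_method`, `span_names`, and `FakeRetriever` are hypothetical names, and a plain list stands in for real spans.

```python
import asyncio
import functools
import inspect

# Names of the spans that would be started, in call order.
span_names = []


def wrap_component_method(method):
    """Return a wrapper that names its span after the class and the wrapped method."""
    if inspect.iscoroutinefunction(method):

        @functools.wraps(method)
        async def async_wrapper(self, *args, **kwargs):
            # method.__name__ is "run_async" here, so async paths get their own suffix.
            span_names.append(f"{type(self).__name__}.{method.__name__}")
            return await method(self, *args, **kwargs)

        return async_wrapper

    @functools.wraps(method)
    def sync_wrapper(self, *args, **kwargs):
        span_names.append(f"{type(self).__name__}.{method.__name__}")
        return method(self, *args, **kwargs)

    return sync_wrapper


class FakeRetriever:
    def run(self, query):
        return {"documents": [query]}

    async def run_async(self, query):
        return {"documents": [query]}


# Wrap both entry points on the class itself.
for name in ("run", "run_async"):
    setattr(FakeRetriever, name, wrap_component_method(getattr(FakeRetriever, name)))

retriever = FakeRetriever()
retriever.run("q")
asyncio.run(retriever.run_async("q"))
```

Deriving the suffix from `__name__` instead of hard-coding `run` is what makes the async path show up as `.run_async`, which is why some existing test expectations had to change.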

@satyadevai satyadevai requested a review from a team as a code owner October 30, 2025 17:17
@dosubot dosubot bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Oct 30, 2025
@dosubot dosubot bot added size:XL This PR changes 500-999 lines, ignoring generated files. and removed size:L This PR changes 100-499 lines, ignoring generated files. labels Oct 31, 2025

Development

Successfully merging this pull request may close these issues.

[feature request] Add Agent.run root span and session hooks to Haystack instrumentation