Open
Description
User request
MCP tools do not report failed status in tracing in case of errors; they always appear successful. The shell tool is properly displayed as failed when errors occur.
Summary
MCP tool spans are marked success even when the tool fails (either transport error or logical failure). Shell tool spans correctly show failure. This leads to misleading traces.
Root cause (from research)
- Tool error states are normalized in `packages/platform-server/src/llm/reducers/callTools.llm.reducer.ts` within `executeToolCall(...)`.
- For shell tools, non-zero exit mapping leads to error status at the tracing layer.
- For MCP tools, exceptions or `isError: true` responses are converted into `ToolCallResult` with `status: 'error'`, but the span status is not being set to ERROR at the tracing boundary. As a result, spans end as OK/success.
Proposed fix
Implement span status/exception handling at the orchestration point where tool results are normalized:
- Location: `packages/platform-server/src/llm/reducers/callTools.llm.reducer.ts` in `executeToolCall(...)`.
- Behavior:
  - If `tool.execute(...)` throws (e.g., `McpError`):
    - record exception on the tool span (message, type, stack if allowed)
    - set span status to ERROR
    - attach attributes: `tool.name`, `tool.call_id`, `error.type`, `error.message`, optional `error.stack`
  - If no exception but `response.status === 'error'` (logical failure):
    - set span status to ERROR
    - add an event (e.g., `tool.error`) with `error_code` and `message`
    - attach attributes like `tool.error_code` and `tool.retriable` (if available)
  - MCP-specific: when `err instanceof McpError`, include additional attributes like `mcp.error_code` (if present) and `tool.source = "mcp"`.
- Optional: add lower-level spans in `packages/platform-server/src/nodes/mcp/localMcpServer.node.ts` `callTool(...)` for richer context, but the primary correctness fix must be in the reducer so logical failures are captured.
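The behavior above could be sketched roughly as follows. This is a minimal illustration, not the reducer's actual code: `SpanLike` and `SpanStatusCode` are local stand-ins that mirror the shape of the OpenTelemetry span API, the `ToolCallResult` fields (`errorCode`, `message`, `retriable`) and the `ToolCtx` type are assumed, and `isMcpLikeError` is a hypothetical structural check standing in for `err instanceof McpError`.

```typescript
// Local stand-ins mirroring the OpenTelemetry span API shape (assumption:
// the platform's tracing layer exposes equivalent methods).
enum SpanStatusCode { UNSET = 0, OK = 1, ERROR = 2 }
type Attrs = Record<string, string | number | boolean>;

interface SpanLike {
  setStatus(status: { code: SpanStatusCode; message?: string }): void;
  setAttributes(attrs: Attrs): void;
  recordException(err: Error): void;
  addEvent(name: string, attrs?: Attrs): void;
}

// Assumed shape; the real ToolCallResult may carry different fields.
interface ToolCallResult {
  status: 'success' | 'error';
  errorCode?: string;
  message?: string;
  retriable?: boolean;
}

// Hypothetical per-call context passed in by the reducer.
interface ToolCtx { name: string; callId: string; includeStack?: boolean }

// Stand-in for `err instanceof McpError`; assumes the error sets
// name = 'McpError' and exposes a `code` property.
function isMcpLikeError(err: unknown): err is Error & { code: number | string } {
  return err instanceof Error && err.name === 'McpError' && 'code' in err;
}

// Thrown path: record the exception, set ERROR, attach error attributes.
function markSpanThrown(span: SpanLike, err: unknown, ctx: ToolCtx): void {
  const error = err instanceof Error ? err : new Error(String(err));
  span.recordException(error);
  span.setStatus({ code: SpanStatusCode.ERROR, message: error.message });
  const attrs: Attrs = {
    'tool.name': ctx.name,
    'tool.call_id': ctx.callId,
    'error.type': error.name,
    'error.message': error.message,
  };
  if (ctx.includeStack && error.stack) attrs['error.stack'] = error.stack;
  if (isMcpLikeError(error)) {
    attrs['tool.source'] = 'mcp';
    attrs['mcp.error_code'] = error.code;
  }
  span.setAttributes(attrs);
}

// Logical-failure path: no exception, but the normalized result is an error.
function markSpanLogicalFailure(span: SpanLike, result: ToolCallResult, ctx: ToolCtx): void {
  if (result.status !== 'error') return; // success: leave span status untouched
  span.setStatus({
    code: SpanStatusCode.ERROR,
    message: result.message ?? 'tool returned error status',
  });
  const event: Attrs = {};
  const attrs: Attrs = { 'tool.name': ctx.name, 'tool.call_id': ctx.callId };
  if (result.errorCode !== undefined) {
    event['error_code'] = result.errorCode;
    attrs['tool.error_code'] = result.errorCode;
  }
  if (result.message !== undefined) event['message'] = result.message;
  if (result.retriable !== undefined) attrs['tool.retriable'] = result.retriable;
  span.addEvent('tool.error', event);
  span.setAttributes(attrs);
}
```

In `executeToolCall(...)`, `markSpanThrown` would presumably be called from the `catch` around `tool.execute(...)`, and `markSpanLogicalFailure` on the normalized result before the span ends, so both failure paths reach the tracing boundary.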
Acceptance criteria
- Any MCP tool failure (exception or logical failure) results in a span with status ERROR.
- Shell tool spans continue to report ERROR on non-zero exit codes (no regression).
- Spans include meaningful error attributes and an exception event for thrown cases.
- Run-event records continue to reflect correct tool_execution error status.
Validation plan
- Add tests that exercise: (1) MCP tool throwing (`McpError`), (2) MCP logical failure path, (3) shell non-zero exit regression.
- Tests assert that tool_execution end state is error and that exported spans have status ERROR with appropriate attributes/events (use a test exporter or tracing abstraction as available).
- Manual verification: run platform with tracing exporter enabled, trigger a failing MCP tool and a failing shell command, and confirm spans are marked ERROR with expected metadata.
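If no test exporter is readily available, the span assertions could be made against a small in-memory recorder. The sketch below is one possible shape, not existing test tooling: `RecordingSpan` and `spanFailureProblems` are hypothetical names, and `SpanStatusCode` is a local stand-in mirroring the OpenTelemetry enum.

```typescript
// Local stand-in mirroring the OpenTelemetry status enum (assumption).
enum SpanStatusCode { UNSET = 0, OK = 1, ERROR = 2 }
type Attrs = Record<string, string | number | boolean>;

// Hypothetical in-memory span that records what the code under test did.
class RecordingSpan {
  status: SpanStatusCode = SpanStatusCode.UNSET;
  attributes: Attrs = {};
  events: string[] = [];
  exceptions: Error[] = [];

  setStatus(s: { code: SpanStatusCode; message?: string }): void {
    this.status = s.code;
  }
  setAttributes(a: Attrs): void {
    Object.keys(a).forEach((k) => {
      const v = a[k];
      if (v !== undefined) this.attributes[k] = v;
    });
  }
  addEvent(name: string): void {
    this.events.push(name);
  }
  recordException(err: Error): void {
    this.exceptions.push(err);
  }
}

// Returns a list of problems; an empty list means the span looks like a
// correctly reported tool failure.
function spanFailureProblems(
  span: RecordingSpan,
  requiredAttrs: string[],
  opts: { expectException: boolean },
): string[] {
  const problems: string[] = [];
  if (span.status !== SpanStatusCode.ERROR) problems.push('status is not ERROR');
  for (const key of requiredAttrs) {
    if (!(key in span.attributes)) problems.push('missing attribute ' + key);
  }
  if (opts.expectException && span.exceptions.length === 0) {
    problems.push('no exception recorded');
  }
  return problems;
}
```

Each of the three test cases would hand a `RecordingSpan` to the code under test and assert that `spanFailureProblems(...)` returns an empty list, with `expectException: true` only for the thrown `McpError` case.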
Affected code
- Primary: `packages/platform-server/src/llm/reducers/callTools.llm.reducer.ts`
- MCP node (optional/richer): `packages/platform-server/src/nodes/mcp/localMcpServer.node.ts`
- Types: `McpError`, `ToolCallResult`, `ToolExecStatus`, `ToolCallErrorCode`
Notes
- Ensure consistency with existing shell tool error mapping.
- Keep error messages sanitized and avoid leaking sensitive data in spans.
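One way to keep span messages sanitized is to redact obvious credentials and cap the length before a message is attached as `error.message`. The helper below is only a sketch: `sanitizeErrorMessage`, its patterns, and the default length limit are all assumptions, and a real deployment would likely need a broader redaction policy.

```typescript
// Hypothetical patterns; extend to match whatever secrets the tools can emit.
const BEARER_PATTERN = /\bBearer\s+[A-Za-z0-9._~+\/=-]+/g;
const KEY_VALUE_PATTERN = /\b(token|password|secret|api[_-]?key)\b(\s*[:=]\s*)\S+/gi;

// Redacts bearer tokens and key=value style secrets, then truncates.
function sanitizeErrorMessage(message: string, maxLength: number = 200): string {
  const redacted = message
    .replace(BEARER_PATTERN, 'Bearer [REDACTED]')
    .replace(KEY_VALUE_PATTERN, '$1$2[REDACTED]');
  return redacted.length > maxLength
    ? redacted.slice(0, maxLength) + '...'
    : redacted;
}
```

Applying the helper at the point where attributes are built, rather than in each tool, would keep the shell and MCP paths consistent.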