Skip to content

Commit 1e0264d

Browse files
andrewm4894claude
andcommitted
refactor(otel): Add OtelInstrumentationPattern enum for provider pattern detection
- Add OtelInstrumentationPattern enum (V1_ATTRIBUTES, V2_TRACES_AND_LOGS) - Providers declare pattern via get_instrumentation_pattern() method - Add detect_provider() for centralized provider detection - Remove hardcoded @mastra/otel checks from transformer.py - Update README with Provider Reference section and pattern documentation - Add test_ingestion_parity.py for v1/v2 parity testing - Document Mastra behavior (no conversation history accumulation) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
1 parent e9f3cff commit 1e0264d

File tree

10 files changed

+799
-62
lines changed

10 files changed

+799
-62
lines changed

docker-compose.base.yml

Lines changed: 0 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -264,8 +264,6 @@ services:
264264
KAFKA_HOSTS: kafka:9092
265265
JWT_SECRET: '<randomly generated secret key>'
266266
KAFKA_TOPIC: logs_ingestion
267-
ports:
268-
- '4318:4318'
269267
networks:
270268
- otel_network
271269
- default

posthog/urls.py

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -212,9 +212,9 @@ def proxy_logs_to_capture_service(request: HttpRequest) -> HttpResponse:
212212
*ee_urlpatterns,
213213
# api
214214
# OpenTelemetry traces ingestion for LLM Analytics
215-
path("api/projects/<int:project_id>/ai/otel/v1/traces", csrf_exempt(otel_traces_endpoint)),
215+
path("api/projects/<int:project_id>/ai/otel/traces", csrf_exempt(otel_traces_endpoint)),
216216
# OpenTelemetry logs ingestion for LLM Analytics
217-
path("api/projects/<int:project_id>/ai/otel/v1/logs", csrf_exempt(otel_logs_endpoint)),
217+
path("api/projects/<int:project_id>/ai/otel/logs", csrf_exempt(otel_logs_endpoint)),
218218
# OpenTelemetry logs proxy to capture-logs service (legacy)
219219
path("i/v1/logs", proxy_logs_to_capture_service),
220220
path("api/environments/<int:team_id>/progress/", progress),

products/llm_analytics/backend/api/otel/README.md

Lines changed: 96 additions & 17 deletions
Original file line numberDiff line numberDiff line change
@@ -124,9 +124,15 @@ Determines event type based on span characteristics:
124124
- `$ai_trace`: Root spans (no parent) for v2 frameworks
125125
- `$ai_span`: All other spans, including root spans from v1 frameworks
126126

127-
**v1 Detection**: Checks for `prompt` or `completion` attributes OR framework scope name (e.g., `@mastra/otel`). v1 spans bypass the event merger.
127+
**Pattern Detection**: Uses `OtelInstrumentationPattern` enum to determine routing:
128128

129-
**Event Type Logic**: For v1 frameworks like Mastra, root spans are marked as `$ai_span` (not `$ai_trace`) to ensure they appear in the tree hierarchy. This is necessary because `TraceQueryRunner` filters out `$ai_trace` events from the events array.
129+
1. Provider declares pattern via `get_instrumentation_pattern()` (most reliable)
130+
2. Span has `prompt` or `completion` attributes (indicates V1 data present)
131+
3. Default to V2 (safer - waits for logs rather than sending incomplete)
132+
133+
V1 spans bypass the event merger and are sent immediately.
134+
135+
**Event Type Logic**: For V1 frameworks, root spans are marked as `$ai_span` (not `$ai_trace`) to ensure they appear in the tree hierarchy. This is necessary because `TraceQueryRunner` filters out `$ai_trace` events from the events array.
130136

131137
### logs_transformer.py
132138

@@ -153,12 +159,16 @@ Attribute extraction modules implementing semantic conventions:
153159

154160
**posthog_native.py**: Extracts PostHog-specific attributes prefixed with `posthog.ai.*`. These take precedence in the waterfall.
155161

156-
**genai.py**: Extracts OpenTelemetry GenAI semantic convention attributes (`gen_ai.*`). Handles indexed message fields by collecting attributes like `gen_ai.prompt.0.role` into structured message arrays. Supports provider-specific transformations for frameworks that use custom OTEL formats.
162+
**genai.py**: Extracts OpenTelemetry GenAI semantic convention attributes (`gen_ai.*`). Handles indexed message fields by collecting attributes like `gen_ai.prompt.0.role` into structured message arrays. Provides `detect_provider()` function for centralized provider detection. Supports provider-specific transformations for frameworks that use custom OTEL formats.
157163

158164
**providers/**: Framework-specific transformers for handling custom OTEL formats:
159165

160-
- **base.py**: Abstract base class defining the provider transformer interface (`can_handle()`, `transform_prompt()`, `transform_completion()`)
161-
- **mastra.py**: Transforms Mastra's wrapped message format (e.g., `{"messages": [...]}` for input, `{"text": "...", "files": [], ...}` for output) into standard PostHog format. Detected by instrumentation scope name `@mastra/otel`.
166+
- **base.py**: Abstract base class defining the provider transformer interface:
167+
- `can_handle()`: Detect if transformer handles this span
168+
- `transform_prompt()`: Transform provider-specific prompt format
169+
- `transform_completion()`: Transform provider-specific completion format
170+
- `get_instrumentation_pattern()`: Declare V1 or V2 pattern (returns `OtelInstrumentationPattern` enum)
171+
- **mastra.py**: Transforms Mastra's wrapped message format (e.g., `{"messages": [...]}` for input, `{"text": "...", "files": [], ...}` for output) into standard PostHog format. Detected by instrumentation scope name `@mastra/otel`. Declares `V1_ATTRIBUTES` pattern.
162172

163173
## Event Schema
164174

@@ -195,13 +205,13 @@ All events conform to the PostHog LLM Analytics schema:
195205

196206
## API Endpoints
197207

198-
**Traces**: `POST /api/projects/{project_id}/ai/otel/v1/traces`
208+
**Traces**: `POST /api/projects/{project_id}/ai/otel/traces`
199209

200210
- Content-Type: `application/x-protobuf`
201211
- Authorization: `Bearer {project_api_key}`
202212
- Accepts OTLP trace payloads
203213

204-
**Logs**: `POST /api/projects/{project_id}/ai/otel/v1/logs`
214+
**Logs**: `POST /api/projects/{project_id}/ai/otel/logs`
205215

206216
- Content-Type: `application/x-protobuf`
207217
- Authorization: `Bearer {project_api_key}`
@@ -228,12 +238,12 @@ from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExport
228238
from opentelemetry.exporter.otlp.proto.http._log_exporter import OTLPLogExporter
229239

230240
trace_exporter = OTLPSpanExporter(
231-
endpoint=f"{posthog_host}/api/projects/{project_id}/ai/otel/v1/traces",
241+
endpoint=f"{posthog_host}/api/projects/{project_id}/ai/otel/traces",
232242
headers={"Authorization": f"Bearer {api_key}"}
233243
)
234244

235245
log_exporter = OTLPLogExporter(
236-
endpoint=f"{posthog_host}/api/projects/{project_id}/ai/otel/v1/logs",
246+
endpoint=f"{posthog_host}/api/projects/{project_id}/ai/otel/logs",
237247
headers={"Authorization": f"Bearer {api_key}"}
238248
)
239249
```
@@ -252,24 +262,25 @@ The merger returns None on first arrival rather than blocking. This prevents the
252262

253263
v2 can send multiple log events in a single HTTP request. The ingestion layer groups these by (trace_id, span_id) and accumulates their properties before calling the merger. This prevents race conditions where partial log data gets merged before all logs arrive.
254264

255-
### v1/v2 Detection
265+
### Pattern Detection via Provider Transformers
256266

257-
Rather than requiring explicit configuration, the transformer auto-detects instrumentation version by:
267+
Rather than hardcoding framework names, the transformer uses a layered detection approach:
258268

259-
1. Checking for `prompt` or `completion` attributes (after extraction)
260-
2. Detecting framework via instrumentation scope name (e.g., `@mastra/otel`)
269+
1. **Provider declaration** (most reliable): Providers implement `get_instrumentation_pattern()` returning `OtelInstrumentationPattern.V1_ATTRIBUTES` or `V2_TRACES_AND_LOGS`
270+
2. **Content detection** (fallback): Span has `prompt` or `completion` attributes after extraction
271+
3. **Safe default**: Unknown providers default to V2 (waits for logs rather than sending incomplete events)
261272

262-
This allows both patterns to coexist without configuration, and supports frameworks that don't follow standard attribute conventions.
273+
This allows both patterns to coexist without configuration, and new providers only need to declare their pattern in one place.
263274

264275
### Provider Transformers
265276

266277
Some frameworks (like Mastra) wrap OTEL data in custom structures that don't match standard GenAI conventions. Provider transformers detect these frameworks (via instrumentation scope or attribute prefixes) and unwrap their data into standard format. This keeps framework-specific logic isolated while maintaining compatibility with the core transformer pipeline.
267278

268279
**Example**: Mastra wraps prompts as `{"messages": [{"role": "user", "content": [...]}]}` where content is an array of `{"type": "text", "text": "..."}` objects. The Mastra transformer unwraps this into standard `[{"role": "user", "content": "..."}]` format.
269280

270-
### Event Type Determination for v1 Frameworks
281+
### Event Type Determination for V1 Frameworks
271282

272-
v1 frameworks create root spans that should appear in the tree hierarchy alongside their children. These root spans are marked as `$ai_span` (not `$ai_trace`) because `TraceQueryRunner` filters out `$ai_trace` events from the events array. This ensures v1 framework traces display correctly with proper parent-child relationships in the UI.
283+
V1 frameworks create root spans that should appear in the tree hierarchy alongside their children. The `determine_event_type()` function checks `provider.get_instrumentation_pattern()` and marks V1 root spans as `$ai_span` (not `$ai_trace`) because `TraceQueryRunner` filters out `$ai_trace` events from the events array. This ensures V1 framework traces display correctly with proper parent-child relationships in the UI.
273284

274285
### TTL-Based Cleanup
275286

@@ -282,8 +293,9 @@ The event merger uses 60-second TTL on cache entries. This automatically cleans
282293
Create a new transformer in `conventions/providers/`:
283294

284295
```python
285-
from .base import ProviderTransformer
296+
from .base import OtelInstrumentationPattern, ProviderTransformer
286297
from typing import Any
298+
import json
287299

288300
class CustomFrameworkTransformer(ProviderTransformer):
289301
"""Transform CustomFramework's OTEL format."""
@@ -293,6 +305,12 @@ class CustomFrameworkTransformer(ProviderTransformer):
293305
scope_name = scope.get("name", "")
294306
return scope_name == "custom-framework-scope"
295307

308+
def get_instrumentation_pattern(self) -> OtelInstrumentationPattern:
309+
"""Declare V1 or V2 pattern - determines event routing."""
310+
# V1: All data in span attributes, send immediately
311+
# V2: Metadata in spans, content in logs, requires merge
312+
return OtelInstrumentationPattern.V1_ATTRIBUTES
313+
296314
def transform_prompt(self, prompt: Any) -> Any:
297315
"""Transform wrapped prompt format to standard."""
298316
if not isinstance(prompt, str):
@@ -369,6 +387,67 @@ Extend `build_event_properties()` in `transformer.py` to map additional attribut
369387
- **Memory**: Redis cache bounded by TTL (60s max retention)
370388
- **Concurrency**: Simple Redis operations enable fast merging with minimal race condition risk
371389

390+
## Provider Reference
391+
392+
Different LLM frameworks implement OTEL instrumentation with their own nuances. This section documents known provider behaviors to help understand what to expect from each.
393+
394+
### Mastra (`@mastra/otel`)
395+
396+
**Detection**: Instrumentation scope name `@mastra/otel` or `mastra.*` attribute prefix
397+
398+
**OTEL Pattern**: `V1_ATTRIBUTES` (all data in span attributes)
399+
400+
**Key Behaviors**:
401+
402+
- **No conversation history accumulation**: Each `agent.generate()` call creates a separate, independent trace. The `gen_ai.prompt` only contains that specific call's input (typically system message + current user message), not the accumulated conversation history from previous turns.
403+
- **Wrapped message format**: Prompts are JSON-wrapped as `{"messages": [{"role": "user", "content": [{"type": "text", "text": "..."}]}]}` where content is an array of typed objects.
404+
- **Wrapped completion format**: Completions are JSON-wrapped as `{"text": "...", "files": [], "warnings": [], ...}`.
405+
- **Multi-turn traces**: In a multi-turn conversation, you'll see multiple separate traces (one per `agent.generate()` call), each showing only that turn's input/output.
406+
407+
**Implications for PostHog**:
408+
409+
- Each turn appears as a separate trace in LLM Analytics
410+
- To see full conversation context, users need to look at the sequence of traces
411+
- The Mastra transformer unwraps the custom JSON format into standard PostHog message arrays
412+
413+
**Example**: A 4-turn conversation produces 4 traces, where turn 4's input only shows "Thanks, bye!" (not the previous greeting, weather query, and joke request).
414+
415+
### OpenTelemetry Instrumentation OpenAI v1 (`opentelemetry-instrumentation-openai`)
416+
417+
**Detection**: Span attributes with indexed prompt/completion fields (no custom provider transformer needed - uses standard GenAI conventions)
418+
419+
**OTEL Pattern**: `V1_ATTRIBUTES` (all data in span attributes)
420+
421+
**Key Behaviors**:
422+
423+
- **Full conversation in each call**: The `gen_ai.prompt.*` attributes contain all messages passed to the API call
424+
- **Indexed attributes**: Messages use `gen_ai.prompt.0.role`, `gen_ai.prompt.0.content`, etc.
425+
- **Direct attribute format**: No JSON wrapping, values are stored directly as span attributes
426+
427+
**Implications for PostHog**:
428+
429+
- If the application maintains conversation state, later turns show full history
430+
- Each trace is self-contained with complete context
431+
432+
### OpenTelemetry Instrumentation OpenAI v2 (`opentelemetry-instrumentation-openai-v2`)
433+
434+
**Detection**: Spans without prompt/completion attributes, accompanied by OTEL log events (no custom provider transformer needed - detected by absence of V1 content)
435+
436+
**OTEL Pattern**: `V2_TRACES_AND_LOGS` (traces + logs separated)
437+
438+
**Key Behaviors**:
439+
440+
- **Split data model**: Traces contain metadata (model, tokens, timing), logs contain message content
441+
- **Log events**: Uses `gen_ai.user.message`, `gen_ai.assistant.message`, `gen_ai.tool.message`, etc.
442+
- **Full conversation in each call**: Like v1, if the app maintains state, messages accumulate
443+
- **Requires merge**: PostHog's event merger combines traces and logs into complete events
444+
445+
**Implications for PostHog**:
446+
447+
- Slightly higher latency due to merge process
448+
- Supports streaming better than v1
449+
- Both traces and logs endpoints must be configured
450+
372451
## References
373452

374453
- [OpenTelemetry GenAI Semantic Conventions](https://opentelemetry.io/docs/specs/semconv/gen-ai/)

products/llm_analytics/backend/api/otel/conventions/genai.py

Lines changed: 32 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -11,9 +11,10 @@
1111
"""
1212

1313
from collections import defaultdict
14-
from typing import Any
14+
from typing import TYPE_CHECKING, Any
1515

16-
from .providers import PROVIDER_TRANSFORMERS
16+
if TYPE_CHECKING:
17+
from .providers.base import ProviderTransformer
1718

1819

1920
def has_genai_attributes(span: dict[str, Any]) -> bool:
@@ -65,6 +66,27 @@ def _extract_indexed_messages(attributes: dict[str, Any], prefix: str) -> list[d
6566
return messages if messages else None
6667

6768

69+
def detect_provider(span: dict[str, Any], scope: dict[str, Any] | None = None) -> "ProviderTransformer | None":
70+
"""
71+
Detect which provider transformer handles this span.
72+
73+
Args:
74+
span: Parsed OTEL span
75+
scope: Instrumentation scope info
76+
77+
Returns:
78+
Matching ProviderTransformer instance, or None if no provider matches
79+
"""
80+
from .providers import PROVIDER_TRANSFORMERS
81+
82+
scope = scope or {}
83+
for transformer_class in PROVIDER_TRANSFORMERS:
84+
transformer = transformer_class()
85+
if transformer.can_handle(span, scope):
86+
return transformer
87+
return None
88+
89+
6890
def extract_genai_attributes(span: dict[str, Any], scope: dict[str, Any] | None = None) -> dict[str, Any]:
6991
"""
7092
Extract GenAI semantic convention attributes from span.
@@ -90,18 +112,14 @@ def extract_genai_attributes(span: dict[str, Any], scope: dict[str, Any] | None
90112
result: dict[str, Any] = {}
91113

92114
# Detect provider-specific transformer
93-
provider_transformer = None
94-
for transformer_class in PROVIDER_TRANSFORMERS:
95-
transformer = transformer_class()
96-
if transformer.can_handle(span, scope):
97-
provider_transformer = transformer
98-
logger.info(
99-
"provider_transformer_detected",
100-
provider=transformer.get_provider_name(),
101-
scope_name=scope.get("name"),
102-
span_name=span.get("name"),
103-
)
104-
break
115+
provider_transformer = detect_provider(span, scope)
116+
if provider_transformer:
117+
logger.info(
118+
"provider_transformer_detected",
119+
provider=provider_transformer.get_provider_name(),
120+
scope_name=scope.get("name"),
121+
span_name=span.get("name"),
122+
)
105123

106124
# Model (prefer request, fallback to response, then system)
107125
model = (

products/llm_analytics/backend/api/otel/conventions/providers/__init__.py

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -5,7 +5,7 @@
55
specific OTEL format quirks and normalizes to PostHog format.
66
"""
77

8-
from .base import ProviderTransformer
8+
from .base import OtelInstrumentationPattern, ProviderTransformer
99
from .mastra import MastraTransformer
1010

1111
# Registry of all available provider transformers
@@ -16,6 +16,7 @@
1616
]
1717

1818
__all__ = [
19+
"OtelInstrumentationPattern",
1920
"ProviderTransformer",
2021
"MastraTransformer",
2122
"PROVIDER_TRANSFORMERS",

products/llm_analytics/backend/api/otel/conventions/providers/base.py

Lines changed: 39 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -3,12 +3,39 @@
33
44
Provider transformers handle framework/library-specific OTEL formats
55
and normalize them to PostHog's standard format.
6+
7+
When adding a new provider transformer, document these aspects:
8+
1. Detection method (scope name, attribute prefix, etc.)
9+
2. OTEL pattern (v1 attributes-only vs v2 traces+logs)
10+
3. Message format quirks (JSON wrapping, content arrays, etc.)
11+
4. Conversation history behavior (accumulated vs per-call)
12+
5. Any other notable behaviors
13+
14+
See mastra.py for an example of well-documented provider behavior.
615
"""
716

817
from abc import ABC, abstractmethod
18+
from enum import Enum
919
from typing import Any
1020

1121

22+
class OtelInstrumentationPattern(Enum):
23+
"""
24+
OTEL instrumentation patterns for LLM frameworks.
25+
26+
V1_ATTRIBUTES: All data (metadata + content) in span attributes
27+
- Send events immediately, no waiting for logs
28+
- Example: opentelemetry-instrumentation-openai, Mastra
29+
30+
V2_TRACES_AND_LOGS: Metadata in spans, content in separate log events
31+
- Requires event merger to combine traces + logs
32+
- Example: opentelemetry-instrumentation-openai-v2
33+
"""
34+
35+
V1_ATTRIBUTES = "v1_attributes"
36+
V2_TRACES_AND_LOGS = "v2_traces_and_logs"
37+
38+
1239
class ProviderTransformer(ABC):
1340
"""
1441
Base class for provider-specific OTEL transformers.
@@ -65,3 +92,15 @@ def get_provider_name(self) -> str:
6592
Human-readable provider name
6693
"""
6794
return self.__class__.__name__.replace("Transformer", "")
95+
96+
def get_instrumentation_pattern(self) -> OtelInstrumentationPattern:
97+
"""
98+
Get the OTEL instrumentation pattern this provider uses.
99+
100+
Override in subclass to declare the pattern. Default is V2_TRACES_AND_LOGS
101+
for safety - better to wait for logs than to send incomplete events.
102+
103+
Returns:
104+
The instrumentation pattern enum value
105+
"""
106+
return OtelInstrumentationPattern.V2_TRACES_AND_LOGS

products/llm_analytics/backend/api/otel/conventions/providers/mastra.py

Lines changed: 21 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -4,12 +4,28 @@
44
Handles Mastra's OTEL format which wraps messages in custom structures:
55
- Input: {"messages": [{"role": "user", "content": [...]}]}
66
- Output: {"files": [], "text": "...", "warnings": [], ...}
7+
8+
Provider Behavior Notes:
9+
------------------------
10+
Mastra uses the @mastra/otel instrumentation scope and sends OTEL data in v1 pattern
11+
(all data in span attributes, no separate log events).
12+
13+
Key characteristic: Mastra does NOT accumulate conversation history across calls.
14+
Each `agent.generate()` call creates a separate, independent trace containing only
15+
that turn's input (system message + current user message) and output. This means:
16+
17+
- A 4-turn conversation produces 4 separate traces
18+
- Turn 4's trace only shows "Thanks, bye!" as input, not previous turns
19+
- To see full conversation context, users must look at the sequence of traces
20+
21+
This is expected Mastra behavior, not a limitation of our ingestion. The framework
22+
treats each generate() call as an independent operation.
723
"""
824

925
import json
1026
from typing import Any
1127

12-
from .base import ProviderTransformer
28+
from .base import OtelInstrumentationPattern, ProviderTransformer
1329

1430

1531
class MastraTransformer(ProviderTransformer):
@@ -36,6 +52,10 @@ def can_handle(self, span: dict[str, Any], scope: dict[str, Any]) -> bool:
3652
attributes = span.get("attributes", {})
3753
return any(key.startswith("mastra.") for key in attributes.keys())
3854

55+
def get_instrumentation_pattern(self) -> OtelInstrumentationPattern:
56+
"""Mastra uses v1 pattern - all data in span attributes."""
57+
return OtelInstrumentationPattern.V1_ATTRIBUTES
58+
3959
def transform_prompt(self, prompt: Any) -> Any:
4060
"""
4161
Transform Mastra's wrapped input format.

0 commit comments

Comments
 (0)