Production-grade observability for voice AI pipelines.
VoiceMon implements the 4-Layer Voice Observability Framework to give you full-stack monitoring of voice agents built on LiveKit, Pipecat, or Vapi, from network jitter to task completion.

┌─────────────────────────────────────────────────────────────────────┐
│                          YOUR VOICE AGENT                           │
│                  (LiveKit Agents / Pipecat / Vapi)                  │
├────────────────────────────┬────────────────────────────────────────┤
│  voicemon SDK              │  1-line integration                    │
│  ┌──────────────────────┐  │  instrument_livekit(session, col)      │
│  │  VoiceMonCollector   │──────▶ Redis Streams ──▶ Worker           │
│  └──────────────────────┘  │           │              │             │
│             │ OTel Spans   │           ▼              ▼             │
│             ▼              │      TimescaleDB    Prometheus         │
│        Jaeger/Tempo        │           │              │             │
│                            │           ▼              ▼             │
│                            │       Streamlit       Grafana          │
│                            │      (Analytics)       (Ops)           │
│                            │           │              │             │
│                            │           └──────┬───────┘             │
│                            │                  ▼                     │
│                            │          Slack / PagerDuty             │
│                            │           (Alert Routing)              │
└────────────────────────────┴────────────────────────────────────────┘

| Layer | What it captures | VoiceMon metrics |
|---|---|---|
| L1: Infrastructure | Network, codec, transport | Jitter, packet loss, MOS score, bitrate |
| L2: Execution | STT → LLM → TTS pipeline | Per-stage latency, confidence, tokens, TTFT/TTFB |
| L3: User Experience | Perceived quality | E2E latency, interruptions, silence ratio, TTFW |
| L4: Outcome | Business results | Task success, CSAT, resolution type, cost |
pip install voicemon # Core SDK
pip install voicemon[livekit] # + LiveKit integration
pip install voicemon[pipecat] # + Pipecat integration
pip install voicemon[vapi] # + Vapi integration
pip install voicemon[dashboard] # + Streamlit dashboard
pip install voicemon[all] # Everything

from livekit.agents import AgentSession
from voicemon import VoiceMonCollector
from voicemon.integrations.livekit import instrument_livekit
collector = VoiceMonCollector()
session = AgentSession(...)
# One-line instrumentation
session_id = instrument_livekit(session, collector, agent_id="my-agent")

from pipecat.pipeline.pipeline import Pipeline
from voicemon import VoiceMonCollector
from voicemon.integrations.pipecat import VoiceMonPipecatObserver
collector = VoiceMonCollector()
observer = VoiceMonPipecatObserver(collector, agent_id="my-pipecat-agent")
pipeline = Pipeline([...], observers=[observer])

from fastapi import FastAPI
from voicemon import VoiceMonCollector
from voicemon.integrations.vapi import create_vapi_router
app = FastAPI()
collector = VoiceMonCollector()
app.include_router(create_vapi_router(collector, webhook_secret="your-secret"))

from voicemon.integrations.vapi import VapiClient
client = VapiClient(api_key="your-key", collector=collector)
session_ids = await client.poll_recent_calls(minutes=5)

# Start full observability stack
docker compose up -d
# Initialize the database schema
docker compose exec timescale psql -U voicemon -d voicemon -f /schema/schema.sql
# Access dashboards
open http://localhost:3000 # Grafana (admin/voicemon)
open http://localhost:8501 # Streamlit Analytics

This starts:
- TimescaleDB: time-series + relational storage for sessions/turns/events
- Redis: stream-based event ingestion with consumer groups
- Prometheus: SLO metrics and alerting
- Grafana: operational dashboards with pre-built voice panels
- Streamlit: analytical dashboard with call replay + drift detection
- Worker: async Redis consumer for aggregation and alert evaluation

Generate realistic voice call telemetry to see the dashboards in action:
# Console output (no infrastructure needed)
python -m tests.demo_simulator --sessions 50
# With Redis export (requires docker compose)
python -m tests.demo_simulator --sessions 100 --redis

VoiceMon uses environment variables or VoiceMonConfig:
from voicemon.core.config import VoiceMonConfig
config = VoiceMonConfig(
    redis={"url": "redis://localhost:6379/0"},
    timescale={"dsn": "postgresql://voicemon:voicemon@localhost:5432/voicemon"},
    slack={"webhook_url": "https://hooks.slack.com/..."},
    pagerduty={"routing_key": "your-routing-key"},
)

| Metric | OK | Warning | Critical |
|---|---|---|---|
| E2E Latency | < 800ms | < 1200ms | > 1800ms |
| STT Latency | < 300ms | < 500ms | > 1000ms |
| LLM TTFT | < 500ms | < 800ms | > 2000ms |
| TTS TTFB | < 200ms | < 300ms | > 800ms |
| STT Confidence | > 0.85 | > 0.7 | < 0.5 |
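
As a rough illustration of how the thresholds in this table could be applied to a single measurement, here is a minimal sketch. The `Threshold` class, the `THRESHOLDS` dictionary, its metric keys, and `classify` are hypothetical names for this example, not part of the voicemon API. The table does not name the band between the warning and critical bounds, so the sketch reports it as `"elevated"`, which is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    """Bounds for one metric (hypothetical helper, not a voicemon class)."""
    ok: float
    warning: float
    critical: float
    higher_is_worse: bool = True  # False for scores such as STT confidence


# Values copied from the threshold table; the metric keys are assumed names.
THRESHOLDS = {
    "e2e_latency_ms": Threshold(800, 1200, 1800),
    "stt_latency_ms": Threshold(300, 500, 1000),
    "llm_ttft_ms": Threshold(500, 800, 2000),
    "tts_ttfb_ms": Threshold(200, 300, 800),
    "stt_confidence": Threshold(0.85, 0.7, 0.5, higher_is_worse=False),
}


def classify(metric: str, value: float) -> str:
    """Map a measurement to 'ok', 'warning', 'elevated', or 'critical'."""
    t = THRESHOLDS[metric]
    ok, warning, critical = t.ok, t.warning, t.critical
    if not t.higher_is_worse:
        # Negate everything so "smaller is better" comparisons apply uniformly.
        value, ok, warning, critical = -value, -ok, -warning, -critical
    if value < ok:
        return "ok"
    if value < warning:
        return "warning"
    if value > critical:
        return "critical"
    # Assumption: the unnamed band between warning and critical.
    return "elevated"
```

With these bounds, `classify("e2e_latency_ms", 650)` falls in the OK band and `classify("stt_confidence", 0.4)` falls in the critical band.
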
Alerts are defined in alert_rules.yaml with severity-based routing:
- Info/Warning → Slack #voicemon-alerts
- Critical → Slack + @oncall mention
- P0 → Slack + PagerDuty incident

rules:
  - name: e2e_latency_critical
    metric: e2e_latency_ms
    operator: ">"
    threshold: 1800
    severity: p0
    cooldown_minutes: 2
    notify: [slack, pagerduty]

Voice Agent (LiveKit/Pipecat/Vapi)
                │
                ▼
     VoiceMonCollector (SDK)
         │             │
         ▼             ▼
        OTel       Exporters
       Spans           │
         │         ┌───┴────┐
         ▼         ▼        ▼
       Jaeger    Redis  Prometheus
                Streams  /metrics
                   │        │
                   ▼        ▼
              ┌─────────┐  Grafana
              │ Worker  │  (Ops Dashboard)
              │ Process │
              └────┬────┘
                   │
            ┌──────┴───────┐
            ▼              ▼
       TimescaleDB   Alert Engine
            │              │
            ▼          ┌───┴────┐
        Streamlit      ▼        ▼
       (Analytics)   Slack  PagerDuty
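
To make the Alert Engine step more concrete, the following is a minimal sketch of threshold rules with a per-rule cooldown, using the same fields as the `e2e_latency_critical` rule. The `Rule` and `CooldownEvaluator` names are hypothetical and the logic is an assumption about how such an engine might behave, not the actual code in `alerts/engine.py`.

```python
import operator as op
from dataclasses import dataclass

# Comparison operators a rule may reference (assumed set).
OPS = {">": op.gt, "<": op.lt, ">=": op.ge, "<=": op.le}


@dataclass
class Rule:
    """One alert rule, mirroring the YAML fields (hypothetical class)."""
    name: str
    metric: str
    operator: str
    threshold: float
    severity: str
    cooldown_minutes: float
    notify: list


class CooldownEvaluator:
    """Fires a rule at most once per cooldown window."""

    def __init__(self, rules):
        self.rules = list(rules)
        self._last_fired = {}  # rule name -> timestamp (seconds) of last alert

    def evaluate(self, metric, value, now):
        """Return the rules that fire for this sample; `now` is in seconds."""
        fired = []
        for rule in self.rules:
            if rule.metric != metric:
                continue
            if not OPS[rule.operator](value, rule.threshold):
                continue
            last = self._last_fired.get(rule.name)
            if last is not None and now - last < rule.cooldown_minutes * 60:
                continue  # still cooling down, suppress the duplicate alert
            self._last_fired[rule.name] = now
            fired.append(rule)
        return fired
```

A caller would pass `time.time()` as `now` and hand each returned rule's `notify` list to whatever sends the Slack or PagerDuty message. Taking `now` as a parameter keeps the cooldown logic easy to test.
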
| Decision | Choice | Why |
|---|---|---|
| Primary DB | TimescaleDB | Voice data is relational (sessions → turns → events need JOINs) + native time-series |
| Ingestion | Redis Streams | Lightweight, consumer groups for horizontal scaling, at-least-once delivery |
| Metrics | Prometheus | Industry standard for SLOs, native histogram quantiles, Alertmanager |
| Ops Dashboard | Grafana | Universal, supports both Prometheus + TimescaleDB datasources |
| Analytics | Streamlit | Python-native, custom call replay + drift detection UX |
| Anomaly Detection | Z-score | Simple, no ML dependencies, catches latency spikes in real-time |
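
For readers unfamiliar with z-score detection, here is a minimal sketch of the idea over a rolling window of latency samples. The `ZScoreDetector` class, its window size, and the threshold of 3 standard deviations are assumptions chosen for illustration, not the values or code used in `workers/processor.py`.

```python
from collections import deque
from statistics import mean, pstdev


class ZScoreDetector:
    """Flags a sample that lies far from the rolling mean (hypothetical class)."""

    def __init__(self, window=50, threshold=3.0, min_samples=10):
        self.values = deque(maxlen=window)  # most recent samples only
        self.threshold = threshold          # z-score above which we flag
        self.min_samples = min_samples      # warm-up before flagging anything

    def observe(self, value):
        """Return True if `value` is anomalous relative to the window so far."""
        is_anomaly = False
        if len(self.values) >= self.min_samples:
            mu = mean(self.values)
            sigma = pstdev(self.values)
            if sigma > 0:  # a flat window gives no scale to compare against
                is_anomaly = abs(value - mu) / sigma > self.threshold
        # The sample is added either way, so a spike widens the window's
        # spread until it ages out; a production detector might exclude it.
        self.values.append(value)
        return is_anomaly
```

Feeding steady latencies around 500 ms and then a 1500 ms sample would flag only the spike. The approach needs nothing beyond the standard library, which matches the "no ML dependencies" rationale in the table.
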

voicemon/
├── core/
│   ├── models.py          # Pydantic v2 data models (4-layer framework)
│   ├── config.py          # Configuration with industry-standard thresholds
│   ├── collector.py       # Central telemetry hub
│   └── otel.py            # OpenTelemetry bridge with voice-aware spans
├── integrations/
│   ├── livekit.py         # LiveKit AgentSession event hooks
│   ├── pipecat.py         # Pipecat BaseObserver frame interception
│   └── vapi.py            # Vapi webhooks + REST client
├── exporters/
│   ├── base.py            # Exporter interface + ConsoleExporter
│   ├── redis.py           # Redis Streams with consumer groups
│   └── prometheus.py      # Prometheus histograms/counters/gauges
├── storage/
│   ├── schema.sql         # TimescaleDB schema with hypertables + aggregates
│   └── timescale.py       # Async TimescaleDB client
├── workers/
│   └── processor.py       # Redis consumer → aggregation → anomaly detection
├── alerts/
│   ├── engine.py          # YAML rule engine with cooldowns
│   ├── slack.py           # Slack Block Kit notifications
│   └── pagerduty.py       # PagerDuty Events API v2
└── dashboards/
    ├── grafana/           # Pre-built Grafana dashboard JSON
    └── streamlit_app.py   # Analytics dashboard with call replay

pip install -e ".[dev]"
pytest
ruff check .
mypy voicemon/

License: Apache-2.0