stt fallback #319
base: release
Walkthrough
The pull request introduces STT (speech-to-text) provider fallback capability to the voice agent system. It adds error detection in the pipeline, modularizes STT service creation with provider-specific factories, introduces new fallback-aware entrypoints, and extends configuration to enable and prioritize STT fallback providers.
Sequence Diagram
sequenceDiagram
participant User
participant Pipeline
participant STTService
participant ErrorDetector
participant Agent
participant FallbackHandler
User->>Pipeline: Audio frame
Pipeline->>STTService: Process STT request
STTService-->>Pipeline: ErrorFrame (STT error)
Pipeline->>ErrorDetector: Route ErrorFrame
rect rgb(220, 53, 69, 0.1)
Note over ErrorDetector: STT error detected
ErrorDetector->>FallbackHandler: trigger restart_callback()
end
FallbackHandler->>Agent: Switch to next STT provider
Agent->>STTService: Reinitialize with fallback provider
Pipeline->>STTService: Retry audio processing
STTService-->>Pipeline: STT result (success)
Pipeline->>User: Response
Estimated code review effort
🎯 4 (Complex) | ⏱️ ~45 minutes
The changes introduce significant architectural modifications across multiple files with heterogeneous purposes: orchestration logic refactoring, new processor component, provider factory pattern implementation, and configuration extensions. While individual components are moderately complex, the interconnected nature of fallback flow wiring, error detection mechanisms, and provider factory abstractions demands careful review of the integration points and control flow across the pipeline.
Pre-merge checks and finishing touches
✅ Passed checks (3 passed)
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
app/agents/voice/automatic/__init__.py (1)
641-656: ptt_vad_filter may be undefined in message handlers; guard its usage.
If DISABLE_VAD_FOR_PTT is False, handlers will raise NameError on PTT events. Initialize to None and check before use.
-    stt_mute_filter = None
-    tool_call_processor = None
+    stt_mute_filter = None
+    tool_call_processor = None
+    ptt_vad_filter = None
@@
-    elif message_type == "ptt-start":
+    elif message_type == "ptt-start" and ptt_vad_filter:
@@
-    elif message_type == "ptt-end":
+    elif message_type == "ptt-end" and ptt_vad_filter:
@@
-    elif message_type == "ptt-sync":
+    elif message_type == "ptt-sync" and ptt_vad_filter:
Optional: else log “PTT message ignored; VAD filter disabled”.
Also applies to: 776-804
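The guard described in this comment can be illustrated with a minimal, self-contained sketch. `PTTVADFilterStub` and `handle_message` are hypothetical stand-ins and not the project's real handler code; the point is only that the filter variable always exists and is checked before use.

```python
from typing import List, Optional


class PTTVADFilterStub:
    """Hypothetical stand-in for the project's PTTVADFilter."""

    def __init__(self) -> None:
        self.events: List[str] = []

    def handle(self, message_type: str) -> None:
        # Record the PTT event; the real filter would toggle VAD handling.
        self.events.append(message_type)


def handle_message(message_type: str, ptt_vad_filter: Optional[PTTVADFilterStub]) -> str:
    """Dispatch a client message, skipping PTT events when no filter exists."""
    if message_type in {"ptt-start", "ptt-end", "ptt-sync"}:
        if ptt_vad_filter is None:
            # Filter was never created: ignore the event instead of crashing.
            return "ignored"
        ptt_vad_filter.handle(message_type)
        return "handled"
    return "other"
```

Because `ptt_vad_filter` is always bound (to `None` when the feature is off), the handler can no longer hit a `NameError`; returning a distinct "ignored" result also gives a natural place for the optional log line.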
app/agents/voice/automatic/stt/__init__.py (1)
180-237: Verification confirms the review comment is accurate: callers need to handle init errors as fallback triggers.
Analysis of direct callers:
- websocket_bot.py:238: No error handling; exceptions fatal (separate module context)
- automatic/__init__.py:283 (fallback path): Catches only STTRestartException; ValueError from get_stt_service() falls through to generic handler and re-raises as fatal
- automatic/__init__.py:312 (legacy path): No error handling; exceptions fatal
The fallback path at line 283 should catch init errors (like ValueError: "All STT providers exhausted") and treat them as "try next provider" rather than fatal. Currently, such exceptions hit the generic handler and propagate up, stopping the fallback loop prematurely.
The review comment is correct: keep get_stt_service() raising, but fix callers to catch and continue on init errors.
🧹 Nitpick comments (5)
app/agents/voice/automatic/stt/__init__.py (1)
16-34: Provider selection should skip duplicates and unknowns; compute ordered unique list once.
Prevents retrying the same provider and tolerates misconfigured lists.
-def get_provider_for_attempt(fallback_attempt: int) -> Optional[str]:
+def get_provider_for_attempt(fallback_attempt: int) -> Optional[str]:
     """
     Get the STT provider to use for the given fallback attempt.
@@
-    if fallback_attempt == 0:
-        return config.STT_PROVIDER
-
-    fallback_index = fallback_attempt - 1
-    if fallback_index < len(config.STT_FALLBACK_PROVIDERS):
-        return config.STT_FALLBACK_PROVIDERS[fallback_index]
-    return None
+    # Build ordered unique provider list: primary + fallbacks
+    allowed = {"google", "assemblyai", "openai", "deepgram", "soniox"}
+    ordered = []
+    primary = (config.STT_PROVIDER or "google").strip().lower()
+    if primary in allowed:
+        ordered.append(primary)
+    for p in config.STT_FALLBACK_PROVIDERS:
+        q = p.strip().lower()
+        if q in allowed and q not in ordered:
+            ordered.append(q)
+    return ordered[fallback_attempt] if 0 <= fallback_attempt < len(ordered) else None
app/agents/voice/automatic/processors/stt_error_detector.py (3)
25-37: Fix restart_callback type: it’s awaited; annotate as Awaitable.
Prevents type-checker drift and accidental sync callbacks.
-from typing import Callable
+from typing import Callable, Awaitable
@@
-    def __init__(
-        self, restart_callback: Callable[[], None], name: str = "STTErrorDetector"
-    ):
+    def __init__(
+        self, restart_callback: Callable[[], Awaitable[None]], name: str = "STTErrorDetector"
+    ):
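A short sketch of why the annotation matters: the detector awaits the callback, so a synchronous callable would fail at the `await`. `ErrorDetectorSketch` is a hypothetical reduction of the processor, and the runtime check with `inspect.iscoroutinefunction` is an optional extra assumed here for illustration, not something the PR is known to contain.

```python
import asyncio
import inspect
from typing import Awaitable, Callable

# Type alias matching the annotation proposed in the review comment.
RestartCallback = Callable[[], Awaitable[None]]


class ErrorDetectorSketch:
    """Hypothetical, reduced version of an error detector holding an async callback."""

    def __init__(self, restart_callback: RestartCallback) -> None:
        # Optional runtime guard: reject plain sync functions early.
        # Note this also rejects sync callables that merely return an awaitable.
        if not inspect.iscoroutinefunction(restart_callback):
            raise TypeError("restart_callback must be an async function")
        self._restart_callback = restart_callback

    async def on_stt_error(self) -> None:
        # The callback is awaited, which is why Awaitable[None] is the right return type.
        await self._restart_callback()
```

With the stricter annotation a type checker flags a sync callback at the call site; the runtime guard is a belt-and-braces variant for code paths that are not type-checked.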
39-57: Avoid leaking secrets in logs; redact sensitive substrings before logging.
Errors often include API keys/tokens. Redact before logger.error.
-        logger.error(f"STT error detected: {frame.error}")
+        msg = str(frame.error)
+        sanitized = msg.replace("api_key=", "api_key=***").replace("apikey=", "apikey=***")
+        logger.error(f"STT error detected: {sanitized}")
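One caveat worth noting: a plain `str.replace("api_key=", "api_key=***")` inserts the mask but leaves the secret value in the string right after it. A regex that consumes the value is one possible alternative. The sketch below is an assumption-laden illustration; the key names it matches (`api_key`, `apikey`, `token`) and the delimiter set are guesses about what provider error messages look like.

```python
import re

# Matches "<key><sep><value>" where key is api_key/apikey/api-key/token (any case),
# sep is "=" or ":", and value runs until whitespace or a common delimiter.
_SECRET_PATTERN = re.compile(r"(?i)(api[_-]?key|token)(\s*[=:]\s*)([^\s&,;]+)")


def redact_secrets(message: str) -> str:
    """Replace the value part of key/value secrets with a fixed mask."""
    return _SECRET_PATTERN.sub(lambda m: f"{m.group(1)}{m.group(2)}***", message)
```

Usage would be `logger.error(f"STT error detected: {redact_secrets(str(frame.error))}")`. Secrets embedded in other shapes (for example bearer headers or URLs with credentials) would need additional patterns.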
72-107: Keyword detection is broad; expect false positives. Prefer source metadata check when available.
If ErrorFrame exposes origin/service name, check that first; keep keywords as fallback.
-        # Check if error message contains any STT-related keywords
+        # Prefer structured source metadata if available
+        source = getattr(error_frame, "source", None)
+        if isinstance(source, str) and any(s in source.lower() for s in stt_services):
+            return True
+        # Fallback: keyword scan
         for keyword in stt_services + stt_error_patterns:
app/agents/voice/automatic/__init__.py (1)
865-879: Clean up pipeline/transport on STTRestartException to avoid leaks before retry.
Cancel the task (and close transport if supported) before bubbling up.
 async def run_pipeline():
     try:
         await runner.run(task)
     except asyncio.CancelledError:
         logger.info("Main task cancelled. Exiting gracefully.")
     except STTRestartException:
         # Re-raise STT restart exception to trigger fallback
         logger.info(
             "STT restart exception detected - propagating to fallback handler"
         )
-        raise
+        try:
+            await task.cancel()
+        finally:
+            raise
     except Exception as e:
         logger.error(f"Pipeline runner error: {e}")
-        raise
+        try:
+            await task.cancel()
+        finally:
+            raise
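The cleanup-then-re-raise idea can also be written so that a failure during cleanup is logged rather than allowed to interfere with the original error. This is a sketch under assumptions: `run_with_cleanup`, `run`, and `cancel` are hypothetical names standing in for `runner.run(task)` and `task.cancel()`, and `STTRestartException` is redefined locally so the example is self-contained.

```python
import asyncio
import logging
from typing import Awaitable, Callable

logger = logging.getLogger(__name__)


class STTRestartException(Exception):
    """Local stand-in for the project's exception of the same name."""


async def run_with_cleanup(
    run: Callable[[], Awaitable[None]],
    cancel: Callable[[], Awaitable[None]],
) -> None:
    """Run the pipeline; on failure, attempt cleanup and re-raise the original error."""
    try:
        await run()
    except asyncio.CancelledError:
        logger.info("Main task cancelled. Exiting gracefully.")
    except Exception:
        try:
            await cancel()
        except Exception as cleanup_error:
            # Cleanup problems are logged but must not replace the original error.
            logger.warning("Cleanup failed: %s", cleanup_error)
        # Bare raise re-raises the exception caught by the outer handler.
        raise
```

The fallback handler still sees `STTRestartException` (or whatever error occurred), while the task gets a chance to release its resources first.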
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
app/agents/voice/automatic/__init__.py (11 hunks)
app/agents/voice/automatic/processors/stt_error_detector.py (1 hunks)
app/agents/voice/automatic/stt/__init__.py (2 hunks)
app/core/config.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
app/agents/voice/automatic/stt/__init__.py (1)
app/agents/voice/breeze_buddy/workflows/order_confirmation/utils.py (1)
get_stt_service(13-65)
app/agents/voice/automatic/__init__.py (8)
app/agents/voice/automatic/processors/stt_error_detector.py (1)
STTErrorDetector(17-107)
app/agents/voice/automatic/prompts/__init__.py (1)
get_system_prompt(26-59)
app/agents/voice/automatic/stt/__init__.py (1)
get_stt_service(180-237)
app/agents/voice/automatic/tools/__init__.py (1)
initialize_tools(19-135)
app/agents/voice/automatic/tts/__init__.py (1)
get_tts_service(15-88)
app/agents/voice/automatic/types/decoders.py (3)
decode_voice_name(14-21)
decode_tts_provider(4-11)
decode_mode(24-31)
app/agents/voice/automatic/features/llm_wrapper.py (1)
LLMServiceWrapper(14-142)
app/agents/voice/automatic/utils/session_context.py (2)
set_current_session_id(31-35)
create_session_context(22-24)
🪛 Ruff (0.14.0)
app/agents/voice/automatic/stt/__init__.py
39-39: Avoid specifying long messages outside the exception class
(TRY003)
51-51: Avoid specifying long messages outside the exception class
(TRY003)
68-68: Avoid specifying long messages outside the exception class
(TRY003)
105-105: Avoid specifying long messages outside the exception class
(TRY003)
177-177: Avoid specifying long messages outside the exception class
(TRY003)
217-217: Avoid specifying long messages outside the exception class
(TRY003)
222-222: Consider moving this statement to an else block
(TRY300)
234-234: Consider moving this statement to an else block
(TRY300)
app/agents/voice/automatic/__init__.py
293-293: Consider moving this statement to an else block
(TRY300)
306-306: Create your own exception
(TRY002)
306-306: Avoid specifying long messages outside the exception class
(TRY003)
386-386: Consider moving this statement to an else block
(TRY300)
387-387: Do not catch blind exception: Exception
(BLE001)
424-424: Unused function argument: tts_provider
(ARG001)
515-515: Avoid specifying long messages outside the exception class
(TRY003)
534-534: Unused function argument: session_context
(ARG001)
stt = get_stt_service(
    voice_name=decode_voice_name(args.voice_name).value,
    fallback_attempt=fallback_attempt,
)
Fallback doesn’t advance on provider init failures; continue to next provider.
If get_stt_service fails (missing key, bad config), we should try the next provider instead of aborting the whole flow.
- # Get STT service for current fallback attempt
- stt = get_stt_service(
- voice_name=decode_voice_name(args.voice_name).value,
- fallback_attempt=fallback_attempt,
- )
+ # Get STT service for current fallback attempt
+ try:
+ stt = get_stt_service(
+ voice_name=decode_voice_name(args.voice_name).value,
+ fallback_attempt=fallback_attempt,
+ )
+ except Exception as e:
+ logger.error(
+ f"STT fallback: Failed to initialize provider for attempt #{fallback_attempt}: {e}"
+ )
+ continue # Try next provider
@@
- except Exception as e:
- logger.error(f"STT fallback: Non-STT error occurred: {e}")
- raise
+ except Exception as e:
+ logger.error(f"STT fallback: Non-STT error occurred: {e}")
+ raise
Also applies to: 295-303
🤖 Prompt for AI Agents
In app/agents/voice/automatic/__init__.py around lines 283-286 (and similarly
295-303) the code calls get_stt_service directly and will abort the fallback
loop if provider initialization throws (missing key/bad config); wrap each
get_stt_service call in a try/except that catches initialization errors, logs
the provider name and error, and then continues the provider loop to try the
next provider (do not raise or exit); ensure any per-attempt counters
(fallback_attempt) are updated as before and that the exception path cleans
up/returns to the loop rather than terminating the whole flow.
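The loop behaviour requested above (advance on both init failures and runtime STT failures) might look roughly like the following sketch. All names are hypothetical: `run_with_stt_fallback`, `create_stt`, and `run_session` are illustrative, the real code selects providers through `get_stt_service(..., fallback_attempt=...)`, and `STTRestartException` is redefined locally to keep the example self-contained.

```python
import asyncio
from typing import Awaitable, Callable, List, Optional


class STTRestartException(Exception):
    """Local stand-in for the project's exception of the same name."""


async def run_with_stt_fallback(
    providers: List[str],
    create_stt: Callable[[str], object],
    run_session: Callable[[object], Awaitable[None]],
) -> Optional[str]:
    """Try providers in order; return the one that completed a session, else None."""
    for attempt, provider in enumerate(providers):
        try:
            stt = create_stt(provider)
        except Exception as exc:
            # Init failure (missing key, bad config): log and try the next provider.
            print(f"attempt #{attempt}: init failed for {provider}: {exc}")
            continue
        try:
            await run_session(stt)
        except STTRestartException:
            # Runtime STT failure reported by the pipeline: advance to the next provider.
            print(f"attempt #{attempt}: runtime failure for {provider}")
            continue
        return provider
    return None
```

Errors other than `STTRestartException` raised by `run_session` still propagate, which matches the existing "Non-STT error occurred" branch that re-raises.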
# Create session context for tools and services
session_context = create_session_context(args)

# Decode TTS configuration
Wrong argument passed to create_session_context.
It expects a session_id string, not the entire args object.
- session_context = create_session_context(args)
+ session_context = create_session_context(args.session_id)
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
- # Create session context for tools and services
- session_context = create_session_context(args)
- # Decode TTS configuration
+ # Create session context for tools and services
+ session_context = create_session_context(args.session_id)
+ # Decode TTS configuration
🤖 Prompt for AI Agents
In app/agents/voice/automatic/__init__.py around lines 500 to 503,
create_session_context is being called with the entire args object but it
expects a session_id string; change the call to pass the session identifier
(e.g. args.session_id or the appropriate session id variable) instead of args so
the function receives the expected string; ensure the variable you pass is a
string and exists on args (add a fallback or validation if needed).
# Add STT Error Detector for runtime fallback support
if enable_stt_fallback and restart_callback:
    stt_error_detector = STTErrorDetector(restart_callback, "STTErrorDetector")
    pipeline_components.append(stt_error_detector)
    logger.info("STT Error Detector enabled for runtime fallback")
Detector placement bug: STTErrorDetector must be after STT to see STT-originated ErrorFrames.
Placed before STT, it will miss STT errors; fallback never triggers. Move it right after STT.
- # Add STT Error Detector for runtime fallback support
- if enable_stt_fallback and restart_callback:
- stt_error_detector = STTErrorDetector(restart_callback, "STTErrorDetector")
- pipeline_components.append(stt_error_detector)
- logger.info("STT Error Detector enabled for runtime fallback")
-
- # Add PTT VAD filter only if it's enabled
- if config.DISABLE_VAD_FOR_PTT:
- ptt_vad_filter = PTTVADFilter("PTTVADFilter")
- pipeline_components.append(ptt_vad_filter) # Filter VAD frames after STT
-
- pipeline_components.append(stt)
+ # Add PTT VAD filter only if it's enabled
+ if config.DISABLE_VAD_FOR_PTT:
+ ptt_vad_filter = PTTVADFilter("PTTVADFilter")
+ pipeline_components.append(ptt_vad_filter) # Filter VAD frames before STT
+
+ # STT service
+ pipeline_components.append(stt)
+
+ # Add STT Error Detector for runtime fallback support (must be after STT)
+ if enable_stt_fallback and restart_callback:
+ stt_error_detector = STTErrorDetector(restart_callback, "STTErrorDetector")
+ pipeline_components.append(stt_error_detector)
+ logger.info("STT Error Detector enabled for runtime fallback")
Also applies to: 657-657
🤖 Prompt for AI Agents
In app/agents/voice/automatic/__init__.py around lines 646-651 (and also apply
the same change at line 657), the STTErrorDetector is being appended before the
STT component so it cannot observe STT-originated ErrorFrames; move the
STTErrorDetector instantiation and pipeline_components.append call so they occur
immediately after the STT component is appended to pipeline_components (ensure
restart_callback check remains), and remove the earlier placement to guarantee
the detector sees STT output and triggers runtime fallback.
# --- STT Fallback Configuration ---
ENABLE_STT_FALLBACK = (
    os.environ.get("ENABLE_STT_FALLBACK", "false").lower() == "true"
)  # Enable automatic fallback to alternative STT providers on failure
STT_FALLBACK_PROVIDERS = [
    provider.strip().lower()
    for provider in os.environ.get(
        "STT_FALLBACK_PROVIDERS", "deepgram,openai,assemblyai,google"
    ).split(",")
    if provider.strip()
]  # Comma-separated list of fallback providers in priority order
🛠️ Refactor suggestion | 🟠 Major
Harden fallback config: normalize, dedupe, validate providers (and include soniox by default).
Prevents duplicate/unknown providers and whitespace bugs; ensures primary isn’t retried in fallback. Also strips STT_PROVIDER.
Apply:
-# --- STT Fallback Configuration ---
-ENABLE_STT_FALLBACK = (
- os.environ.get("ENABLE_STT_FALLBACK", "false").lower() == "true"
-) # Enable automatic fallback to alternative STT providers on failure
-STT_FALLBACK_PROVIDERS = [
- provider.strip().lower()
- for provider in os.environ.get(
- "STT_FALLBACK_PROVIDERS", "deepgram,openai,assemblyai,google"
- ).split(",")
- if provider.strip()
-] # Comma-separated list of fallback providers in priority order
+# --- STT Fallback Configuration ---
+ALLOWED_STT_PROVIDERS = {"google", "assemblyai", "openai", "deepgram", "soniox"}
+ENABLE_STT_FALLBACK = os.environ.get("ENABLE_STT_FALLBACK", "false").lower() == "true"
+
+# Normalize primary
-STT_PROVIDER = os.environ.get(
- "STT_PROVIDER", "google"
-).lower() # "google", "assemblyai", "openai", "deepgram", or "soniox"
+STT_PROVIDER = os.environ.get("STT_PROVIDER", "google").strip().lower()
+
+_fallback_raw = os.environ.get(
+ "STT_FALLBACK_PROVIDERS", "deepgram,openai,assemblyai,google,soniox"
+)
+_fallback = [p.strip().lower() for p in _fallback_raw.split(",") if p.strip()]
+# keep order, drop dups/unknowns/primary
+seen = set()
+STT_FALLBACK_PROVIDERS = [
+ p for p in _fallback
+ if p in ALLOWED_STT_PROVIDERS and p != STT_PROVIDER and (p not in seen and not seen.add(p))
+]
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In app/core/config.py around lines 147 to 159, the STT fallback list needs to be
hardened: normalize (strip and lower), dedupe while preserving priority order,
validate entries against an allowed set (e.g.,
{"deepgram","openai","assemblyai","google","soniox"}) and ensure "soniox" is
included by default if not present; also remove the current primary STT_PROVIDER
(strip+lower) from the fallback list so we never retry the primary provider.
Implement this by reading STT_FALLBACK_PROVIDERS, splitting on commas, mapping
strip().lower(), filtering out empty strings and the normalized primary
provider, building a deduplicated ordered list (preserve first occurrence),
validating each against the allowed set and ignoring unknowns (optionally log or
warn), and finally ensuring "soniox" appears at the end or in default position
if absent.
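The normalization steps listed above can be expressed as a small pure function, which is easier to unit test than module-level config code. This is a sketch: `normalize_fallback_providers` is a hypothetical helper name, the allowed set is copied from the suggested diff, and the "force soniox into the list" step from the prompt is deliberately left out because the diff itself only adds soniox to the default string.

```python
from typing import AbstractSet, List

# Allowed provider names, taken from the suggested diff in this review.
ALLOWED_STT_PROVIDERS = frozenset({"google", "assemblyai", "openai", "deepgram", "soniox"})


def normalize_fallback_providers(
    raw: str,
    primary: str,
    allowed: AbstractSet[str] = ALLOWED_STT_PROVIDERS,
) -> List[str]:
    """Return fallback providers in priority order, without blanks, unknowns,
    duplicates, or the primary provider."""
    primary = primary.strip().lower()
    ordered: List[str] = []
    for item in raw.split(","):
        name = item.strip().lower()
        if not name or name == primary or name not in allowed or name in ordered:
            continue
        ordered.append(name)
    return ordered
```

In `config.py` this could be called as `normalize_fallback_providers(os.environ.get("STT_FALLBACK_PROVIDERS", "deepgram,openai,assemblyai,google,soniox"), STT_PROVIDER)`; logging a warning for ignored unknown names would be a reasonable addition.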
stt fallback for providers integrated
Summary by CodeRabbit
New Features
Chores