stt fallback #319

Devansh-1218 · 2025-10-17T05:36:35Z

Integrated STT fallback across providers.

Summary by CodeRabbit

New Features
- Added automatic fallback mechanism for voice transcription services. When the primary speech-to-text provider fails, the system now automatically switches to alternative providers for improved reliability and uptime.
Chores
- Enhanced voice agent error detection and recovery infrastructure.

coderabbitai · 2025-10-17T05:36:45Z

Walkthrough

The pull request introduces STT (speech-to-text) provider fallback capability to the voice agent system. It adds error detection in the pipeline, modularizes STT service creation with provider-specific factories, introduces new fallback-aware entrypoints, and extends configuration to enable and prioritize STT fallback providers.

Changes

Cohort / File(s)	Summary
Core Fallback Orchestration `app/agents/voice/automatic/__init__.py`	Introduces `STTRestartException`, adds `run_with_stt_fallback()` and `run_with_single_stt()` entrypoints, refactors `run_normal_mode()` to route through fallback flow when enabled, introduces `run_pipeline_with_stt()` and `build_and_run_pipeline()` to centralize pipeline construction with optional STT error detection, and adds helper setup functions (`setup_vad_analyzer()`, `setup_smart_turn_analyzer()`, `setup_audio_filter()`, `setup_transport_proxy()`, `create_llm_service()`, `setup_tools()`) to modularize component assembly.
STT Error Detection `app/agents/voice/automatic/processors/stt_error_detector.py`	Introduces `STTErrorDetector` processor class that intercepts pipeline frames, detects STT-related errors via keyword matching, triggers provided `restart_callback` to fallback to another provider, and halts error propagation downstream.
Provider Factory & Fallback Logic `app/agents/voice/automatic/stt/__init__.py`	Adds provider-agnostic `get_provider_for_attempt()` helper to map fallback attempts to providers, introduces dedicated factory functions (`create_assemblyai_service()`, `create_openai_service()`, `create_deepgram_service()`, `create_soniox_service()`, `create_google_service()`), adds `create_provider_service()` centralizer, and extends `get_stt_service()` with `fallback_attempt` parameter to support both fallback and legacy single-provider modes.
Fallback Configuration `app/core/config.py`	Adds `ENABLE_STT_FALLBACK` (boolean) and `STT_FALLBACK_PROVIDERS` (comma-separated list defaulting to deepgram, openai, assemblyai, google) to enable and prioritize STT fallback behavior.

Sequence Diagram

sequenceDiagram
    participant User
    participant Pipeline
    participant STTService
    participant ErrorDetector
    participant Agent
    participant FallbackHandler

    User->>Pipeline: Audio frame
    Pipeline->>STTService: Process STT request
    STTService-->>Pipeline: ErrorFrame (STT error)
    Pipeline->>ErrorDetector: Route ErrorFrame
    
    rect rgb(220, 53, 69, 0.1)
    Note over ErrorDetector: STT error detected
    ErrorDetector->>FallbackHandler: trigger restart_callback()
    end
    
    FallbackHandler->>Agent: Switch to next STT provider
    Agent->>STTService: Reinitialize with fallback provider
    Pipeline->>STTService: Retry audio processing
    STTService-->>Pipeline: STT result (success)
    Pipeline->>User: Response
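
The flow in the diagram can be sketched as a small retry loop. This is a minimal illustration, not the PR's implementation: `STTRestartException`, `run_pipeline_with_stt` and `run_with_stt_fallback` reuse names from the walkthrough, but the bodies, the `PROVIDERS` list, and the `failing` parameter are assumptions made so the example runs on its own.

```python
import asyncio


class STTRestartException(Exception):
    """Raised when the running pipeline asks for a different STT provider."""


# Hypothetical provider order: primary first, then fallbacks.
PROVIDERS = ["google", "deepgram", "openai"]


async def run_pipeline_with_stt(provider: str, failing: set) -> str:
    """Stand-in for the real pipeline: fails for providers listed in `failing`."""
    await asyncio.sleep(0)  # yield control once, as a real pipeline would
    if provider in failing:
        raise STTRestartException(f"{provider} reported an STT error")
    return provider


async def run_with_stt_fallback(failing: set) -> str:
    """Try each provider in order until one runs without requesting a restart."""
    for attempt, provider in enumerate(PROVIDERS):
        try:
            return await run_pipeline_with_stt(provider, failing)
        except STTRestartException as exc:
            print(f"attempt {attempt}: {exc}; switching provider")
    raise RuntimeError("All STT providers exhausted")
```

With `failing={"google"}`, `asyncio.run(run_with_stt_fallback({"google"}))` returns the second provider in the list. Errors other than `STTRestartException` are deliberately not caught, so they still surface as fatal.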

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

The changes introduce significant architectural modifications across multiple files with heterogeneous purposes: orchestration logic refactoring, new processor component, provider factory pattern implementation, and configuration extensions. While individual components are moderately complex, the interconnected nature of fallback flow wiring, error detection mechanisms, and provider factory abstractions demands careful review of the integration points and control flow across the pipeline.

Possibly related PRs

feat: soniox stt integration #234: Directly overlaps with STT provider factory modifications and configuration additions, including Soniox provider support in both PRs.
feat: Add deepgram stt support #225: Shares STT provider infrastructure refactoring and configuration changes, with prior Deepgram support that aligns with fallback provider setup.

Poem

🐰 From one provider, now we leap and bound,
When whispers fail, we try the next sound,
Error detection, fallback dance so swift,
Our voice pipeline gains a graceful gift! 🎤✨

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check	✅ Passed	The pull request title "stt fallback" directly corresponds to the primary objective of the changeset, which is implementing an STT (speech-to-text) fallback mechanism across providers. The title accurately identifies the core feature being added across multiple files, including new exception handling, error detection components, provider-agnostic selection logic, and configuration options. While the title is terse, it is specific and clear enough that a developer scanning the commit history would immediately understand this PR introduces STT fallback functionality. The title avoids vague language and provides a concise summary of the main change.
Docstring Coverage	✅ Passed	Docstring coverage is 95.83% which is sufficient. The required threshold is 80.00%.



coderabbitai

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

app/agents/voice/automatic/__init__.py (1)
641-656: ptt_vad_filter may be undefined in message handlers; guard its usage.

If DISABLE_VAD_FOR_PTT is False, `ptt_vad_filter` is never assigned, so the handlers will raise a NameError on PTT events. Initialize it to None and check before use.
-    stt_mute_filter = None
-    tool_call_processor = None
+    stt_mute_filter = None
+    tool_call_processor = None
+    ptt_vad_filter = None
@@
-                elif message_type == "ptt-start":
+                elif message_type == "ptt-start" and ptt_vad_filter:
@@
-                elif message_type == "ptt-end":
+                elif message_type == "ptt-end" and ptt_vad_filter:
@@
-                elif message_type == "ptt-sync":
+                elif message_type == "ptt-sync" and ptt_vad_filter:
Optional: else log “PTT message ignored; VAD filter disabled”.

Also applies to: 776-804
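
A minimal sketch of the guard, assuming a simplified dispatcher: `PTTVADFilter` here is a reduced stand-in with a made-up `handle` method, and `handle_message` is a hypothetical helper, not the handler in the PR.

```python
from typing import Optional


class PTTVADFilter:
    """Hypothetical stand-in that only records the last PTT event."""

    def __init__(self) -> None:
        self.last_event: Optional[str] = None

    def handle(self, event: str) -> None:
        self.last_event = event


def handle_message(message_type: str, ptt_vad_filter: Optional[PTTVADFilter]) -> str:
    """Dispatch a client message, ignoring PTT events when the filter is absent."""
    if message_type in {"ptt-start", "ptt-end", "ptt-sync"}:
        if ptt_vad_filter is None:
            # Filter disabled: skip (and ideally log) instead of raising NameError.
            return "ignored"
        ptt_vad_filter.handle(message_type)
        return "handled"
    return "other"
```

Handling the disabled case in its own branch, rather than adding `and ptt_vad_filter` to each `elif`, keeps PTT messages from falling through to unrelated branches further down the chain.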
app/agents/voice/automatic/stt/__init__.py (1)

180-237: Verification confirms the review comment is accurate—callers need to handle init errors as fallback triggers.

Analysis of direct callers:

websocket_bot.py:238: No error handling; exceptions fatal (separate module context)

automatic/__init__.py:283 (fallback path): Catches only STTRestartException; ValueError from get_stt_service() falls through to generic handler and re-raises as fatal

automatic/__init__.py:312 (legacy path): No error handling; exceptions fatal

The fallback path at line 283 should catch init errors (like ValueError: "All STT providers exhausted") and treat them as "try next provider" rather than fatal. Currently, such exceptions hit the generic handler and propagate up, stopping the fallback loop prematurely.

The review comment is correct: keep get_stt_service() raising, but fix callers to catch and continue on init errors.

🧹 Nitpick comments (5)

app/agents/voice/automatic/stt/__init__.py (1)

16-34: Provider selection should skip duplicates and unknowns; compute ordered unique list once.

Prevents retrying the same provider and tolerates misconfigured lists.

-def get_provider_for_attempt(fallback_attempt: int) -> Optional[str]:
+def get_provider_for_attempt(fallback_attempt: int) -> Optional[str]:
     """
     Get the STT provider to use for the given fallback attempt.
@@
-    if fallback_attempt == 0:
-        return config.STT_PROVIDER
-
-    fallback_index = fallback_attempt - 1
-    if fallback_index < len(config.STT_FALLBACK_PROVIDERS):
-        return config.STT_FALLBACK_PROVIDERS[fallback_index]
-    return None
+    # Build ordered unique provider list: primary + fallbacks
+    allowed = {"google", "assemblyai", "openai", "deepgram", "soniox"}
+    ordered = []
+    primary = (config.STT_PROVIDER or "google").strip().lower()
+    if primary in allowed:
+        ordered.append(primary)
+    for p in config.STT_FALLBACK_PROVIDERS:
+        q = p.strip().lower()
+        if q in allowed and q not in ordered:
+            ordered.append(q)
+    return ordered[fallback_attempt] if 0 <= fallback_attempt < len(ordered) else None
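
One way to "compute the list once" is to build the provider chain at import time and only index into it per attempt. The sketch below assumes the settings are passed in explicitly; `build_provider_chain`, `PROVIDER_CHAIN`, and the sample values are hypothetical and stand in for `config.STT_PROVIDER` and `config.STT_FALLBACK_PROVIDERS`.

```python
from typing import List, Optional, Sequence, Tuple

ALLOWED = frozenset({"google", "assemblyai", "openai", "deepgram", "soniox"})


def build_provider_chain(primary: str, fallbacks: Sequence[str]) -> Tuple[str, ...]:
    """Return primary + fallbacks, normalized, without duplicates or unknown names."""
    chain: List[str] = []
    for name in (primary, *fallbacks):
        key = name.strip().lower()
        if key in ALLOWED and key not in chain:
            chain.append(key)
    return tuple(chain)


# Built once from hypothetical settings; "google" repeats and "whisper" is unknown.
PROVIDER_CHAIN = build_provider_chain(
    "google", ["deepgram", "google", "whisper", "openai"]
)


def get_provider_for_attempt(fallback_attempt: int) -> Optional[str]:
    """Map attempt 0, 1, 2, ... to a provider, or None once the chain is exhausted."""
    if 0 <= fallback_attempt < len(PROVIDER_CHAIN):
        return PROVIDER_CHAIN[fallback_attempt]
    return None
```

Note that if the primary provider is unknown, attempt 0 maps to the first valid fallback instead; whether that is desirable or should be a startup error is a choice for the author.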

app/agents/voice/automatic/processors/stt_error_detector.py (3)

25-37: Fix restart_callback type: it’s awaited; annotate as Awaitable.

Prevents type-checker drift and accidental sync callbacks.

-from typing import Callable
+from typing import Callable, Awaitable
@@
-    def __init__(
-        self, restart_callback: Callable[[], None], name: str = "STTErrorDetector"
-    ):
+    def __init__(
+        self, restart_callback: Callable[[], Awaitable[None]], name: str = "STTErrorDetector"
+    ):
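
The annotation only helps under a type checker. If a runtime guard is also wanted, a sketch like the following rejects synchronous callbacks at construction time. `Detector` is a reduced, hypothetical stand-in for `STTErrorDetector`, and the `inspect.iscoroutinefunction` check is stricter than the annotation (it would also reject a plain function that returns an awaitable).

```python
import asyncio
import inspect
from typing import Awaitable, Callable

RestartCallback = Callable[[], Awaitable[None]]


class Detector:
    """Hypothetical, reduced stand-in for STTErrorDetector."""

    def __init__(self, restart_callback: RestartCallback) -> None:
        # Fail early instead of hitting "object NoneType can't be awaited" later.
        if not inspect.iscoroutinefunction(restart_callback):
            raise TypeError("restart_callback must be an async function")
        self._restart_callback = restart_callback

    async def on_stt_error(self) -> None:
        await self._restart_callback()
```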

39-57: Avoid leaking secrets in logs; redact sensitive substrings before logging.

Errors often include API keys/tokens. Redact before logger.error.

-            logger.error(f"STT error detected: {frame.error}")
+            msg = str(frame.error)
+            # Mask the value after the label, not just the label (requires `import re`)
+            sanitized = re.sub(r"(?i)\b(api_?key=)[^\s&]+", r"\1***", msg)
+            logger.error(f"STT error detected: {sanitized}")
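
For redaction to be effective, the value that follows the key label has to be masked, not only the label itself. A regex-based sketch, assuming secrets appear as `api_key=...`, `apikey=...`, `token=...` or `Authorization: Bearer ...` (the pattern and the `redact` helper are illustrative and will not cover every provider's error format):

```python
import re

# Label, separator, optional "Bearer " prefix, then the secret value itself.
_SECRET_RE = re.compile(
    r"(?i)\b(api[_-]?key|token|authorization)\b(\s*[=:]\s*)(Bearer\s+)?([^\s&,;]+)"
)


def redact(message: str) -> str:
    """Replace secret values with *** while keeping the label for debugging."""
    return _SECRET_RE.sub(lambda m: f"{m.group(1)}{m.group(2)}***", message)
```

For example, `redact("401 for https://x/?api_key=abc123&lang=en")` keeps the URL shape but drops `abc123`.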

72-107: Keyword detection is broad; expect false positives. Prefer source metadata check when available.

If ErrorFrame exposes origin/service name, check that first; keep keywords as fallback.

-        # Check if error message contains any STT-related keywords
+        # Prefer structured source metadata if available
+        source = getattr(error_frame, "source", None)
+        if isinstance(source, str) and any(s in source.lower() for s in stt_services):
+            return True
+        # Fallback: keyword scan
         for keyword in stt_services + stt_error_patterns:
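
A self-contained sketch of the source-first check. The `ErrorFrame` dataclass, its `source` attribute, and both keyword tuples are assumptions for illustration; the PR's actual lists and the real frame type are not shown in this review. The sketch also goes one step further than the suggestion: when a source is known, it decides on the source alone and skips the keyword scan.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical keyword lists; the PR's real lists may differ.
STT_SERVICES = ("deepgram", "assemblyai", "soniox")
STT_ERROR_PATTERNS = ("transcription", "speech-to-text", "stt")


@dataclass
class ErrorFrame:
    """Reduced stand-in for the pipeline's error frame."""

    error: str
    source: Optional[str] = None


def is_stt_error(frame: ErrorFrame) -> bool:
    """Prefer structured source metadata; fall back to keyword scanning."""
    source = (frame.source or "").lower()
    if source:
        # A known origin is more reliable than free-text matching.
        return any(name in source for name in STT_SERVICES)
    text = frame.error.lower()
    return any(keyword in text for keyword in STT_SERVICES + STT_ERROR_PATTERNS)
```

With this ordering, an error raised by a non-STT service that merely mentions a provider name in its message no longer triggers a restart.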

app/agents/voice/automatic/__init__.py (1)

865-879: Clean up pipeline/transport on STTRestartException to avoid leaks before retry.

Cancel the task (and close transport if supported) before bubbling up.

     async def run_pipeline():
         try:
             await runner.run(task)
         except asyncio.CancelledError:
             logger.info("Main task cancelled. Exiting gracefully.")
         except STTRestartException:
             # Re-raise STT restart exception to trigger fallback
             logger.info(
                 "STT restart exception detected - propagating to fallback handler"
             )
-            raise
+            try:
+                await task.cancel()
+            finally:
+                raise
         except Exception as e:
             logger.error(f"Pipeline runner error: {e}")
-            raise
+            try:
+                await task.cancel()
+            finally:
+                raise

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1dfeee1 and 980dad8.

📒 Files selected for processing (4)

app/agents/voice/automatic/__init__.py (11 hunks)
app/agents/voice/automatic/processors/stt_error_detector.py (1 hunks)
app/agents/voice/automatic/stt/__init__.py (2 hunks)
app/core/config.py (1 hunks)

🧰 Additional context used

🧬 Code graph analysis (2)

app/agents/voice/automatic/stt/__init__.py (1)

app/agents/voice/breeze_buddy/workflows/order_confirmation/utils.py (1)

get_stt_service (13-65)

app/agents/voice/automatic/__init__.py (8)

app/agents/voice/automatic/processors/stt_error_detector.py (1)

STTErrorDetector (17-107)

app/agents/voice/automatic/prompts/__init__.py (1)

get_system_prompt (26-59)

app/agents/voice/automatic/stt/__init__.py (1)

get_stt_service (180-237)

app/agents/voice/automatic/tools/__init__.py (1)

initialize_tools (19-135)

app/agents/voice/automatic/tts/__init__.py (1)

get_tts_service (15-88)

app/agents/voice/automatic/types/decoders.py (3)

decode_voice_name (14-21)

decode_tts_provider (4-11)

decode_mode (24-31)

app/agents/voice/automatic/features/llm_wrapper.py (1)

LLMServiceWrapper (14-142)

app/agents/voice/automatic/utils/session_context.py (2)

set_current_session_id (31-35)

create_session_context (22-24)

🪛 Ruff (0.14.0)

app/agents/voice/automatic/stt/__init__.py

39-39: Avoid specifying long messages outside the exception class

(TRY003)

51-51: Avoid specifying long messages outside the exception class

(TRY003)

68-68: Avoid specifying long messages outside the exception class

(TRY003)

105-105: Avoid specifying long messages outside the exception class

(TRY003)

177-177: Avoid specifying long messages outside the exception class

(TRY003)

217-217: Avoid specifying long messages outside the exception class

(TRY003)

222-222: Consider moving this statement to an else block

(TRY300)

234-234: Consider moving this statement to an else block

(TRY300)

app/agents/voice/automatic/__init__.py

293-293: Consider moving this statement to an else block

(TRY300)

306-306: Create your own exception

(TRY002)

306-306: Avoid specifying long messages outside the exception class

(TRY003)

386-386: Consider moving this statement to an else block

(TRY300)

387-387: Do not catch blind exception: Exception

(BLE001)

424-424: Unused function argument: tts_provider

(ARG001)

515-515: Avoid specifying long messages outside the exception class

(TRY003)

534-534: Unused function argument: session_context

(ARG001)

coderabbitai · 2025-10-17T05:44:02Z

app/agents/voice/automatic/__init__.py

+            stt = get_stt_service(
+                voice_name=decode_voice_name(args.voice_name).value,
+                fallback_attempt=fallback_attempt,
+            )


⚠️ Potential issue | 🔴 Critical

Fallback doesn’t advance on provider init failures; continue to next provider.

If get_stt_service fails (missing key, bad config), we should try the next provider instead of aborting the whole flow.

-            # Get STT service for current fallback attempt
-            stt = get_stt_service(
-                voice_name=decode_voice_name(args.voice_name).value,
-                fallback_attempt=fallback_attempt,
-            )
+            # Get STT service for current fallback attempt
+            try:
+                stt = get_stt_service(
+                    voice_name=decode_voice_name(args.voice_name).value,
+                    fallback_attempt=fallback_attempt,
+                )
+            except Exception as e:
+                logger.error(
+                    f"STT fallback: Failed to initialize provider for attempt #{fallback_attempt}: {e}"
+                )
+                continue  # Try next provider
@@
-        except Exception as e:
-            logger.error(f"STT fallback: Non-STT error occurred: {e}")
-            raise
+        except Exception as e:
+            logger.error(f"STT fallback: Non-STT error occurred: {e}")
+            raise

Also applies to: 295-303

🤖 Prompt for AI Agents

In app/agents/voice/automatic/__init__.py around lines 283-286 (and similarly 295-303) the code calls get_stt_service directly and will abort the fallback loop if provider initialization throws (missing key/bad config); wrap each get_stt_service call in a try/except that catches initialization errors, logs the provider name and error, and then continues the provider loop to try the next provider (do not raise or exit); ensure any per-attempt counters (fallback_attempt) are updated as before and that the exception path cleans up/returns to the loop rather than terminating the whole flow.

coderabbitai · 2025-10-17T05:44:02Z

app/agents/voice/automatic/__init__.py

+    # Create session context for tools and services
+    session_context = create_session_context(args)
+
+    # Decode TTS configuration


⚠️ Potential issue | 🟠 Major

Wrong argument passed to create_session_context.

It expects a session_id string, not the entire args object.

-    session_context = create_session_context(args)
+    session_context = create_session_context(args.session_id)


🤖 Prompt for AI Agents

In app/agents/voice/automatic/__init__.py around lines 500 to 503, create_session_context is being called with the entire args object but it expects a session_id string; change the call to pass the session identifier (e.g. args.session_id or the appropriate session id variable) instead of args so the function receives the expected string; ensure the variable you pass is a string and exists on args (add a fallback or validation if needed).

coderabbitai · 2025-10-17T05:44:03Z

app/agents/voice/automatic/__init__.py

+    # Add STT Error Detector for runtime fallback support
+    if enable_stt_fallback and restart_callback:
+        stt_error_detector = STTErrorDetector(restart_callback, "STTErrorDetector")
+        pipeline_components.append(stt_error_detector)
+        logger.info("STT Error Detector enabled for runtime fallback")
+


⚠️ Potential issue | 🔴 Critical

Detector placement bug: STTErrorDetector must be after STT to see STT-originated ErrorFrames.

Placed before STT, it will miss STT errors; fallback never triggers. Move it right after STT.

-    # Add STT Error Detector for runtime fallback support
-    if enable_stt_fallback and restart_callback:
-        stt_error_detector = STTErrorDetector(restart_callback, "STTErrorDetector")
-        pipeline_components.append(stt_error_detector)
-        logger.info("STT Error Detector enabled for runtime fallback")
-
-    # Add PTT VAD filter only if it's enabled
-    if config.DISABLE_VAD_FOR_PTT:
-        ptt_vad_filter = PTTVADFilter("PTTVADFilter")
-        pipeline_components.append(ptt_vad_filter)  # Filter VAD frames after STT
-
-    pipeline_components.append(stt)
+    # Add PTT VAD filter only if it's enabled
+    if config.DISABLE_VAD_FOR_PTT:
+        ptt_vad_filter = PTTVADFilter("PTTVADFilter")
+        pipeline_components.append(ptt_vad_filter)  # Filter VAD frames before STT
+
+    # STT service
+    pipeline_components.append(stt)
+
+    # Add STT Error Detector for runtime fallback support (must be after STT)
+    if enable_stt_fallback and restart_callback:
+        stt_error_detector = STTErrorDetector(restart_callback, "STTErrorDetector")
+        pipeline_components.append(stt_error_detector)
+        logger.info("STT Error Detector enabled for runtime fallback")

Also applies to: 657-657

🤖 Prompt for AI Agents

In app/agents/voice/automatic/__init__.py around lines 646-651 (and also apply the same change at line 657), the STTErrorDetector is being appended before the STT component so it cannot observe STT-originated ErrorFrames; move the STTErrorDetector instantiation and pipeline_components.append call so they occur immediately after the STT component is appended to pipeline_components (ensure restart_callback check remains), and remove the earlier placement to guarantee the detector sees STT output and triggers runtime fallback.

coderabbitai · 2025-10-17T05:44:03Z

app/core/config.py

+
+# --- STT Fallback Configuration ---
+ENABLE_STT_FALLBACK = (
+    os.environ.get("ENABLE_STT_FALLBACK", "false").lower() == "true"
+)  # Enable automatic fallback to alternative STT providers on failure
+STT_FALLBACK_PROVIDERS = [
+    provider.strip().lower()
+    for provider in os.environ.get(
+        "STT_FALLBACK_PROVIDERS", "deepgram,openai,assemblyai,google"
+    ).split(",")
+    if provider.strip()
+]  # Comma-separated list of fallback providers in priority order
+


🛠️ Refactor suggestion | 🟠 Major

Harden fallback config: normalize, dedupe, validate providers (and include soniox by default).

Prevents duplicate/unknown providers and whitespace bugs; ensures primary isn’t retried in fallback. Also strips STT_PROVIDER.

Apply:

-# --- STT Fallback Configuration ---
-ENABLE_STT_FALLBACK = (
-    os.environ.get("ENABLE_STT_FALLBACK", "false").lower() == "true"
-)  # Enable automatic fallback to alternative STT providers on failure
-STT_FALLBACK_PROVIDERS = [
-    provider.strip().lower()
-    for provider in os.environ.get(
-        "STT_FALLBACK_PROVIDERS", "deepgram,openai,assemblyai,google"
-    ).split(",")
-    if provider.strip()
-]  # Comma-separated list of fallback providers in priority order
+# --- STT Fallback Configuration ---
+ALLOWED_STT_PROVIDERS = {"google", "assemblyai", "openai", "deepgram", "soniox"}
+ENABLE_STT_FALLBACK = os.environ.get("ENABLE_STT_FALLBACK", "false").lower() == "true"
+
+# Normalize primary
-STT_PROVIDER = os.environ.get(
-    "STT_PROVIDER", "google"
-).lower()  # "google", "assemblyai", "openai", "deepgram", or "soniox"
+STT_PROVIDER = os.environ.get("STT_PROVIDER", "google").strip().lower()
+
+_fallback_raw = os.environ.get(
+    "STT_FALLBACK_PROVIDERS", "deepgram,openai,assemblyai,google,soniox"
+)
+_fallback = [p.strip().lower() for p in _fallback_raw.split(",") if p.strip()]
+# keep order, drop dups/unknowns/primary
+seen = set()
+STT_FALLBACK_PROVIDERS = [
+    p for p in _fallback
+    if p in ALLOWED_STT_PROVIDERS and p != STT_PROVIDER and (p not in seen and not seen.add(p))
+]

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents

In app/core/config.py around lines 147 to 159, the STT fallback list needs to be hardened: normalize (strip and lower), dedupe while preserving priority order, validate entries against an allowed set (e.g., {"deepgram","openai","assemblyai","google","soniox"}) and ensure "soniox" is included by default if not present; also remove the current primary STT_PROVIDER (strip+lower) from the fallback list so we never retry the primary provider. Implement this by reading STT_FALLBACK_PROVIDERS, splitting on commas, mapping strip().lower(), filtering out empty strings and the normalized primary provider, building a deduplicated ordered list (preserve first occurrence), validating each against the allowed set and ignoring unknowns (optionally log or warn), and finally ensuring "soniox" appears at the end or in default position if absent.
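
The normalization described above can also be written as a small pure function, which is easier to unit-test than module-level config. A sketch, assuming the raw environment string and the primary provider are passed in; `parse_fallback_providers` is a hypothetical helper name, and it uses `dict.fromkeys` for order-preserving deduplication instead of the `seen.add` idiom:

```python
from typing import List

ALLOWED_STT_PROVIDERS = {"google", "assemblyai", "openai", "deepgram", "soniox"}


def parse_fallback_providers(raw: str, primary: str) -> List[str]:
    """Normalize a comma-separated provider list into an ordered, validated list."""
    primary = primary.strip().lower()
    candidates = (part.strip().lower() for part in raw.split(","))
    # dict.fromkeys keeps first-seen order while dropping duplicates.
    unique = dict.fromkeys(c for c in candidates if c)
    return [p for p in unique if p in ALLOWED_STT_PROVIDERS and p != primary]
```

In `config.py` this might be called as `parse_fallback_providers(os.environ.get("STT_FALLBACK_PROVIDERS", "..."), STT_PROVIDER)`. Unknown names are dropped silently here; logging a warning for them would make misconfiguration easier to spot.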

stt fallback

980dad8

coderabbitai bot reviewed Oct 17, 2025
