Conversation

@badri-singhal
Contributor

@badri-singhal badri-singhal commented Dec 30, 2025

Summary by CodeRabbit

Release Notes

  • New Features

    • Added background audio support with configurable sound files and volume control for enhanced agent interactions
    • Extended Azure OpenAI configuration options for token limits and response temperature tuning
    • Improved transcription aggregation timing for better conversation flow
  • Configuration

    • New background sound settings (enable/disable, file selection, volume adjustment)
    • Enhanced dynamic configuration parameters for Azure OpenAI and transcription processing

@coderabbitai

coderabbitai bot commented Dec 30, 2025

Walkthrough

The PR adds background audio support to the Breeze Buddy agent by introducing template configuration options, a background sound mixer utility, and integrating the mixer into the WebSocket transport pipeline. Additionally, it extends Azure LLM service configuration, adds observability support via environment-aware observer integration, and introduces three new dynamic configuration accessors for LLM parameters.

Changes

Cohort / File(s) Summary
Background Audio Configuration & Types
app/ai/voice/agents/breeze_buddy/template/types.py, app/ai/voice/agents/breeze_buddy/examples/templates/order-confirmation.json
Introduces BackgroundSoundFile enum with OFFICE_AMBIENCE member; adds three configuration fields to ConfigurationModel: enable_background_sound, background_sound_file, background_sound_volume; template JSON updated with new config entries.
Background Audio Implementation
app/ai/voice/agents/breeze_buddy/utils/common.py
New create_background_sound_mixer() utility function validates template configuration, resolves background sound file path, confirms audio format compatibility, and constructs SoundfileMixer with specified volume; includes error handling and logging.
Agent Integration & Observability
app/ai/voice/agents/breeze_buddy/agent.py
Wires background sound mixer into FastAPIWebsocketParams; integrates environment-aware observer setup (conditionally enables MetricsLogObserver, LLMLogObserver, etc. in DEV mode); extends AzureLLMService with max_completion_tokens and temperature parameters; adds aggregation_timeout to LLMUserAggregatorParams; introduces observers support in PipelineParams.
Configuration & Dependencies
app/core/config/dynamic.py, requirements.txt
Adds three async configuration accessors (BREEZE_BUDDY_AZURE_MAX_COMPLETION_TOKENS, BREEZE_BUDDY_AZURE_TEMPERATURE, BREEZE_BUDDY_LLM_AGGREGATION_TIMEOUT) with Redis-backed defaults; adds soundfile dependency for audio processing.

Sequence Diagram(s)

sequenceDiagram
    participant Template as Template Config
    participant Agent as Breeze Buddy Agent
    participant Mixer as Background Mixer
    participant Pipeline as Pipeline
    participant Transport as WebSocket Transport
    participant Observers as Observers

    Template->>Agent: Load template config
    activate Agent
    
    Agent->>Mixer: create_background_sound_mixer()
    activate Mixer
    Mixer->>Mixer: Validate enable_background_sound
    Mixer->>Mixer: Resolve sound file path
    Mixer->>Mixer: Create SoundfileMixer<br/>with volume
    Mixer->>Agent: Return mixer instance
    deactivate Mixer
    
    Agent->>Transport: Configure with audio_out_mixer
    activate Transport
    Transport->>Transport: Wire mixer to output
    deactivate Transport
    
    Agent->>Agent: Check ENVIRONMENT (DEV mode)
    rect rgb(200, 220, 255)
    Note over Agent,Observers: Environment-aware observer setup
    Agent->>Observers: Instantiate observers<br/>(Metrics, LLM, Transcription, Latency, etc.)
    Observers->>Agent: Ready
    end
    
    Agent->>Pipeline: Create PipelineParams<br/>with observers, audio_out_mixer
    activate Pipeline
    Pipeline->>Pipeline: Initialize with config
    deactivate Pipeline
    
    deactivate Agent

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • Enhance Template based architechture #427: Extends Breeze Buddy template and agent with audio-related functionality; both PRs add background sound mixer support and update agent imports/wiring in the same component.

Suggested reviewers

  • murdore

Poem

🐰 A mixer spins background dreams,
While observers watch the streams,
The template blooms with ambient grace,
And soundfiles find their rightful place—
Breeze Buddy hops to a richer beat! 🎵

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name | Status | Explanation
Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled.
Title check | ✅ Passed | The title refers to background ambience support and latency improvements, which aligns with the main changes in the PR (background audio mixer support and immediate transcription delivery).
Docstring Coverage | ✅ Passed | Docstring coverage is 85.71%, which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate docstrings

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (2)
app/ai/voice/agents/breeze_buddy/template/types.py (1)

26-28: LGTM! Consider adding validation for volume range.

The new background sound configuration fields are properly typed. However, background_sound_volume could benefit from validation to ensure it stays within a reasonable range (e.g., 0.0 to 5.0) to prevent audio clipping or muted output.

🔎 Optional: Add a Pydantic validator for volume range
+from pydantic import field_validator
+
 class ConfigurationModel(BaseModel):
     tts_voice_name: Optional[TTSVoiceName] = None
     stt_language: Optional[str] = None
     payload_based_language_selection: bool = False
     enable_background_sound: bool = False
     background_sound_file: Optional[str] = None
     background_sound_volume: float = 2.0
+
+    @field_validator("background_sound_volume")
+    @classmethod
+    def validate_volume(cls, v: float) -> float:
+        if v < 0.0 or v > 5.0:
+            raise ValueError("background_sound_volume must be between 0.0 and 5.0")
+        return v
requirements.txt (1)

52-52: Consider implementing consistent version pinning across requirements.

soundfile is unpinned on line 52, but this is part of a broader inconsistency in the file. Core dependencies like numpy, loguru, asyncpg, boto3, and pydub are also unpinned, while others carry a lower bound (google-cloud-storage>=2.10.0) or an exact pin (fastapi==0.115.12). A systematic approach to version pinning across critical dependencies would improve reproducibility.
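
To make the pinning gap concrete, a small audit script can list every requirement that lacks an exact `==` pin. This is only a sketch: the helper name `find_unpinned` and the sample lines are illustrative assumptions, not part of the repository.

```python
import re

# Matches an exact pin such as "fastapi==0.115.12".
EXACT_PIN = re.compile(r"==\s*[\w.!*+-]+")


def find_unpinned(requirement_lines):
    """Return the names of requirements that lack an exact `==` pin."""
    unpinned = []
    for raw in requirement_lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if EXACT_PIN.search(line):
            continue
        # The package name ends at the first version operator, extra, or marker.
        name = re.split(r"[<>=!~;\[\s]", line, maxsplit=1)[0]
        unpinned.append(name)
    return unpinned


# Illustrative input only; not the project's actual requirements.txt.
sample = ["fastapi==0.115.12", "google-cloud-storage>=2.10.0", "soundfile", "numpy"]
print(find_unpinned(sample))  # ['google-cloud-storage', 'soundfile', 'numpy']
```

A lower bound such as `>=2.10.0` is reported too, since it still allows different versions across installs.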

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ef71043 and 29f73c7.

⛔ Files ignored due to path filters (1)
  • app/ai/voice/agents/breeze_buddy/static/audio/office-ambience.mp3 is excluded by !**/*.mp3
📒 Files selected for processing (6)
  • app/ai/voice/agents/breeze_buddy/agent.py
  • app/ai/voice/agents/breeze_buddy/examples/templates/order-confirmation.json
  • app/ai/voice/agents/breeze_buddy/template/types.py
  • app/core/config/dynamic.py
  • app/core/config/static.py
  • requirements.txt
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-15T04:10:50.071Z
Learnt from: badri-singhal
Repo: juspay/clairvoyance PR: 313
File: app/api/routers/breeze_buddy.py:487-489
Timestamp: 2025-10-15T04:10:50.071Z
Learning: In the Breeze Buddy dashboard (`app/agents/voice/breeze_buddy/workflows/order_confirmation/dashboard.html`), analytics metrics are dynamically rendered using a for loop in the HTML script. When adding new metrics to the JSON response in the `get_analytics()` endpoint, no manual HTML template updates are required as the loop handles all metrics automatically.

Applied to files:

  • app/ai/voice/agents/breeze_buddy/examples/templates/order-confirmation.json
🧬 Code graph analysis (2)
app/core/config/dynamic.py (1)
app/services/live_config/store.py (1)
  • get_config (211-235)
app/ai/voice/agents/breeze_buddy/agent.py (1)
app/ai/voice/agents/breeze_buddy/template/context.py (2)
  • vad_analyzer (42-44)
  • transport (62-64)
🔇 Additional comments (7)
app/core/config/dynamic.py (1)

135-137: Verify the impact of disabling TTS preprocessing by default.

The default value for BB_SARVAM_TTS_ENABLE_PREPROCESSING changed from True to False. This behavioral change will disable TTS preprocessing for all deployments that don't explicitly set this config in Redis or environment variables.

Ensure this is the intended behavior and that existing deployments relying on the previous default have been updated accordingly.

app/core/config/static.py (2)

299-304: LGTM! New Azure LLM configuration parameters added.

The new AZURE_BREEZE_BUDDY_MAX_COMPLETION_TOKENS (default 50) and AZURE_BREEZE_BUDDY_TEMPERATURE (default 0.7) parameters are properly typed with int/float conversions and sensible defaults for voice agent responses.

Verify that max_completion_tokens=50 is sufficient for the expected response lengths in the Breeze Buddy voice agent. Short token limits may cause truncated responses in edge cases.
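
For reference, the pattern described here (environment lookup with a typed conversion and a default) might look like the sketch below. The helper names `env_int` and `env_float` are hypothetical; only the variable names and the defaults of 50 and 0.7 come from the review comment.

```python
import os


def env_int(name: str, default: int) -> int:
    """Read an integer setting from the environment, falling back to `default`."""
    return int(os.environ.get(name, str(default)))


def env_float(name: str, default: float) -> float:
    """Read a float setting from the environment, falling back to `default`."""
    return float(os.environ.get(name, str(default)))


# Defaults taken from the review comment above; the helpers are assumptions.
MAX_COMPLETION_TOKENS = env_int("AZURE_BREEZE_BUDDY_MAX_COMPLETION_TOKENS", 50)
TEMPERATURE = env_float("AZURE_BREEZE_BUDDY_TEMPERATURE", 0.7)
```

A malformed value (for example a non-numeric string) raises `ValueError` at import time in this sketch, which surfaces misconfiguration early.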


438-441: Verify the impact of enabling non-final tokens by default.

The default for BREEZE_BUDDY_SONIOX_ENABLE_NON_FINAL_TOKENS changed from "false" to "true". This enables interim/streaming transcription tokens, which provides real-time feedback but may also affect latency and behavior of downstream components that process transcription results.

Ensure existing deployments are prepared for this behavioral change.
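
The effect of flipping a default can be sketched as follows: deployments that never set the variable silently pick up the new value, while explicit settings are unaffected. The helper `env_flag` and its set of truthy strings are assumptions for illustration; the project's actual parsing may differ.

```python
import os

# Strings treated as "enabled" (an assumption for this sketch).
TRUTHY = {"1", "true", "yes", "on"}


def env_flag(name: str, default: str) -> bool:
    """Interpret an environment variable as a boolean flag."""
    return os.environ.get(name, default).strip().lower() in TRUTHY


name = "BREEZE_BUDDY_SONIOX_ENABLE_NON_FINAL_TOKENS"
os.environ.pop(name, None)       # simulate a deployment that never set it
print(env_flag(name, "false"))   # old default: False
print(env_flag(name, "true"))    # new default: True

os.environ[name] = "false"       # an explicit setting wins over the default
print(env_flag(name, "true"))    # False
```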

app/ai/voice/agents/breeze_buddy/agent.py (3)

369-371: Setting aggregation_timeout=0.0 for immediate transcription delivery.

This change delivers transcription data to the LLM immediately without buffering. While this reduces latency, it may result in more fragmented transcription chunks being sent to the LLM, potentially affecting response quality for longer utterances.

Confirm that immediate transcription delivery (no aggregation timeout) works well with the Soniox STT provider and doesn't cause issues with partial/incomplete transcriptions being processed.
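
The fragmentation concern can be illustrated with a deliberately simplified model of timeout-based aggregation. This is not Pipecat's aggregator; the function `aggregate` and the timestamps are hypothetical and only show how a timeout of 0.0 turns every fragment into its own message.

```python
from typing import List, Optional, Tuple


def aggregate(fragments: List[Tuple[float, str]], timeout: float) -> List[str]:
    """Group (arrival_time, text) fragments into the messages an LLM would receive.

    A pending message is flushed once more than `timeout` seconds pass
    without a new fragment arriving.
    """
    messages: List[str] = []
    pending: List[str] = []
    last_time: Optional[float] = None
    for arrival, text in fragments:
        if pending and last_time is not None and arrival - last_time > timeout:
            messages.append(" ".join(pending))
            pending = []
        pending.append(text)
        last_time = arrival
    if pending:
        messages.append(" ".join(pending))
    return messages


utterance = [(0.0, "I want to"), (0.3, "cancel my"), (0.5, "order")]
print(aggregate(utterance, timeout=0.5))  # ['I want to cancel my order']
print(aggregate(utterance, timeout=0.0))  # ['I want to', 'cancel my', 'order']
```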


399-405: Multiple observers added for comprehensive observability.

Good addition of observability hooks for metrics, LLM activity, transcriptions, latency, and conversation flow tracking. These will be valuable for debugging and monitoring.


353-354: Azure LLM configured with new token and temperature parameters.

The LLM service now uses the configurable max_completion_tokens and temperature from environment variables, allowing runtime tuning of response behavior.

app/ai/voice/agents/breeze_buddy/examples/templates/order-confirmation.json (1)

40-43: No action needed. The background audio file office-ambience.mp3 exists at the expected path (app/ai/voice/agents/breeze_buddy/static/audio/office-ambience.mp3), and the agent properly handles missing files with a warning log rather than failing silently. The volume default of 2.0 is set intentionally in the ConfigurationModel.

Likely an incorrect or invalid review comment.

Comment on lines 273 to 324
# Create audio mixer for background sound from template configuration
audio_out_mixer = None
if (
    template
    and template.configurations
    and template.configurations.enable_background_sound
):
    background_sound_file = template.configurations.background_sound_file
    background_sound_volume = template.configurations.background_sound_volume

    if background_sound_file:
        # Construct full path to audio file
        full_audio_path = os.path.join(
            os.path.dirname(__file__), "static", "audio", background_sound_file
        )

        if os.path.exists(full_audio_path):
            audio_out_mixer = SoundfileMixer(
                sound_files={"background": full_audio_path},
                default_sound="background",
                volume=background_sound_volume,
            )
            logger.info(
                f"Background sound mixer enabled: file={background_sound_file}, "
                f"volume={background_sound_volume}"
            )
        else:
            logger.warning(
                f"Background sound enabled but file not found: {full_audio_path}"
            )
    else:
        logger.warning(
            "Background sound enabled but no file specified in template"
        )
else:
    logger.info("Background sound mixer disabled (not configured in template)")

@coderabbitai coderabbitai bot Dec 30, 2025

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# First, let's look at the full context of the agent.py file, especially around the audio_out_mixer usage
cd app/ai/voice/agents/breeze_buddy/ && wc -l agent.py

Repository: juspay/clairvoyance

Length of output: 75


🏁 Script executed:

# Read the relevant section and look at how audio_out_mixer is used after line 308
sed -n '273,330p' agent.py | cat -n

Repository: juspay/clairvoyance

Length of output: 114


🏁 Script executed:

# Find the import statement for SoundfileMixer
grep -n "SoundfileMixer" agent.py | head -5

Repository: juspay/clairvoyance

Length of output: 104


🏁 Script executed:

# Search for other usages of SoundfileMixer in the entire codebase
rg "SoundfileMixer" --type py -A 3 -B 1

Repository: juspay/clairvoyance

Length of output: 1042


🏁 Script executed:

# Read the section around where audio_out_mixer is created and used
sed -n '300,330p' app/ai/voice/agents/breeze_buddy/agent.py | cat -n

Repository: juspay/clairvoyance

Length of output: 1427


🏁 Script executed:

# Search for where audio_out_mixer is referenced/used in the file
rg "audio_out_mixer" app/ai/voice/agents/breeze_buddy/agent.py -n -B 2 -A 2

Repository: juspay/clairvoyance

Length of output: 727


🏁 Script executed:

# Search for other SoundfileMixer instantiation patterns in the codebase to see how it's typically used
rg "SoundfileMixer\(" --type py -A 5 -B 1

Repository: juspay/clairvoyance

Length of output: 694


🏁 Script executed:

# Check if there are any pipecat examples or documentation files about SoundfileMixer
find . -type f \( -name "*.md" -o -name "*.rst" -o -name "*.txt" \) -exec grep -l "SoundfileMixer\|audio_out_mixer" {} \;

Repository: juspay/clairvoyance

Length of output: 45


🌐 Web query:

pipecat SoundfileMixer audio_out_mixer automatic mixing

💡 Result:

Brief summary — pipecat SoundfileMixer & audio_out_mixer (automatic mixing):

  • SoundfileMixer mixes transport output with audio files (loads via python-soundfile), auto-resamples to the transport sample rate, requires mono files, supports multiple named files, looping, and runtime volume/sound switching. [1][2]
  • It implements the BaseAudioMixer interface (start/stop/process_frame/mix) and is passed to a transport via the transport param audio_out_mixer (can be a single BaseAudioMixer or a mapping per destination). The transport will call mix() for each outgoing audio chunk so mixing is automatic when configured. [3][2]
  • Runtime control: use MixerUpdateSettingsFrame (sound, volume, loop) and MixerEnableFrame to change sound/volume or enable/disable mixing at runtime. [1][2]
  • Notes & cautions: files must be mono; volume is realtime-adjustable; there have been reported issues (e.g. memory/blocking with websocket transport in some versions) — test on your target transport and pipecat version. [1][2][4]

If you want, I can show a minimal code example wiring SoundfileMixer into a transport and sending MixerUpdateSettingsFrame/MixerEnableFrame.


Background sound mixer integration is correct, but add audio file format validation.

The implementation properly integrates SoundfileMixer with the transport. When passed via the audio_out_mixer parameter to FastAPIWebsocketTransport, the mixer automatically mixes audio into the output stream—no explicit activation required. The configuration logic with file existence checks and fallback logging is well-structured.

However, SoundfileMixer requires mono audio files. The code should validate that background_sound_file is mono before passing it to the mixer, or document this requirement in the template configuration schema to prevent runtime failures with incompatible audio formats.

🤖 Prompt for AI Agents
In app/ai/voice/agents/breeze_buddy/agent.py around lines 273 to 308, add a
validation step to ensure the background_sound_file is mono before instantiating
SoundfileMixer: after constructing full_audio_path and confirming it exists,
open the file (e.g., via soundfile.info or wave module) to read its channel
count, and if channels != 1 log a warning that the file is not mono and skip
creating the mixer (leave audio_out_mixer as None); only create SoundfileMixer
when the channel count equals 1 so incompatible stereo/multi-channel files do
not cause runtime failures.
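
A minimal sketch of the channel check described in the prompt, using the standard-library `wave` module so that it runs without extra dependencies. Note the assumptions: `wave` only reads WAV files, so a compressed file such as the project's office-ambience.mp3 would need a different reader (the prompt mentions soundfile.info as an option), and the helper name `mono_path_or_none` is hypothetical.

```python
import os
import wave
from typing import Optional


def mono_path_or_none(path: str) -> Optional[str]:
    """Return `path` if it is an existing mono WAV file, otherwise None."""
    if not os.path.exists(path):
        return None
    try:
        with wave.open(path, "rb") as wav:
            channels = wav.getnchannels()
    except (wave.Error, EOFError):
        return None  # not a readable WAV file
    # Only a single-channel file is considered compatible with the mixer.
    return path if channels == 1 else None
```

A caller would create the mixer only when the helper returns a path and otherwise log a warning and leave audio_out_mixer as None, matching the fallback behaviour requested above.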

Collaborator

@badri-singhal add this check

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

Contributor Author

done

@badri-singhal badri-singhal force-pushed the support-for-background-noise-during-call branch from 29f73c7 to 865de09 Compare January 1, 2026 06:50
@badri-singhal badri-singhal changed the title Support for background noise during call Support for template specific ambeience background with latency improvements Jan 1, 2026
@badri-singhal badri-singhal force-pushed the support-for-background-noise-during-call branch from 865de09 to 74e4145 Compare January 2, 2026 05:16
-    "stt_language": "en"
+    "stt_language": "en",
+    "enable_background_sound": false,
+    "background_sound_file": "office-ambience.mp3",
Contributor Author

.

)

# Create audio mixer for background sound from template configuration
audio_out_mixer = None
Contributor Author

.

@badri-singhal badri-singhal force-pushed the support-for-background-noise-during-call branch from 74e4145 to ea79db3 Compare January 2, 2026 10:47
AZURE_BREEZE_BUDDY_OPENAI_MODEL = os.environ.get(
    "AZURE_BREEZE_BUDDY_OPENAI_MODEL", "gpt-4o-automatic"
)
AZURE_BREEZE_BUDDY_MAX_COMPLETION_TOKENS = int(
Collaborator

.


def create_background_sound_mixer(
    template, base_audio_path: str
) -> Optional[SoundfileMixer]:
Collaborator

.

)
 BREEZE_BUDDY_SONIOX_ENABLE_NON_FINAL_TOKENS = (
-    os.environ.get("BREEZE_BUDDY_SONIOX_ENABLE_NON_FINAL_TOKENS", "false").lower()
+    os.environ.get("BREEZE_BUDDY_SONIOX_ENABLE_NON_FINAL_TOKENS", "true").lower()
Collaborator

.

 EXOTEL_SUBDOMAIN = os.getenv("EXOTEL_SUBDOMAIN", "api.exotel.com")
 EXOTEL_APPLET_APP_ID = os.getenv("EXOTEL_APPLET_APP_ID", "1044183")
-EXOTEL_TEMPLATE_APPLET_APP_ID = os.getenv("EXOTEL_TEMPLATE_APPLET_APP_ID", "")
+EXOTEL_TEMPLATE_APPLET_APP_ID = os.getenv("EXOTEL_TEMPLATE_APPLET_APP_ID", "1147907")
Collaborator

.

@badri-singhal badri-singhal force-pushed the support-for-background-noise-during-call branch from ea79db3 to 9a38523 Compare January 2, 2026 12:18
@badri-singhal
Contributor Author

@coderabbitai please review

@coderabbitai

coderabbitai bot commented Jan 2, 2026

@badri-singhal: I'll review the changes in this PR for you.

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
app/ai/voice/agents/breeze_buddy/utils/common.py (1)

362-362: Consider extracting hard-coded path as a module-level constant.

The constant BACKGROUND_SOUND_AUDIO_PATH is defined inside the function. For consistency and maintainability, consider moving it to the module level alongside other constants.

🔎 Proposed refactor

At the top of the file after imports:

+# Constants
+BACKGROUND_SOUND_AUDIO_PATH = "app/ai/voice/agents/breeze_buddy/static/audio"
+

 def indian_number_to_speech(number: int) -> str:

Then in the function:

 def create_background_sound_mixer(template) -> Optional[SoundfileMixer]:
     """
     Create a background sound mixer from template configuration.
 
     Args:
         template: Template object with configurations for background sound
 
     Returns:
         SoundfileMixer instance if successfully configured, None otherwise
     """
-    # Constant path for background sound audio files
-    BACKGROUND_SOUND_AUDIO_PATH = "app/ai/voice/agents/breeze_buddy/static/audio"
-
     # Check if background sound is enabled in template
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 29f73c7 and 9a38523.

⛔ Files ignored due to path filters (1)
  • app/ai/voice/agents/breeze_buddy/static/audio/office-ambience.mp3 is excluded by !**/*.mp3
📒 Files selected for processing (6)
  • app/ai/voice/agents/breeze_buddy/agent.py
  • app/ai/voice/agents/breeze_buddy/examples/templates/order-confirmation.json
  • app/ai/voice/agents/breeze_buddy/template/types.py
  • app/ai/voice/agents/breeze_buddy/utils/common.py
  • app/core/config/dynamic.py
  • requirements.txt
🚧 Files skipped from review as they are similar to previous changes (3)
  • app/core/config/dynamic.py
  • app/ai/voice/agents/breeze_buddy/examples/templates/order-confirmation.json
  • requirements.txt
🧰 Additional context used
🧬 Code graph analysis (1)
app/ai/voice/agents/breeze_buddy/agent.py (3)
app/ai/voice/agents/breeze_buddy/utils/common.py (1)
  • create_background_sound_mixer (351-432)
app/ai/voice/agents/breeze_buddy/utils/language_utils/prompt_injections.py (1)
  • inject_language_rules (8-61)
app/core/config/dynamic.py (3)
  • BREEZE_BUDDY_AZURE_MAX_COMPLETION_TOKENS (152-154)
  • BREEZE_BUDDY_AZURE_TEMPERATURE (157-159)
  • BREEZE_BUDDY_LLM_AGGREGATION_TIMEOUT (162-164)
🪛 Ruff (0.14.10)
app/ai/voice/agents/breeze_buddy/utils/common.py

426-426: Consider moving this statement to an else block

(TRY300)


428-428: Do not catch blind exception: Exception

(BLE001)

🔇 Additional comments (5)
app/ai/voice/agents/breeze_buddy/utils/common.py (1)

351-432: LGTM! Mono channel validation properly implemented.

The background sound mixer creation function is well-structured with comprehensive validation:

  • Configuration checks before file operations
  • File existence validation
  • Mono channel requirement enforcement (lines 407-414) as requested in past review
  • Graceful fallback with appropriate logging at each failure point

The mono channel validation successfully prevents runtime failures when incompatible audio files are configured.

app/ai/voice/agents/breeze_buddy/template/types.py (1)

22-25: LGTM! Clear enum definition.

The BackgroundSoundFile enum is well-defined with a descriptive docstring. Starting with a single value (OFFICE_AMBIENCE) is appropriate for the initial implementation.
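
As a rough illustration of why an enum is useful here, the sketch below restricts the template field to known files and resolves one to a path. The member value "office-ambience.mp3" is an assumption inferred from the template JSON in this PR, and `resolve_sound_path` is a hypothetical helper.

```python
import os
from enum import Enum


class BackgroundSoundFile(str, Enum):
    """Allowed background sound files (the member value is an assumption)."""

    OFFICE_AMBIENCE = "office-ambience.mp3"


def resolve_sound_path(base_dir: str, sound: BackgroundSoundFile) -> str:
    """Join the audio directory with the file name carried by the enum member."""
    return os.path.join(base_dir, sound.value)


# Converting a raw template string validates it against the known members.
sound = BackgroundSoundFile("office-ambience.mp3")
print(resolve_sound_path("static/audio", sound))

try:
    BackgroundSoundFile("../../etc/passwd")
except ValueError as exc:
    print(f"rejected: {exc}")
```

Because only enum members are accepted, a template cannot point the mixer at an arbitrary path on disk.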

app/ai/voice/agents/breeze_buddy/agent.py (3)

276-277: LGTM! Background mixer integration is correct.

The background sound mixer is created from the template configuration and is integrated automatically by the transport when passed via the audio_out_mixer parameter (line 286). The mono channel validation requested in the past review has been implemented in the create_background_sound_mixer function.


322-323: LGTM! Dynamic configuration properly integrated.

The Azure LLM service and aggregation parameters now use dynamic configuration accessors, allowing runtime adjustments without code changes. All async calls are properly awaited.

Also applies to: 339-339


362-376: LGTM! Environment-aware observer setup is appropriate.

The conditional observer setup ensures that detailed logging and metrics are only collected in development environments, which is the correct approach to:

  • Reduce overhead in production
  • Avoid excessive logging in production environments
  • Maintain observability during development and testing

The environment check for "dev" or "development" with clear logging is well-implemented.
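
A minimal sketch of such an environment gate, assuming the environment name arrives as a string. The two observer classes are empty placeholders rather than the real Pipecat observers, and `build_observers` is a hypothetical helper.

```python
from typing import List


class PlaceholderMetricsObserver:
    """Empty stand-in for a metrics observer (not the real Pipecat class)."""


class PlaceholderLLMObserver:
    """Empty stand-in for an LLM log observer (not the real Pipecat class)."""


def build_observers(environment: str) -> List[object]:
    """Return debugging observers only for development environments."""
    if environment.strip().lower() in {"dev", "development"}:
        return [PlaceholderMetricsObserver(), PlaceholderLLMObserver()]
    return []  # production and anything unrecognized gets no extra observers


print(len(build_observers("DEV")))         # 2
print(len(build_observers("production")))  # 0
```

Defaulting to an empty list for unrecognized values keeps a mistyped environment name from enabling verbose logging in production.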

Comment on lines +32 to +34
enable_background_sound: bool = False
background_sound_file: Optional[BackgroundSoundFile] = None
background_sound_volume: float = 2.0

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Search for SoundfileMixer documentation and usage in the codebase
rg -i "soundfile.*mixer|mixer.*volume" -A 5 -B 2 --type=py | head -100

Repository: juspay/clairvoyance

Length of output: 2687


🏁 Script executed:

#!/bin/bash
# Look for volume configuration or documentation files
fd -type f -name "*.md" -o -name "*.py" | xargs rg -l "soundfile\|mixer" | head -20

Repository: juspay/clairvoyance

Length of output: 233


🌐 Web query:

pipecat SoundfileMixer volume parameter recommended values best practices

💡 Result:

Summary — recommended values and best practices for SoundfileMixer volume

  • Recommended default: 0.4 (library default). [1][2]
  • Allowed range: typically 0.0–1.0 (normalized scale), but the implementation accepts values >1.0 (example uses 2.0). Use >1.0 only if you intentionally need gain and you handle clipping. [1][2]
  • Loudness / measurement: use pipecat.audio.utils.calculate_audio_volume (EBU‑R128 based, returns 0..1) to measure and normalize file/stream loudness before mixing. [3]
  • Practical best practices:
    • Normalize sound files to a target loudness (use EBU R128 or calculate_audio_volume) so same perceived level across files. [3]
    • Keep music/ambience volume well below 1.0 when mixing with voice (start ~0.2–0.6) and adjust by ear or LUFS measurements to avoid masking speech. [1][3]
    • Avoid abrupt jumps: apply smoothing when changing volume (use exponential smoothing or gradually update via MixerUpdateSettingsFrame). [1][3]
    • Ensure sound files are mono and sample-rate matched; looping and runtime control available via MixerUpdateSettingsFrame / MixerEnableFrame. [1][2]
    • Monitor for clipping when using volume >1.0 and prefer pre‑processing gain rather than extreme runtime gain. [1][2][3]

Sources
[1] Pipecat SoundfileMixer docs (constructor, default volume, runtime frames, example using 2.0).
[2] Pipecat API reference for soundfile_mixer (volume documented, default 0.4, recommended range noted).
[3] pipecat.audio.utils.calculate_audio_volume (EBU R128 loudness normalization and 0..1 normalized output).


Reduce default volume of 2.0 to align with audio mixing best practices.

The volume default of 2.0 is too high for background sound mixing with voice. Pipecat's SoundfileMixer library default is 0.4, and recommended practice for background sound/music mixed with voice speech is 0.2–0.6 to avoid masking the primary audio. The current value would significantly overpower voice output. Change to a value in the 0.2–0.6 range (recommend 0.4 to match library default).

🤖 Prompt for AI Agents
In app/ai/voice/agents/breeze_buddy/template/types.py around lines 32 to 34, the
default background_sound_volume is set to 2.0 which is far above recommended
mixing levels; change the default to a value between 0.2 and 0.6 (recommended
0.4 to match Pipecat SoundfileMixer) so background audio doesn't overpower
voice—update the field default accordingly and run tests/quick manual audio
check to confirm acceptable mixing.
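
To show why a gain of 2.0 is risky, here is a simplified additive mixing model on 16-bit samples. It is an illustration only and not Pipecat's implementation; the function `mix` and the sample values are hypothetical.

```python
from typing import List

INT16_MIN, INT16_MAX = -32768, 32767


def mix(voice: List[int], background: List[int], volume: float) -> List[int]:
    """Add `volume`-scaled background samples to voice samples, clipping to int16."""
    mixed = []
    for v, b in zip(voice, background):
        sample = int(round(v + volume * b))
        mixed.append(max(INT16_MIN, min(INT16_MAX, sample)))
    return mixed


voice = [12000, -15000, 20000]
ambience = [8000, -9000, 7000]
print(mix(voice, ambience, 0.4))  # [15200, -18600, 22800]
print(mix(voice, ambience, 2.0))  # [28000, -32768, 32767]
```

At 0.4 the ambience stays well below the voice level, while at 2.0 the background contribution exceeds the voice and two of the three samples hit the int16 limits, which would be heard as distortion.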

@swaroopvarma1 swaroopvarma1 merged commit ef48249 into juspay:release Jan 2, 2026
2 checks passed

2 participants