@Swetha-160303 Swetha-160303 commented Dec 12, 2025

Summary by CodeRabbit

Release Notes

  • New Features
    • Introduced Fish AI as an available text-to-speech provider
    • Added configuration options for Fish AI including model selection, language settings, prosody controls, and output format customization


coderabbitai bot commented Dec 12, 2025

Walkthrough

This change introduces support for a new Fish AI text-to-speech provider. It adds the FISH provider enum member, creates dynamic configuration accessors for Fish-specific parameters, introduces the FISH_API_KEY static configuration, and implements Fish TTS service instantiation within the existing provider selection logic.

Changes

| Cohort / File(s) | Summary |
|---|---|
| TTS Provider Enum (app/ai/voice/agents/automatic/types/models.py) | Added FISH enum member to TTSProvider enum |
| Fish TTS Configuration (app/core/config/static.py) | Added FISH_API_KEY environment variable for API authentication |
| Fish TTS Dynamic Config (app/core/config/dynamic.py) | Added 8 async configuration accessors for Fish TTS settings: reference_id, model_id, language_code, latency, prosody_speed, prosody_volume, normalize, and output_format |
| Fish TTS Service Integration (app/ai/voice/agents/automatic/tts/__init__.py) | Implemented Fish provider branch in get_tts_service with config fetching, API key validation, language code mapping to enum with EN_IN fallback, and FishAudioTTSService instantiation |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Areas requiring extra attention:
    • Language code mapping logic and EN_IN fallback behavior in __init__.py
    • API key validation and error handling in the Fish provider branch
    • Consistency of parameter naming and defaults across config accessors in dynamic.py

Possibly related PRs

  • Integration of tts in Breeze Buddy #421: Modifies the same get_tts_service function in the TTS module. This PR adds a new Fish provider branch while that PR refactors service construction patterns, so the two may conflict on merge or duplicate logic.

Poem

🐟 A fish swims through config streams so fine,
With prosody, models, and language divine,
From EN_IN defaults to speech-1.5's call,
Our TTS takes flight with a voice for all! 🎤

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The pull request title 'Integration of Fish AI TTS' accurately reflects the main changes, which add support for Fish AI as a new TTS provider across multiple configuration and service files. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 01aed6c and b884b74.

📒 Files selected for processing (4)
  • app/ai/voice/agents/automatic/tts/__init__.py (4 hunks)
  • app/ai/voice/agents/automatic/types/models.py (1 hunks)
  • app/core/config/dynamic.py (1 hunks)
  • app/core/config/static.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
app/core/config/dynamic.py (1)
app/services/live_config/store.py (1)
  • get_config (211-235)
app/ai/voice/agents/automatic/tts/__init__.py (2)
app/core/config/dynamic.py (8)
  • FISH_TTS_LANGUAGE_CODE (95-97)
  • FISH_TTS_LATENCY (100-102)
  • FISH_TTS_MODEL_ID (90-92)
  • FISH_TTS_NORMALIZE (115-117)
  • FISH_TTS_OUTPUT_FORMAT (120-122)
  • FISH_TTS_PROSODY_SPEED (105-107)
  • FISH_TTS_PROSODY_VOLUME (110-112)
  • FISH_TTS_REFERENCE_ID (85-87)
app/ai/voice/agents/automatic/types/models.py (1)
  • TTSProvider (8-12)
🪛 Ruff (0.14.8)
app/ai/voice/agents/automatic/tts/__init__.py

107-107: Avoid specifying long messages outside the exception class

(TRY003)

🔇 Additional comments (8)
app/ai/voice/agents/automatic/types/models.py (1)

12-12: LGTM!

The new FISH enum member follows the established pattern of the existing TTS providers.
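
For readers without the diff open, the enum change might look roughly like the sketch below. Only the FISH member and the TTSProvider name come from this PR; the other member names, the string values, the str mixin, and the parse_provider helper are assumptions made for illustration.

```python
from enum import Enum


class TTSProvider(str, Enum):
    """Supported TTS providers (member values are assumed, not taken from the PR)."""

    SARVAM = "sarvam"
    ELEVENLABS = "elevenlabs"
    GOOGLE = "google"
    FISH = "fish"  # the member added by this PR


def parse_provider(raw: str) -> TTSProvider:
    """Hypothetical helper: map a configured name to the enum, ignoring case and whitespace."""
    return TTSProvider(raw.strip().lower())
```

An unknown name raises ValueError from the enum lookup, so a caller can decide whether to fail fast or fall back to a default provider.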

app/core/config/static.py (1)

84-86: LGTM!

The FISH_API_KEY configuration follows the same optional pattern as ELEVENLABS_API_KEY, with runtime validation in get_tts_service when the Fish provider is selected.
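
A minimal sketch of the "optional at load time, validated at use time" pattern described above. The load_fish_api_key helper and its env parameter are hypothetical; the actual code in static.py is not shown in this thread and may read the variable differently.

```python
import os
from typing import Mapping, Optional


def load_fish_api_key(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return FISH_API_KEY, or None when it is unset or blank.

    Returning None instead of raising keeps the key optional, so deployments
    that never select the Fish provider can start without it.
    """
    value = env.get("FISH_API_KEY", "").strip()
    return value or None
```

Passing the environment as a parameter is only there to make the helper easy to exercise, for example `load_fish_api_key({"FISH_API_KEY": "abc"})`.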

app/core/config/dynamic.py (2)

90-122: LGTM!

The remaining Fish TTS config accessors follow the established pattern from the Sarvam TTS configuration, with sensible defaults.
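
The accessor pattern might be sketched as follows. The get_config stub, its (key, default) signature, the in-memory store, and the default values are all assumptions; the real get_config lives in app/services/live_config/store.py and is not reproduced in this thread.

```python
import asyncio
from typing import Any, Dict

# Stand-in for the live config store (hypothetical contents).
_FAKE_STORE: Dict[str, Any] = {"FISH_TTS_PROSODY_SPEED": 1.2}


async def get_config(key: str, default: Any) -> Any:
    """Hypothetical lookup: return the stored value, or the default if absent."""
    return _FAKE_STORE.get(key, default)


async def FISH_TTS_MODEL_ID() -> str:
    # "speech-1.5" is assumed as the default here.
    return await get_config("FISH_TTS_MODEL_ID", "speech-1.5")


async def FISH_TTS_PROSODY_SPEED() -> float:
    return float(await get_config("FISH_TTS_PROSODY_SPEED", 1.0))


async def load_fish_settings() -> Dict[str, Any]:
    """Fetch several settings concurrently rather than one await at a time."""
    model_id, speed = await asyncio.gather(
        FISH_TTS_MODEL_ID(), FISH_TTS_PROSODY_SPEED()
    )
    return {"model_id": model_id, "prosody_speed": speed}
```

Calling `asyncio.run(load_fish_settings())` with the stub store above returns the stored speed and the default model id.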


84-88: No action needed. The empty FISH_TTS_REFERENCE_ID default is safe and intentional—Fish Audio TTS API treats reference_id as optional, defaulting to a built-in voice when not provided or left empty.

Likely an incorrect or invalid review comment.

app/ai/voice/agents/automatic/tts/__init__.py (4)

4-4: LGTM!

The import of FishAudioTTSService from pipecat follows the established pattern for other TTS services.


104-108: API key validation looks good.

The validation pattern matches the Sarvam implementation. The static analysis hint about the long exception message (TRY003) is acceptable here for consistency with the existing code style at line 72.
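
If the team later chose to address TRY003 rather than accept it, one common approach is to move the message into a dedicated exception class. The sketch below is not code from this PR; MissingAPIKeyError and require_api_key are hypothetical names.

```python
from typing import Optional


class MissingAPIKeyError(ValueError):
    """Raised when a provider is selected but its API key is not configured."""

    def __init__(self, provider: str, env_var: str) -> None:
        # The message is built inside the exception class, which is what
        # Ruff's TRY003 rule asks for.
        super().__init__(f"{provider} TTS selected but {env_var} is not set")
        self.provider = provider
        self.env_var = env_var


def require_api_key(api_key: Optional[str], provider: str, env_var: str) -> str:
    """Return the key if present, otherwise raise MissingAPIKeyError."""
    if not api_key:
        raise MissingAPIKeyError(provider, env_var)
    return api_key
```

Subclassing ValueError means existing handlers that catch ValueError would keep working if the raise site were switched over.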


119-130: LGTM!

The language enum conversion with proper error handling and fallback to Language.EN_IN is well implemented.
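
The conversion-with-fallback logic can be sketched as below. The Language class is a small stand-in so the example runs on its own; the real code uses pipecat's Language enum, and the member values shown here are assumptions. The to_language helper is hypothetical.

```python
from enum import Enum


class Language(str, Enum):
    """Tiny stand-in for pipecat's Language enum; only a few members are shown."""

    EN = "en"
    EN_IN = "en-IN"
    HI_IN = "hi-IN"


def to_language(code: str, fallback: Language = Language.EN_IN) -> Language:
    """Convert a configured language code to the enum, falling back on unknown codes."""
    try:
        return Language(code)
    except ValueError:
        # An unrecognised code should not break call setup, so use the fallback.
        return fallback
```

For example, `to_language("hi-IN")` resolves to the matching member, while an unrecognised code such as `"xx"` resolves to the EN_IN fallback.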


146-151: Good observability with comprehensive logging.

The log statement captures all relevant Fish TTS parameters for debugging and monitoring.

Comment on lines +132 to +152
service = FishAudioTTSService(
    api_key=FISH_API_KEY,
    reference_id=fish_tts_reference_id,
    model_id=fish_tts_model_id,
    output_format=fish_tts_output_format,
    sample_rate=SAMPLE_RATE,
    params=FishAudioTTSService.InputParams(
        language=fish_language,
        latency=fish_tts_latency,
        normalize=fish_tts_normalize,
        prosody_speed=fish_tts_prosody_speed,
        prosody_volume=fish_tts_prosody_volume,
    ),
)
logger.info(
    f"Using Fish AI TTS service with reference_id={fish_tts_reference_id}, "
    f"model_id={fish_tts_model_id}, output_format={fish_tts_output_format}, "
    f"sample_rate={SAMPLE_RATE}, language={fish_language}, "
    f"speed={fish_tts_prosody_speed}, volume={fish_tts_prosody_volume}"
)
return service
@coderabbitai coderabbitai bot Dec 12, 2025

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# First, locate the FishAudioTTSService class definition
fd -e py -type f | xargs rg -l "class FishAudioTTSService"

Repository: juspay/clairvoyance

Length of output: 235


🏁 Script executed:

# Search for text_filters usage across TTS services
rg -n "text_filters" --type py

Repository: juspay/clairvoyance

Length of output: 402


🏁 Script executed:

# Check the actual instantiation of ElevenLabsTTSService and GoogleTTSService in the file
cat -n app/ai/voice/agents/automatic/tts/__init__.py | sed -n '150,220p'

Repository: juspay/clairvoyance

Length of output: 2537


🏁 Script executed:

# Search for FishAudioTTSService class definition
rg -n "class FishAudioTTSService" -A 20

Repository: juspay/clairvoyance

Length of output: 45


🏁 Script executed:

# Check where FishAudioTTSService is imported from
rg -n "from.*FishAudioTTSService|import.*FishAudioTTSService" --type py

Repository: juspay/clairvoyance

Length of output: 168


🏁 Script executed:

# Look at the beginning of the __init__.py file to see imports
head -30 app/ai/voice/agents/automatic/tts/__init__.py

Repository: juspay/clairvoyance

Length of output: 1038


🏁 Script executed:

# Check project dependencies to understand pipecat version
find . -name "requirements*.txt" -o -name "setup.py" -o -name "pyproject.toml" -o -name "Pipfile" | head -5

Repository: juspay/clairvoyance

Length of output: 104


🏁 Script executed:

# Look for requirements or setup files
rg "pipecat" --type py -g '*requirements*|setup*|pyproject*|Pipfile*' -B 2 -A 2

Repository: juspay/clairvoyance

Length of output: 211


🏁 Script executed:

# Search for any pipecat version specifications
rg "pipecat.*==" -m 5

Repository: juspay/clairvoyance

Length of output: 45


🏁 Script executed:

# Check requirements.txt for pipecat version
cat requirements.txt | grep -i pipecat

Repository: juspay/clairvoyance

Length of output: 224


🏁 Script executed:

# Also check requirements-dev.txt
cat requirements-dev.txt | grep -i pipecat

Repository: juspay/clairvoyance

Length of output: 45


🌐 Web query:

pipecat FishAudioTTSService text_filters parameter documentation

💡 Result:

The FishAudioTTSService accepts a text_filters parameter (type: Sequence[BaseTextFilter]) which is a list of text-filter objects applied after text aggregation; it transforms/cleans streaming text before TTS. A single text_filter argument also exists but is deprecated in favor of text_filters. See docs for the TTSService constructor and the FishAudio TTS implementation for details and examples of filters/aggregators. [1][2][3]

Sources:

  • TTSService API (text_filters / text_filter description). [1]
  • FishAudioTTSService module (Fish Audio TTS implementation). [2]
  • Pipecat text aggregators & filters overview (MarkdownTextFilter, BaseTextFilter usage). [3]

🌐 Web query:

pipecat fish tts service constructor parameters

💡 Result:

Constructor signature (keyword-only):

FishAudioTTSService(*, api_key: str, reference_id: str | None = None, model: str | None = None, model_id: str = 'speech-1.5', output_format: Literal['opus','mp3','pcm','wav'] = 'pcm', sample_rate: int | None = None, params: InputParams | None = None, **kwargs).

InputParams fields (keyword-only Pydantic model):

  • language: Language | None (default Language.EN).
  • latency: str | None ('normal' or 'balanced', default 'normal').
  • normalize: bool | None (default True).
  • prosody_speed: float | None (speech speed multiplier, default 1.0).
  • prosody_volume: int | None (dB adjustment, default 0).

Notes: "model" is deprecated in favor of reference_id; model_id selects Fish model (e.g., "speech-1.5"); output_format and sample_rate control audio encoding. [1][2]

Sources:
[1] Pipecat Fish TTS API docs (FishAudioTTSService & InputParams).
[2] Pipecat reference/source implementation for pipecat.services.fish.tts.
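
Since the dynamic config values arrive as plain strings and numbers, it may be worth checking them against the allowed values listed in the query result above before building the service. The sketch below is illustrative only: FishSettings is a hypothetical dataclass, not part of pipecat or this PR, and its defaults simply mirror the ones quoted above.

```python
from dataclasses import dataclass

# Allowed values as quoted in the query result above.
ALLOWED_OUTPUT_FORMATS = {"opus", "mp3", "pcm", "wav"}
ALLOWED_LATENCY = {"normal", "balanced"}


@dataclass(frozen=True)
class FishSettings:
    """Hypothetical container for Fish TTS settings, validated on creation."""

    model_id: str = "speech-1.5"
    output_format: str = "pcm"
    latency: str = "normal"
    normalize: bool = True
    prosody_speed: float = 1.0
    prosody_volume: int = 0

    def __post_init__(self) -> None:
        if self.output_format not in ALLOWED_OUTPUT_FORMATS:
            raise ValueError(f"unsupported output_format: {self.output_format!r}")
        if self.latency not in ALLOWED_LATENCY:
            raise ValueError(f"unsupported latency: {self.latency!r}")
```

Validating early like this would turn a misconfigured value into a clear error at startup instead of a failure inside the TTS request.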


Add text_filters parameter to FishAudioTTSService instantiation for consistency.

FishAudioTTSService supports the text_filters parameter (via the base TTSService class), but it is not passed in the instantiation at lines 132-152. Both ElevenLabsTTSService (line 166) and GoogleTTSService (line 204) receive text_filters; FishAudioTTSService should receive it as well for feature parity.

🤖 Prompt for AI Agents
In app/ai/voice/agents/automatic/tts/__init__.py around lines 132 to 152, the
FishAudioTTSService instantiation omits the text_filters parameter causing
inconsistent behavior with other TTS services; update the constructor call to
pass text_filters (the same variable used for ElevenLabs and Google TTS) into
FishAudioTTSService so the service receives text filtering settings, and adjust
the InputParams or surrounding args if needed to match the parameter name
expected by the base TTSService.

Author

It doesn't support the text filters

@Swetha-160303, thank you for clarifying! I understand that FishAudioTTSService doesn't support the text_filters parameter in practice. I've noted this for future reference.


✏️ Learnings added
Learnt from: Swetha-160303
Repo: juspay/clairvoyance PR: 424
File: app/ai/voice/agents/automatic/tts/__init__.py:132-152
Timestamp: 2025-12-12T10:56:50.984Z
Learning: In app/ai/voice/agents/automatic/tts/__init__.py, FishAudioTTSService does not support the text_filters parameter, unlike ElevenLabsTTSService and GoogleTTSService which do support it.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
