feat(voice): add Voicebox TTS provider for local voice cloning #199

Open
penso wants to merge 3 commits into main from voicebox

penso (Collaborator) commented Feb 21, 2026

Summary

  • Add Voicebox as an opt-in TTS provider behind a voicebox cargo feature flag
  • Voicebox is a local Qwen3-TTS server with a FastAPI REST API for voice cloning
  • Two-step generation: POST /generate returns metadata, then GET /audio/{id} fetches WAV bytes
  • Feature flag chain: cli → gateway → voice, provider code gated with #[cfg(feature = "voicebox")]
  • Config schema always parsed (no feature gate), matching existing provider patterns
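The two-step generation flow in the summary can be sketched roughly as follows. This is a minimal illustration and not the code in `crates/voice/src/tts/voicebox.rs`: the `VoiceboxApi` trait, `GenerationMeta`, `synthesize`, and `FakeVoicebox` are hypothetical names, the metadata is assumed to carry only an ID, and the HTTP layer is abstracted behind a trait so the sketch compiles without external crates.

```rust
/// Metadata returned by the first step. The single `id` field is an
/// assumption; the real response probably carries more.
#[derive(Debug, Clone, PartialEq)]
pub struct GenerationMeta {
    pub id: String,
}

/// Hypothetical abstraction over the two Voicebox endpoints, so the
/// sketch does not depend on an HTTP client crate.
pub trait VoiceboxApi {
    /// Step 1: POST /generate with the text to synthesize.
    fn generate(&self, text: &str) -> Result<GenerationMeta, String>;
    /// Step 2: GET /audio/{id} and return the raw response bytes.
    fn fetch_audio(&self, id: &str) -> Result<Vec<u8>, String>;
}

/// Runs both steps and checks that the payload looks like a WAV file.
pub fn synthesize(api: &dyn VoiceboxApi, text: &str) -> Result<Vec<u8>, String> {
    if text.trim().is_empty() {
        return Err("text must not be empty".to_string());
    }
    let meta = api.generate(text)?;
    let audio = api.fetch_audio(&meta.id)?;
    // WAV is a RIFF container: bytes 0..4 are "RIFF", bytes 8..12 are "WAVE".
    if audio.len() < 12 || &audio[0..4] != b"RIFF" || &audio[8..12] != b"WAVE" {
        return Err(format!("generation {} did not return WAV data", meta.id));
    }
    Ok(audio)
}

/// In-memory stand-in for a server, used only to exercise the flow.
pub struct FakeVoicebox;

impl VoiceboxApi for FakeVoicebox {
    fn generate(&self, _text: &str) -> Result<GenerationMeta, String> {
        Ok(GenerationMeta { id: "gen-1".to_string() })
    }

    fn fetch_audio(&self, id: &str) -> Result<Vec<u8>, String> {
        if id != "gen-1" {
            return Err(format!("unknown generation id: {}", id));
        }
        let mut wav = b"RIFF".to_vec();
        wav.extend_from_slice(&[0, 0, 0, 0]); // placeholder chunk size
        wav.extend_from_slice(b"WAVE");
        Ok(wav)
    }
}
```

Keeping the two calls behind one function means callers never see the intermediate generation ID, and a fake implementation can cover the flow in tests without a running server.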

Files changed

| File | Change |
|------|--------|
| crates/voice/src/tts/voicebox.rs | New TTS provider implementation |
| crates/voice/Cargo.toml | voicebox = [] feature |
| crates/voice/src/tts/mod.rs | cfg-gated module + re-export |
| crates/voice/src/config.rs | VoiceboxTtsConfig, TtsProviderId::Voicebox |
| crates/voice/src/lib.rs | cfg-gated re-exports |
| crates/gateway/Cargo.toml | voicebox feature forwarding |
| crates/gateway/src/voice.rs | Provider creation, listing, config loading |
| crates/cli/Cargo.toml | voicebox feature forwarding |
| crates/config/src/schema.rs | VoiceVoiceboxTtsConfig struct |
| crates/config/src/validate.rs | Schema map + semantic validation |

Config example

[voice.tts]
provider = "voicebox"

[voice.tts.voicebox]
endpoint = "http://localhost:8000"
profile_id = "abc-123-def"
model_size = "1.7B"
language = "en"
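As a rough illustration of how the `[voice.tts.voicebox]` table might map onto a Rust type with semantic checks, consider the sketch below. The field names come from the config example; the field types, the optionality, and both validation rules are assumptions made for illustration and are not taken from `crates/config/src/validate.rs`.

```rust
/// Sketch of the `[voice.tts.voicebox]` table. Field names follow the
/// config example; the types and optionality are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub struct VoiceboxTtsConfig {
    pub endpoint: String,
    pub profile_id: Option<String>,
    pub model_size: Option<String>,
    pub language: Option<String>,
}

impl VoiceboxTtsConfig {
    /// Hypothetical semantic validation; collects every problem found
    /// instead of stopping at the first one.
    pub fn validate(&self) -> Result<(), Vec<String>> {
        let mut errors = Vec::new();

        let is_http = self.endpoint.starts_with("http://")
            || self.endpoint.starts_with("https://");
        if !is_http {
            errors.push(format!(
                "endpoint must be an http(s) URL, got {:?}",
                self.endpoint
            ));
        }

        if matches!(&self.profile_id, Some(id) if id.trim().is_empty()) {
            errors.push("profile_id must not be blank when set".to_string());
        }

        if errors.is_empty() {
            Ok(())
        } else {
            Err(errors)
        }
    }
}
```

Returning all errors at once lets a user fix a misconfigured table in a single pass.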

Validation

Completed

  • cargo +nightly-2025-11-30 fmt --all -- --check
  • cargo clippy --workspace --all-targets (no warnings)
  • cargo clippy --workspace --all-targets --features voicebox (no warnings)
  • cargo check (default features)
  • cargo check --features voicebox
  • cargo test -p moltis-voice (124 passed)
  • cargo test -p moltis-voice --features voicebox (130 passed)
  • cargo test -p moltis-config (98 passed, schema drift guard OK)
  • cargo test -p moltis-gateway --features voicebox (488 passed)
  • taplo fmt

Remaining

  • ./scripts/local-validate.sh

Manual QA

  1. Enable feature: cargo build --features voicebox
  2. Start voicebox server: voicebox serve
  3. Set config: provider = "voicebox" in moltis.toml
  4. Test TTS synthesis through the web UI or API

Add Voicebox as an opt-in TTS provider behind a `voicebox` cargo feature
flag. Voicebox is a local Qwen3-TTS server with a FastAPI REST API for
voice cloning. Generation is two-step: POST /generate returns metadata
with a generation ID, then GET /audio/{id} fetches the WAV bytes.

Feature flag chain: cli → gateway → voice. Config is always parsed
(no feature gate on schema types) but provider implementation code
is gated with #[cfg(feature = "voicebox")].
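The gating described above could look roughly like this inside a single crate. It is a sketch under assumptions: `compiled_providers`, `PROVIDER_ID`, and the `"builtin"` placeholder are hypothetical and do not come from the PR. Only the `#[cfg(feature = "voicebox")]` attribute itself is taken from the description.

```rust
/// Compiled only when the crate is built with `--features voicebox`.
#[cfg(feature = "voicebox")]
pub mod voicebox {
    /// Hypothetical identifier for the gated provider.
    pub const PROVIDER_ID: &str = "voicebox";
}

/// Lists the provider IDs compiled into this build. "builtin" is a
/// placeholder standing in for the always-available providers.
pub fn compiled_providers() -> Vec<&'static str> {
    // `mut` is only needed when the feature is enabled.
    #[allow(unused_mut)]
    let mut providers = vec!["builtin"];

    // This statement is removed entirely when the feature is off.
    #[cfg(feature = "voicebox")]
    providers.push(voicebox::PROVIDER_ID);

    providers
}
```

With the feature disabled, neither the module nor the `push` is compiled, so a default build carries no Voicebox code while the config types can still be parsed.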

codspeed-hq bot (Contributor) commented Feb 21, 2026

Merging this PR will improve performance by 96.29%

⚡ 1 improved benchmark
✅ 38 untouched benchmarks
⏩ 5 skipped benchmarks [1]

Performance Changes

| Benchmark | BASE | HEAD | Efficiency |
|-----------|------|------|------------|
| env_substitution | 21.1 µs | 10.7 µs | +96.29% |

Comparing voicebox (0dd5a08) with main (5f2c245) [2]

Footnotes

  1. 5 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

  2. No successful run was found on main (9804455) during the generation of this report, so 5f2c245 was used instead as the comparison base. There might be some changes unrelated to this pull request in this report.

codecov bot commented Feb 21, 2026

Codecov Report

❌ Patch coverage is 84.02778% with 23 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|--------------------------|---------|-------|
| crates/voice/src/tts/voicebox.rs | 84.44% | 14 Missing ⚠️ |
| crates/gateway/src/voice.rs | 72.22% | 5 Missing ⚠️ |
| crates/voice/src/config.rs | 76.47% | 4 Missing ⚠️ |
