Releases: caimari/vtts

v0.1.0 — First release: Continuous batching for TTS

15 Mar 19:23

First public release of vTTS.

Key results on a single GPU (RTX 3060, 12 GB, unless noted otherwise):

  • 10 simultaneous voices, zero extra VRAM
  • Time-to-first-byte: 192 ms (RTX 3090 Ti) / 254 ms (RTX 3060)
  • 3.6 audio seconds per wall second with 5 users
  • Dynamic join/leave: new requests enter mid-generation
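The dynamic join/leave behaviour can be illustrated with a toy scheduler. This is a minimal sketch of the general continuous-batching idea, not vTTS's actual code: `Request`, `run_continuous_batching`, and the step counts are hypothetical names and values, and the "decode step" is a counter standing in for a batched model call.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical request: a name and the number of decode steps it needs."""
    name: str
    steps_needed: int
    steps_done: int = 0
    finished_at: int = -1  # step index at which the request completed


def run_continuous_batching(arrivals, max_batch):
    """Toy loop. `arrivals` maps a step index to the requests arriving at that step."""
    waiting = deque()
    active = []
    finished = []
    step = 0
    last_arrival = max(arrivals, default=-1)
    while active or waiting or step <= last_arrival:
        # Join: admit newly arrived requests while the batch has free slots.
        waiting.extend(arrivals.get(step, []))
        while waiting and len(active) < max_batch:
            active.append(waiting.popleft())
        # One shared decode step for every active request
        # (assumption: stands in for a single batched forward pass).
        for req in active:
            req.steps_done += 1
        # Leave: finished requests free their slot immediately.
        still_active = []
        for req in active:
            if req.steps_done >= req.steps_needed:
                req.finished_at = step
                finished.append(req)
            else:
                still_active.append(req)
        active = still_active
        step += 1
    return finished


arrivals = {
    0: [Request("a", 4), Request("b", 2)],
    2: [Request("c", 3)],  # arrives while "a" is still generating
}
done = run_continuous_batching(arrivals, max_batch=4)
print([(r.name, r.finished_at) for r in done])  # [('b', 1), ('a', 3), ('c', 4)]
```

Request "c" joins at step 2 without waiting for "a" to finish, and "b" leaves at step 1 without stalling the batch.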

Why not multiple processes? Three separate processes need 22 GB of VRAM to serve 3 users. vTTS serves 10+ users in 3.4 GB.
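As a rough per-user comparison, the arithmetic behind these figures can be worked out as follows. The only assumption is that "10+ users" is treated as exactly 10, which makes the vTTS figure a conservative upper bound per user.

```python
# Figures quoted in the release notes; "10+ users" is taken as exactly 10 (assumption).
multi_process_gb, multi_process_users = 22.0, 3
vtts_gb, vtts_users = 3.4, 10

per_user_multi = multi_process_gb / multi_process_users  # about 7.33 GB per user
per_user_vtts = vtts_gb / vtts_users                     # about 0.34 GB per user
ratio = per_user_multi / per_user_vtts                   # about 21.6x less VRAM per user

print(f"{per_user_multi:.2f} GB vs {per_user_vtts:.2f} GB per user ({ratio:.1f}x)")
```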

Includes M1 (batch epochs) and M2 (continuous batching) modes.
Supports Qwen3-TTS models (0.6B and 1.7B) with built-in speakers and voice cloning.