Local Qwen3 VoiceDesign TTS service built with FastAPI.

- Working end-to-end via curl
- Working end-to-end from a separate Swift CLI client
- Current output format: `audio/wav` (PCM16 mono, 24 kHz)
Run the service:

```sh
uv run python main.py
```

In another terminal:

```sh
mkdir -p outputs
curl -X POST http://127.0.0.1:8000/model/load \
  -H "Content-Type: application/json" \
  -d '{"mode":"voice_design","strict_load":false}'
curl -X POST http://127.0.0.1:8000/synthesize/voice-design \
  -H "Content-Type: application/json" \
  -d '{"text":"Hello from TalkToMePy demo.","instruct":"Warm and clear narrator voice.","language":"English","format":"wav"}' \
  --output outputs/demo.wav
afplay outputs/demo.wav
```

Requirements:

- Python >= 3.13
- `uv`
- `sox` on PATH (macOS: `brew install sox`)
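The same request can be issued from Python's standard library instead of curl. A minimal sketch, assuming the documented request body; `voice_design_payload` and `synthesize` are illustrative names, not part of this repo:

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8000"  # default service URL

def voice_design_payload(text, instruct, language="English", fmt="wav"):
    """Build the JSON body documented for POST /synthesize/voice-design."""
    return {"text": text, "instruct": instruct, "language": language, "format": fmt}

def synthesize(text, instruct, out_path="outputs/demo.wav"):
    """POST the payload and write the returned audio/wav bytes to disk."""
    body = json.dumps(voice_design_payload(text, instruct)).encode("utf-8")
    req = urllib.request.Request(
        f"{BASE_URL}/synthesize/voice-design",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        audio = resp.read()
    with open(out_path, "wb") as f:
        f.write(audio)
    return out_path
```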
```sh
./scripts/setup.sh
```

This script:

- checks for `uv` and `sox`
- runs `uv sync`
- creates `outputs/`
- creates `.env.launchd` from `.env.example` if missing
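The checks the setup script performs can be sketched in Python, assuming they reduce to tool lookup plus directory creation (`preflight` is a hypothetical helper, not the actual script):

```python
import shutil
from pathlib import Path

def preflight(tools=("uv", "sox"), dirs=("outputs",)):
    """Report missing CLI tools and create any missing directories,
    mirroring what scripts/setup.sh is described as doing (sketch)."""
    missing = [t for t in tools if shutil.which(t) is None]
    for d in dirs:
        Path(d).mkdir(parents=True, exist_ok=True)
    return missing
```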
Single-instance defaults:

```sh
cp .env.example .env.launchd
```

Instance-specific launchd env files are also supported:

- `.env.launchd.stable`
- `.env.launchd.dev`

`scripts/run_service.sh` accepts an env file path argument, and `scripts/launchd_instance.sh` wires this automatically.
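If tooling needs to read these env files from Python, a minimal parser sketch, assuming plain `KEY=VALUE` lines with `#` comments (the actual quoting rules may differ):

```python
def parse_env_file(text):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments.
    Assumes the plain .env format; quoting/escaping rules may differ."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        key, sep, value = line.partition("=")
        if sep:  # ignore lines without an '='
            env[key.strip()] = value.strip().strip('"')
    return env
```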
```sh
uv run python main.py
```

Service URL: http://127.0.0.1:8000

FastAPI exposes live docs/spec automatically:

- OpenAPI JSON: http://127.0.0.1:8000/openapi.json
- Swagger UI: http://127.0.0.1:8000/docs
- ReDoc: http://127.0.0.1:8000/redoc
This repo includes separate target and generated YAML specs:

- Target spec (do not overwrite): `openapi/openapi.yaml`
- Backup copy of target spec: `openapi/openapi.target.yaml`
- Generated export from the app's OpenAPI schema: `openapi/openapi.generated.yaml`
Regenerate the generated spec after API changes:

```sh
uv run python scripts/export_openapi.py
```

Check parity between target and generated specs:

```sh
diff -u openapi/openapi.yaml openapi/openapi.generated.yaml
```

Run the parity gate test:

```sh
uv run python scripts/export_openapi.py
uv run pytest -q tests/test_openapi_parity.py
```

GitHub Actions runs two lanes:
- `pytest` (required fast lane): lockfile sync + OpenAPI export + full pytest suite.
- `smoke-e2e` (separate model-backed lane on `main` push, nightly schedule, and manual dispatch):
  - direct model smoke: `scripts/voice_design_smoke.py`
  - API e2e custom voice: `scripts/custom_voice_smoke.py`
  - API e2e voice clone: `scripts/voice_clone_smoke.py`
`smoke-e2e` requires repository secret `HF_TOKEN` and uploads smoke artifacts from `outputs/`.
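The parity gate can be sketched with stdlib `difflib`, mirroring the `diff -u` check above (`spec_drift` is an illustrative name; the real test lives in `tests/test_openapi_parity.py`):

```python
import difflib

def spec_drift(target_text, generated_text):
    """Return unified-diff lines between the target and generated specs.
    An empty result means the parity gate would pass."""
    return list(difflib.unified_diff(
        target_text.splitlines(keepends=True),
        generated_text.splitlines(keepends=True),
        fromfile="openapi/openapi.yaml",
        tofile="openapi/openapi.generated.yaml",
    ))
```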
This repo includes:

- LaunchAgent template: `launchd/com.talktomepy.plist`
- Runner script: `scripts/run_service.sh`
- Instance manager: `scripts/launchd_instance.sh`
Install and start a single instance:

```sh
./scripts/launchd_instance.sh install --instance stable --port 8000
```

Manage it:

```sh
./scripts/launchd_instance.sh status --instance stable
./scripts/launchd_instance.sh logs --instance stable
./scripts/launchd_instance.sh restart --instance stable
./scripts/launchd_instance.sh stop --instance stable
./scripts/launchd_instance.sh remove --instance stable
```

Install from each clone with a different instance name and port.

Stable clone (`~/Workspace/services/talkToMePy`):

```sh
cd ~/Workspace/services/talkToMePy
./scripts/launchd_instance.sh install --instance stable --port 8000
```

Dev clone (`~/Workspace/talkToMePy`):

```sh
cd ~/Workspace/talkToMePy
./scripts/launchd_instance.sh install --instance dev --port 8001
```

Health check both:

```sh
curl http://127.0.0.1:8000/health
curl http://127.0.0.1:8001/health
```

Notes for modern macOS (including macOS 26):
- Prefer `bootstrap`/`bootout`/`kickstart` over legacy `load`/`unload`.
- `launchd` has a minimal environment; keep required env vars in `scripts/run_service.sh` and per-instance env files.
- `scripts/run_service.sh` sets a Homebrew-friendly default `PATH` so `sox` is resolvable under launchd.
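The Homebrew-friendly `PATH` defaulting can be sketched like this. This is an assumption about what `scripts/run_service.sh` does, not a copy of it:

```python
import os

# Homebrew bin dirs for Apple Silicon and Intel Macs
HOMEBREW_BINS = ("/opt/homebrew/bin", "/usr/local/bin")

def homebrew_friendly_path(path=""):
    """Prepend Homebrew bin dirs so tools like sox resolve even under
    launchd's minimal environment (sketch of run_service.sh behavior)."""
    parts = [p for p in path.split(os.pathsep) if p]
    for b in reversed(HOMEBREW_BINS):
        if b not in parts:
            parts.insert(0, b)
    return os.pathsep.join(parts)
```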
- `GET /health` returns service status
- `GET /version` returns API/service version metadata
- `GET /adapters` lists available runtime adapters
- `GET /adapters/{adapter_id}/status` returns adapter-specific status
- `GET /model/status` returns mode-aware model runtime readiness/status
- `GET /model/inventory` returns supported model inventory and local availability
- `POST /model/load` accepts a mode-aware load request and lazily loads the selected model
- `POST /model/unload` unloads the model from memory
- `GET /custom-voice/speakers` returns supported custom-voice speakers for the selected model
- `POST /synthesize/voice-design` returns generated audio bytes as `audio/wav`
- `POST /synthesize/custom-voice` returns generated audio bytes as `audio/wav`
- `POST /synthesize/voice-clone` returns generated audio bytes as `audio/wav`
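For client code, the route list above can be captured as a small table plus a URL helper. `ROUTES` and `url_for` are illustrative, not part of the service:

```python
# (method, path template) per documented endpoint
ROUTES = {
    "health": ("GET", "/health"),
    "version": ("GET", "/version"),
    "adapters": ("GET", "/adapters"),
    "adapter_status": ("GET", "/adapters/{adapter_id}/status"),
    "model_status": ("GET", "/model/status"),
    "model_inventory": ("GET", "/model/inventory"),
    "model_load": ("POST", "/model/load"),
    "model_unload": ("POST", "/model/unload"),
    "custom_voice_speakers": ("GET", "/custom-voice/speakers"),
    "voice_design": ("POST", "/synthesize/voice-design"),
    "custom_voice": ("POST", "/synthesize/custom-voice"),
    "voice_clone": ("POST", "/synthesize/voice-clone"),
}

def url_for(name, base="http://127.0.0.1:8000", **params):
    """Resolve a named route to (method, full URL), filling path params."""
    method, template = ROUTES[name]
    return method, base + template.format(**params)
```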
Notes:

- `POST /model/load` may return `202 Accepted` while loading is in progress.
- Synth routes return `503` with `Retry-After` if the model is still loading.
- Legacy `POST /synthesize` and `POST /synthesize/stream` were removed in v0.5.0.
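A client can honor the `202`/`503` + `Retry-After` contract with a small polling loop. A sketch with an injectable `check` callable (returning a status code and optional retry-after seconds) so it can be exercised without a live server:

```python
import time

def wait_for_ready(check, timeout_s=120.0, default_delay_s=2.0, sleep=time.sleep):
    """Poll check() -> (status_code, retry_after_or_None) until the service
    stops answering 202/503, honoring Retry-After between attempts."""
    deadline = time.monotonic() + timeout_s
    while True:
        status, retry_after = check()
        if status not in (202, 503):
            return status  # loaded (or a real error the caller handles)
        if time.monotonic() >= deadline:
            raise TimeoutError("model did not finish loading in time")
        sleep(retry_after if retry_after is not None else default_delay_s)
```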
```sh
curl http://127.0.0.1:8000/health
curl http://127.0.0.1:8000/version
curl http://127.0.0.1:8000/adapters
curl http://127.0.0.1:8000/adapters/qwen3-tts/status
curl http://127.0.0.1:8000/model/status
curl http://127.0.0.1:8000/model/inventory
```

```sh
curl -X POST http://127.0.0.1:8000/model/load \
  -H "Content-Type: application/json" \
  -d '{"mode":"voice_design","strict_load":false}'
curl -X POST http://127.0.0.1:8000/model/unload
```

```sh
curl -X POST http://127.0.0.1:8000/synthesize/voice-design \
  -H "Content-Type: application/json" \
  -d '{"text":"Hello from Swift bridge!","instruct":"Warm and friendly voice with steady pace.","language":"English","format":"wav"}' \
  --output outputs/from_service.wav
curl -X POST http://127.0.0.1:8000/synthesize/custom-voice \
  -H "Content-Type: application/json" \
  -d '{"text":"Custom voice endpoint test.","speaker":"ryan","language":"English","format":"wav"}' \
  --output outputs/from_custom.wav
curl -X POST http://127.0.0.1:8000/synthesize/voice-clone \
  -H "Content-Type: application/json" \
  -d '{"text":"Voice clone endpoint test.","reference_audio_b64":"UklGRg==","language":"English","format":"wav"}' \
  --output outputs/from_clone.wav
```

Play the generated file on macOS:

```sh
afplay outputs/from_service.wav
```

Run the direct model smoke script:

```sh
uv run python scripts/voice_design_smoke.py \
  --text "Hello from my Swift CLI bridge." \
  --instruct "Energetic, friendly, and slightly brisk pacing with bright tone." \
  --output outputs/swift_bridge_demo.wav
```

Notes:

- `qwen-tts` currently requires `transformers==4.57.3` (pinned in this repo).
- All synth endpoints currently support `format: "wav"` only.
- Model id can be overridden with env var `QWEN_TTS_MODEL_ID`.
- Optional idle auto-unload can be enabled with env var `QWEN_TTS_IDLE_UNLOAD_SECONDS`.
- Optional startup warm-load can be enabled with env var `QWEN_TTS_WARM_LOAD_ON_START=true`.
- Optional load settings: `QWEN_TTS_DEVICE_MAP`, `QWEN_TTS_TORCH_DTYPE`.
- When `QWEN_TTS_DEVICE_MAP` is unset or `auto`, a synthesis meta-tensor runtime failure now triggers one automatic reload/retry on CPU (`device_map=cpu`, `torch_dtype=float32`).
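Since every synth endpoint returns PCM16 mono 24 kHz WAV, downloaded bytes can be sanity-checked with the stdlib `wave` module. An illustrative helper, not part of this repo:

```python
import io
import wave

def check_wav(data, rate=24000, channels=1, sample_width=2):
    """Verify WAV bytes match the documented PCM16 mono 24 kHz output.
    sample_width=2 bytes corresponds to 16-bit PCM."""
    with wave.open(io.BytesIO(data), "rb") as w:
        return (
            w.getframerate() == rate
            and w.getnchannels() == channels
            and w.getsampwidth() == sample_width
        )
```

Usage: `check_wav(open("outputs/demo.wav", "rb").read())` should return `True` for service output under the documented format.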
Roadmap and TODO tracking live in ROADMAP.md.