```
 ██████╗ ███████╗███╗   ██╗██╗███████╗
██╔════╝ ██╔════╝████╗  ██║██║██╔════╝
██║  ███╗█████╗  ██╔██╗ ██║██║█████╗
██║   ██║██╔══╝  ██║╚██╗██║██║██╔══╝
╚██████╔╝███████╗██║ ╚████║██║███████╗
 ╚═════╝ ╚══════╝╚═╝  ╚═══╝╚═╝╚══════╝
```
🔮 GENIE: GPT-SoVITS Lightweight Inference Engine
Experience near-instantaneous speech synthesis on your CPU
GENIE is a lightweight inference engine built on the open-source TTS project GPT-SoVITS. It integrates TTS inference, ONNX model conversion, an API server, and other core features, aiming to deliver maximum performance and convenience.
- ✅ Supported Model Version: GPT-SoVITS V2
- ✅ Supported Language: Japanese
- ✅ Supported Python Version: >= 3.9
GENIE optimizes the original model for outstanding CPU performance.
| Feature | 🔮 GENIE | Official PyTorch Model | Official ONNX Model |
| --- | --- | --- | --- |
| First Inference Latency | 1.13s | 1.35s | 3.57s |
| Runtime Size | ~200MB | Several GB | Similar to GENIE |
| Model Size | ~230MB | Similar to GENIE | ~750MB |
📝 Note: Since GPU inference does not significantly improve first-packet latency over CPU, we currently provide only a CPU version to ensure the best out-of-the-box experience.
📝 Latency Test Info: All latency figures are averages over a test set of 100 Japanese sentences (~20 characters each), measured on an Intel Core i7-13620H CPU.
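For reference, such a measurement can be approximated with a simple timing loop over the public API. This is a minimal sketch, assuming the predefined character and a placeholder sentence list; it measures end-to-end synthesis time rather than strict first-packet latency:

```python
import time
import genie_tts as genie

# Any loaded character works; the predefined one is used here for convenience
genie.load_predefined_character('misono_mika')

sentences = ['どうしようかな……やっぱりやりたいかも……!'] * 100  # substitute your own test set
latencies = []
for text in sentences:
    start = time.perf_counter()
    genie.tts(
        character_name='misono_mika',
        text=text,
        save_path='out.wav',  # write to file instead of playing back
    )
    latencies.append(time.perf_counter() - start)

print(f'Average synthesis time: {sum(latencies) / len(latencies):.2f}s')
```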
⚠️ Important: It is recommended to run GENIE in Administrator mode to avoid potential performance degradation.
Install via pip:

```bash
pip install genie-tts
```
📝 You may encounter an installation failure when installing pyopenjtalk. This is because pyopenjtalk includes C extensions, and its publisher does not currently provide pre-compiled binary packages (wheels). For Windows users, this means installing Visual Studio Build Tools; specifically, you must select the "Desktop development with C++" workload during installation.
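Once installation succeeds, a quick import check confirms that pyopenjtalk built correctly (`g2p` is pyopenjtalk's standard grapheme-to-phoneme function):

```python
# Verify that pyopenjtalk compiled and imports correctly
import pyopenjtalk

# Should print a phoneme sequence such as 'k o N n i ch i w a'
print(pyopenjtalk.g2p('こんにちは'))
```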
No GPT-SoVITS model yet? No problem! GENIE includes predefined speaker characters for immediate use without any model files. Run the code below to hear it in action:
```python
import genie_tts as genie
import time

# Automatically downloads required files on first run
genie.load_predefined_character('misono_mika')

genie.tts(
    character_name='misono_mika',
    text='どうしようかな……やっぱりやりたいかも……!',
    play=True,  # Play the generated audio directly
)

time.sleep(10)  # Add delay to ensure audio playback completes
```
A simple TTS inference example:
```python
import genie_tts as genie

# Step 1: Load character voice model
genie.load_character(
    character_name='<CHARACTER_NAME>',  # Replace with your character name
    onnx_model_dir=r"<PATH_TO_CHARACTER_ONNX_MODEL_DIR>",  # Folder containing ONNX model
)

# Step 2: Set reference audio (for emotion and intonation cloning)
genie.set_reference_audio(
    character_name='<CHARACTER_NAME>',  # Must match loaded character name
    audio_path=r"<PATH_TO_REFERENCE_AUDIO>",  # Path to reference audio
    audio_text="<REFERENCE_AUDIO_TEXT>",  # Corresponding text
)

# Step 3: Run TTS inference and generate audio
genie.tts(
    character_name='<CHARACTER_NAME>',  # Must match loaded character
    text="<TEXT_TO_SYNTHESIZE>",  # Text to synthesize
    play=True,  # Play audio directly
    save_path="<OUTPUT_AUDIO_PATH>",  # Output audio file path
)

print("🎉 Audio generation complete!")
```
To convert original GPT-SoVITS models for GENIE, first ensure torch is installed:

```bash
pip install torch
```
Use the built-in conversion tool:

📝 Tip: `convert_to_onnx` currently supports only V2 models.
```python
import genie_tts as genie

genie.convert_to_onnx(
    torch_pth_path=r"<YOUR .PTH MODEL FILE>",  # Replace with your .pth file
    torch_ckpt_path=r"<YOUR .CKPT CHECKPOINT FILE>",  # Replace with your .ckpt file
    output_dir=r"<ONNX MODEL OUTPUT DIRECTORY>",  # Directory to save ONNX model
)
```
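Once conversion finishes, the output directory can be loaded like any other character model. A brief sketch (the character name below is a placeholder):

```python
# Load the freshly converted model (placeholder name; same directory as output_dir above)
genie.load_character(
    character_name='my_character',
    onnx_model_dir=r"<ONNX MODEL OUTPUT DIRECTORY>",
)
```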
GENIE includes a lightweight FastAPI server:
```python
import genie_tts as genie

# Start server
genie.start_server(
    host="0.0.0.0",  # Host address
    port=8000,  # Port
    workers=1,  # Number of workers
)
```
For request formats and API details, see our API Server Tutorial.
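As a rough illustration of how a client might call the server, here is a sketch using requests; the endpoint path and JSON fields below are hypothetical placeholders, so consult the API Server Tutorial for the actual request schema:

```python
import requests

# Hypothetical endpoint and fields; see the API Server Tutorial
# for the real request schema
response = requests.post(
    'http://127.0.0.1:8000/tts',  # assumed endpoint path
    json={
        'character_name': '<CHARACTER_NAME>',
        'text': '<TEXT_TO_SYNTHESIZE>',
    },
)
with open('output.wav', 'wb') as f:
    f.write(response.content)  # assumes the server returns raw audio bytes
```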
GENIE provides a simple command-line client for quick testing and interactive use:
```python
import genie_tts as genie

# Launch CLI client
genie.launch_command_line_client()
```
- 🌐 Language Expansion
  - Add support for Chinese and English.
- 🚀 Model Compatibility
  - Support for `V2ProPlus`, `V3`, `V4`, and more.
- 📦 Easy Deployment
  - Release Docker images.
  - Provide out-of-the-box Windows / Linux bundles.