- Make a custom Piper TTS model out of your own voice samples or any existing voice dataset
- Learn how to convert a public domain dataset into another voice using an RVC model
- Learn how to make custom datasets from audio clips and text transcripts
- Use the dataset recorder to make fun TTS clones of your family and friends
- Listen to your voice as training progresses in a convenient training environment
- Rapidly train custom TTS voices by finetuning pretrained checkpoint files
- Now runs Piper in a docker container for much more convenient installation
- Includes original resources for creating custom pronunciation rules.
- Includes original guides for using custom Piper voices with Home Assistant
- 100% free, runs 100% offline.
- https://www.tomshardware.com/raspberry-pi/add-any-voice-to-your-raspberry-pi-project-with-textymcspeechy
- https://www.hackster.io/news/erik-bjorgan-makes-voice-cloning-easy-with-the-applio-and-piper-based-textymcspeechy-e9bcef4246fb
February 24, 2025 - Fixed docs for manually editing voices to comply with Home Assistant's requirements
- My previous documentation of this process produced voices that worked in user scripts within Home Assistant, but I discovered that they would crash when used to create entities in `Settings > Voice Assistants` if fields set in the `.onnx.json` file differed even slightly from what was expected. I have updated the docs to correct this issue.
- This should not impact voices trained with the latest version of TextyMcSpeechy.
- Voice models are now exported with filenames that comply with Piper's naming convention (e.g. `en_US-bob_1234-medium.onnx`).
- `.onnx.json` files now have fields set correctly when exported.
- These changes should make all models exported to `tts_dojo/tts_voices` usable in Home Assistant without modifications.
- Fixed issues with menus when resuming sessions that were initially trained from scratch
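For reference, the fields involved live in the `.onnx.json` sidecar that accompanies each Piper voice model. The excerpt below is illustrative only — the key names and values are assumptions based on typical Piper voice configs, not output taken from this project:

```json
{
  "audio": { "sample_rate": 22050 },
  "espeak": { "voice": "en-us" },
  "language": { "code": "en_US" },
  "dataset": "bob_1234"
}
```

If any of these fields disagree with what Home Assistant expects for the voice's filename and language, entity creation can fail, which is why the exporter now sets them automatically.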
- Training models from scratch (i.e. without using pretrained checkpoint files) is now an option provided by `run_training.sh`.
- `create_datasets.sh` now stores the `espeak-ng` language identifier in `dataset.conf` so that there is no need to manually set a language during preprocessing.
- The language code needed to build filenames that comply with Piper's naming convention is also stored in `dataset.conf`.
- Datasets created with earlier versions of TextyMcSpeechy will need to be updated by running `create_datasets.sh <dataset_folder>`.
- `DATASETS/espeak_language_identifiers.txt` provides clear directions about which language codes to use when setting up a dataset.
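As a sketch, the values stored in `dataset.conf` might look like the excerpt below. The variable names here are illustrative assumptions, not the actual keys written by `create_datasets.sh`:

```shell
# Illustrative dataset.conf excerpt -- key names are hypothetical
ESPEAK_LANGUAGE="en-us"      # espeak-ng language identifier used during preprocessing
PIPER_LANGUAGE_CODE="en_US"  # code used to build Piper-compliant filenames
```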
- This brand new branch runs Piper in a docker container, which makes installation far, far, far less painful.
- The scripts and docs in this branch have all been overhauled.
- The branch formerly known as `main` is now the `non-containerized` branch. It will be kept around for reference purposes but will not be maintained.
- The layout of the tmux training environment can now be saved by selecting the control console and pressing `t`. This layout will be applied automatically on subsequent runs.
- Custom pronunciation rules can now be defined in `tts_dojo/ESPEAK_RULES`. These can be applied automatically whenever the `textymcspeechy-piper` container launches via `ESPEAK_RULES/automated_espeak_rules.sh`.
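To illustrate the kind of rule this enables: `espeak-ng` can pick up extra dictionary entries from a `<language>_extra` file in its `dictsource` directory, one word/phoneme pair per line. The entry below is purely illustrative (including the phoneme spelling) — the exact workflow that `ESPEAK_RULES/automated_espeak_rules.sh` automates may differ:

```
# en_extra -- extra pronunciation entries for espeak-ng (illustrative)
# format: word <whitespace> phonemes (Kirshenbaum notation)
sql	'sikw@l
```

After editing such a file, the dictionary for that language is recompiled by running `espeak-ng --compile=en` from the `dictsource` directory.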
Read the quick start guide to learn how to build datasets and train models.
- Customizing pronunciation
- Using custom voices in Home Assistant
- Rendering custom voices for Home Assistant on a networked device with a GPU
- An NVIDIA GPU with drivers capable of running CUDA is required. Training on CPU, while technically possible, is not officially supported.
- A hard drive with sufficient storage capacity for the base installation (~15 GB) and the checkpoint files generated during training. 50 GB of free space is suggested as a practical minimum.
- This project is written entirely in shell script and is primarily intended for Linux users. Windows users will need to use WSL to run it.
- Check for a currently installed NVIDIA driver by running `nvidia-smi`. If something like the image below shows up, you may be able to skip to step 3.
- If NVIDIA drivers are not installed on your system, I recommend installing them using whatever "official" method exists for the distribution you are using. That's all the advice I can give you - in the past I have known the pain of spending hours repairing my OS after installing a driver I shouldn't have. If you survive this step, continue to step 3.
- Check whether Docker is installed on your system by running `docker --version`. If it is installed, skip to step 5.
- You can install Docker using the instructions here: https://docs.docker.com/engine/install/
- You will need the NVIDIA Container Toolkit to enable GPU access within Docker containers. https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html
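The driver and Docker checks above can be combined into a small pre-flight script. This is a convenience sketch, not part of TextyMcSpeechy:

```shell
#!/bin/sh
# Pre-flight check: report which prerequisites are present before installing.

check_cmd() {
    # Prints "present" or "missing" depending on whether a command exists.
    if command -v "$1" >/dev/null 2>&1; then echo "present"; else echo "missing"; fi
}

echo "nvidia-smi: $(check_cmd nvidia-smi)"
echo "docker:     $(check_cmd docker)"

# Once the NVIDIA Container Toolkit is installed, GPU access inside
# containers can be verified with a CUDA base image, e.g.:
#   docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi
```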
- Clone this repo:
git clone https://github.com/domesticatedviking/TextyMcSpeechy
- To install packages, make scripts executable, choose the type of container you wish to run, and verify that the needed tools are installed, run this from the `TextyMcSpeechy` directory:
sudo bash setup.sh
- Setup is complete. If you chose to use the prebuilt container from Docker Hub, it will be downloaded automatically the first time you use the `run_container.sh` script or start to train a model. Take note that it is a 6 GB download and over 10 GB when decompressed.
- Continue with the quick start guide to begin training models.
- The prebuilt Docker image will install automatically, so you don't need to download it. But if you want to anyway, run this:
docker image pull domesticatedviking/textymcspeechy-piper:latest
- To build your own image from the `Dockerfile` and `docker-compose.yml` in the main `TextyMcSpeechy` directory, change to that directory and run:
docker compose build
- Scripts are provided for launching the `textymcspeechy-piper` image, whether it is prebuilt or locally built:
  - `local_container_run.sh` launches images you have built yourself with the `Dockerfile` and `docker-compose.yml`.
  - `prebuilt_container_run.sh` launches a prebuilt image.
  - `run_container.sh` functions as an alias for one of the two scripts above. It is called by `run_training.sh` to bring the container up automatically when training starts.
  - `stop_container.sh` will shut down the `textymcspeechy-piper` container if it is running.
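The relationship between these launcher scripts can be pictured as a tiny dispatcher. This is a hypothetical sketch of the pattern only; the real `run_container.sh` may read its setting from a config file and differ in detail:

```shell
#!/bin/sh
# Hypothetical sketch: dispatch to the launcher selected during setup.

select_launcher() {
    # $1: "prebuilt" or "local" (illustrative mode names)
    case "$1" in
        prebuilt) echo "prebuilt_container_run.sh" ;;
        local)    echo "local_container_run.sh" ;;
        *)        echo "unknown" ;;
    esac
}

echo "would exec: $(select_launcher prebuilt)"
```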
- Custom `espeak-ng` pronunciation rules can be defined in `tts_dojo/ESPEAK_RULES`. A guide for customizing pronunciation can be found here.