Neural network-based TTS engine.
If you just want to play around with the TTS, it also works stand-alone:
python3 glados-tts/glados.py
The TTS engine can also run remotely on a machine more powerful than the Pi, so that TTS is still processed in-house (run from the glados-tts directory):
python3 engine-remote.py
The default port is 8124. Be sure to update the TTS_ENGINE_API variable in settings.env in your main Glados-voice-assistant directory:
TTS_ENGINE_API = http://192.168.1.3:8124/synthesize/
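As a rough sketch of how a client might talk to the remote engine: the trailing slash in the URL above suggests that the text to speak is appended as the last path segment, but that is an assumption, not something this README states. The helper names `build_synthesize_url` and `fetch_speech`, the `output.wav` filename, and the idea that the response body is raw audio are all hypothetical; check engine-remote.py for the actual request format.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Value of TTS_ENGINE_API from settings.env (the example address shown above).
TTS_ENGINE_API = "http://192.168.1.3:8124/synthesize/"


def build_synthesize_url(text: str, base: str = TTS_ENGINE_API) -> str:
    """Append the percent-encoded text to the base URL.

    Assumption: the server reads the text from the final URL path segment.
    """
    if not base.endswith("/"):
        base += "/"
    # safe="" also encodes "/" so the text stays a single path segment.
    return base + quote(text, safe="")


def fetch_speech(text: str, out_path: str = "output.wav", timeout: float = 30.0) -> str:
    """Hypothetical helper: request audio and save the response body to disk.

    Assumption: the response body is the audio file itself.
    """
    with urlopen(build_synthesize_url(text), timeout=timeout) as response:
        audio = response.read()
    with open(out_path, "wb") as f:
        f.write(audio)
    return out_path


if __name__ == "__main__":
    # Only prints the URL; call fetch_speech() once the remote engine is running.
    print(build_synthesize_url("Hello, test subject."))
```

Percent-encoding the text keeps spaces and punctuation from breaking the URL; if the server expects a query parameter or a POST body instead, only `build_synthesize_url` would need to change.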
The Tacotron and ForwardTacotron models were trained as multispeaker models on two datasets separated into three speakers: first on LJSpeech (13,100 lines), and then on a heavily modified version of the Ellen McClain dataset, split into Portal 1 and Portal 2 voices (with punctuation and corrections added manually). The lines from the end of Portal 1, after the cores get knocked off, were counted as Portal 2 lines.
The initial, regular Tacotron model was trained first on LJSpeech, and then on a heavily modified version of the Ellen McClain dataset (all non-Portal 2 voice lines removed, punctuation added).
- The ForwardTacotron model was trained on only about 600 voice lines.
- The HiFiGAN model was generated through transfer learning from the sample.
- All models have been optimized and quantized.
If you want to install the TTS Engine on your machine, please follow the steps below.
- Download the model files from Google Drive and unzip them into the repo folder.
- Install the required Python packages, e.g., by running:
pip install -r requirements.txt