ChocoTTS

ChocoTTS is a WebSocket-based interpreter for the TextToTalk plugin in Dalamud, enabling lifelike text-to-speech (TTS) and emotion inference from text. It uses the 🐸Coqui AI TTS model to generate speech and j-hartmann's emotion transformer model to detect emotions in text.
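As a rough illustration of the interpreter side, the sketch below turns one raw WebSocket text frame into a speech request. It is not taken from the ChocoTTS source. The JSON field names (`Type`, `Payload`, `Speaker`) and the `"Say"` message type are assumptions made for this example and should be checked against the TextToTalk documentation; `SpeechRequest` and `parse_message` are hypothetical names.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class SpeechRequest:
    """Hypothetical container for one line of text to be spoken."""
    speaker: str
    text: str


def parse_message(raw: str) -> Optional[SpeechRequest]:
    """Parse one WebSocket text frame; return None if it should be skipped."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None

    # Assumed field names, for illustration only.
    if data.get("Type") != "Say":
        return None
    payload = data.get("Payload")
    if not isinstance(payload, str) or not payload.strip():
        return None

    speaker = str(data.get("Speaker") or "Unknown")
    return SpeechRequest(speaker=speaker, text=payload.strip())
```

Keeping the parsing separate from the network loop means it can be exercised without a running game client.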

Features

Real-time TTS generation using Coqui AI models, running entirely locally
Emotion inference using j-hartmann's emotion transformer model
Caching of generated speech for faster repeat access
Adjustable audio playback volume
Support for multiple NPCs with different voice samples
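
The caching feature can be pictured with a minimal sketch like the one below. This is not the ChocoTTS implementation. It assumes that generated audio is stored as one file per (voice, text) pair, and `cache_path`, `get_or_generate`, and the `synthesize` callback are hypothetical names standing in for whatever actually produces the audio.

```python
import hashlib
from pathlib import Path
from typing import Callable


def cache_path(cache_dir: Path, voice: str, text: str) -> Path:
    """Derive a stable file name from the voice and the text to speak."""
    key = hashlib.sha256(f"{voice}\n{text}".encode("utf-8")).hexdigest()
    return cache_dir / f"{key}.wav"


def get_or_generate(
    cache_dir: Path,
    voice: str,
    text: str,
    synthesize: Callable[[str, str], bytes],
) -> Path:
    """Return the cached audio file, generating it only on a cache miss."""
    path = cache_path(cache_dir, voice, text)
    if not path.exists():
        cache_dir.mkdir(parents=True, exist_ok=True)
        # synthesize is a placeholder for the real TTS call.
        path.write_bytes(synthesize(voice, text))
    return path
```

Including the voice in the key keeps the same line spoken by two different NPCs from sharing one audio file.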

Installation

The application is still under development. Once a stable version 1.0 is ready, an installer will be published.

Prerequisites

XIVLauncher (for Dalamud)
TextToTalk (Dalamud plugin that provides the WebSocket server ChocoTTS reads text from)
Python 3.10 or higher
ffmpeg (for audio processing)
An NVIDIA GPU is highly recommended
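
Until an installer exists, a small script along these lines can check two of the prerequisites above. It is only a sketch and not part of ChocoTTS: `check_prerequisites` is a hypothetical helper, it only looks for an `ffmpeg` executable on PATH, and it does not check for XIVLauncher, TextToTalk, or a GPU.

```python
import shutil
import sys


def check_prerequisites(min_python=(3, 10)):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if sys.version_info < min_python:
        wanted = ".".join(str(part) for part in min_python)
        found = ".".join(str(part) for part in sys.version_info[:3])
        problems.append(f"Python {wanted} or higher is required, found {found}")
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg was not found on PATH")
    return problems


if __name__ == "__main__":
    issues = check_prerequisites()
    for issue in issues:
        print(f"Missing prerequisite: {issue}")
    if not issues:
        print("Python and ffmpeg look fine.")
```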

License

This project is licensed under the GNU General Public License. See the LICENSE file for more details.

J3sven/ChocoTTS
