An AI-driven WebSocket interpreter for Dalamud's TextToTalk addon that leverages Coqui Ai TTS for lifelike audio and j-hartmann's emotion transformer model to infer emotion from text.

J3sven/ChocoTTS

ChocoTTS

ChocoTTS is a WebSocket-based interpreter for the TextToTalk plugin in Dalamud, enabling lifelike text-to-speech (TTS) and emotion inference from text. It uses the 🐸Coqui Ai TTS model to generate speech and j-hartmann's emotion transformer model to detect emotions in text.

Features

  • Real-time TTS generation using Coqui Ai models, all generated locally
  • Emotion inference using j-hartmann's emotion transformer model
  • Caching of generated speech for faster repeat access
  • Adjustable audio playback volume
  • Support for multiple NPCs with different voice samples
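
The caching feature can be pictured with a minimal sketch. This is not ChocoTTS's actual code: the function names `cache_key` and `get_or_synthesize`, the `.wav` extension, and the `synthesize` callback are all hypothetical, and the sketch assumes a line of speech is uniquely identified by its text, voice, and inferred emotion.

```python
import hashlib
from pathlib import Path
from typing import Callable


def cache_key(text: str, voice: str, emotion: str) -> str:
    """Derive a stable file name from the inputs (hypothetical scheme)."""
    # The unit-separator character keeps ("ab", "c") distinct from ("a", "bc").
    payload = "\x1f".join((voice, emotion, text)).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def get_or_synthesize(
    text: str,
    voice: str,
    emotion: str,
    cache_dir: Path,
    synthesize: Callable[[str, str, str], bytes],
) -> Path:
    """Return the cached audio file, generating it only on a cache miss."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = cache_dir / f"{cache_key(text, voice, emotion)}.wav"
    if not path.exists():
        # `synthesize` stands in for the real TTS call and returns audio bytes.
        path.write_bytes(synthesize(text, voice, emotion))
    return path
```

With a scheme like this, a repeated NPC line maps to the same file, so the expensive synthesis step runs only the first time that line is spoken.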

Installation

The application is still under development. Once a stable version 1.0 is ready, an installer will be published.

Prerequisites

  • XIVLauncher (for Dalamud)
  • TextToTalk (Dalamud plugin that provides the WebSocket server from which text is read)
  • Python 3.10 or higher
  • ffmpeg (for audio processing)
  • An NVIDIA GPU is highly recommended
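
Two of the prerequisites above can be checked from a script before setup. The sketch below is illustrative and not part of ChocoTTS: the helper name `check_prerequisites` is hypothetical, and it only verifies the Python version and that an `ffmpeg` executable is on `PATH`. It does not check for XIVLauncher, TextToTalk, or a GPU.

```python
import shutil
import sys


def check_prerequisites(min_python=(3, 10)):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if sys.version_info < min_python:
        required = ".".join(str(part) for part in min_python)
        found = ".".join(str(part) for part in sys.version_info[:3])
        problems.append(f"Python {required} or higher is required, found {found}")
    # shutil.which returns None when the executable is not on PATH.
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg was not found on PATH")
    return problems


if __name__ == "__main__":
    issues = check_prerequisites()
    for issue in issues:
        print(f"Missing prerequisite: {issue}")
    if not issues:
        print("Python and ffmpeg prerequisites look satisfied.")
```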

License

This project is licensed under the GNU General Public License. See the LICENSE file for more details.
