High-performance Text-to-Speech server with OpenAI-compatible API, 8 voices, emotion tags, and modern web UI. Optimized for RTX GPUs.
A comprehensive ComfyUI integration for Microsoft's VibeVoice text-to-speech model, enabling high-quality single and multi-speaker voice synthesis directly within your ComfyUI workflows.
🎙️ Speak with AI - Run locally using Ollama, OpenAI, Anthropic or xAI - Speech uses XTTS, OpenAI, ElevenLabs or Kokoro
ComfyUI custom node for the VibeVoice TTS. Expressive, long-form, multi-speaker conversational audio
Eleven Labs text to speech package for NodeJS. You can use the official package at: https://www.npmjs.com/package/elevenlabs
🦆💰 A bot that uses Uberduck (and FakeYou) AI to make bit donations have an AI voice.
Voice conversion based on VITS. Focused on simplicity, quality, and performance.
Voice AI agent for reactivating cold leads through personalized calls, assessing their interest with AI agents, and syncing insights directly to your CRMs.
A framework for AI WhatsApp calls using Whisper, Coqui TTS, GPT-3.5 Turbo, Virtual Audio Cable, and the WhatsApp Desktop App.
Beautiful voice app: record or upload to train a voice, generate speech from text or files, save & download voices.
AI Voice Translator is a Python app that transcribes spoken English, translates it into six languages, and generates voice translations using AssemblyAI and ElevenLabs. It features an interactive Gradio UI for real-time translation.
AI Voice Agents: Exploring the Next Generation of Human-Machine Interaction! 🎙️🤖🎧
IDVoice + ChatGPT iOS demo app
The official Python API for Revocalize AI voice synthesizer platform.
A voice-based AI chat interface built with Next.js and ElevenLabs. Start and stop real-time conversations with an animated UI that reflects agent status. Fully responsive and deployable via Vercel with environment-based agent configuration.
A GUI voice assistant in Python that uses GPT or Llama 🧶
Says everything you type in Discord for you using AI (Silero Models)
Run XTTS within Docker/Podman for voice fine-tuning in Gradio's Web UI
Video summarization and audio synchronization tool for resume videos (long MP4 to short MP4, or MP4 to text)