TTS Generation Web UI (Bark, MusicGen + AudioGen, Tortoise, RVC, Vocos, Demucs, SeamlessM4T, MAGNet, StyleTTS2, MMS)
Updated Sep 20, 2024 (TypeScript)
ONNX-compatible Fast SeamlessM4T—Massively Multilingual & Multimodal Machine Translation
(Windows/Linux) Local WebUI for neural network models (text, image, video, 3D, audio), written in Python with a Gradio interface. Translated into 3 languages.
SeamlessM4t-Translator: uses Facebook's SeamlessM4T model in the backend to serve S2ST, S2TT, T2ST, and T2TT translation queries (speech-to-speech, speech-to-text, text-to-speech, and text-to-text).
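The four task codes in the description above follow a simple pattern: source modality, the digit 2, target modality, and a trailing T for "translation". As a rough illustration only, the sketch below expands a code into its input and output modalities. The helper name `describe_task` and the `MODALITIES` table are hypothetical and do not come from SeamlessM4t-Translator or from the SeamlessM4T library.

```python
# Hypothetical helper for illustration; not part of any listed repository.
# Expands task codes such as "S2TT" into their source and target modalities.
MODALITIES = {"S": "speech", "T": "text"}


def describe_task(code: str) -> dict:
    """Split a code like 'S2ST' into input and output modality names."""
    source, sep, target = code.upper().partition("2")
    valid = (
        sep == "2"
        and source in MODALITIES
        and len(target) == 2
        and target[0] in MODALITIES
        and target[1] == "T"  # the trailing T stands for "translation"
    )
    if not valid:
        raise ValueError(f"unrecognized task code: {code!r}")
    return {"input": MODALITIES[source], "output": MODALITIES[target[0]]}


if __name__ == "__main__":
    for code in ("S2ST", "S2TT", "T2ST", "T2TT"):
        task = describe_task(code)
        print(f"{code}: {task['input']} -> {task['output']} translation")
```

Codes outside this pattern, such as ASR, are rejected with a `ValueError` rather than guessed at.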
EchoSight is a cross-platform tool that helps visually impaired individuals by audibly describing images captured with a Raspberry Pi Camera or supplied as an image path or URL.
Automatic speech recognition (ASR)
How I used SeamlessM4T large to reach the top 5 of the Mozilla Common Voice competition hosted on Zindi
Just run it as is. Note: after installing the package, remember to restart the kernel.
Translation from one language to another without an intermediate speech step