asr
Speech-to-text, text-to-speech, speaker diarization, and VAD using next-gen Kaldi with onnxruntime, without an Internet connection. Supports embedded systems, Android, iOS, HarmonyOS, Raspberry Pi, RISC…
Whisper real-time streaming for long-form speech-to-text transcription and translation
Robust Speech Recognition via Large-Scale Weak Supervision
Port of OpenAI's Whisper model in C/C++
Open source real-time translation app for Android that runs locally
Efficient Inference of Transformer models
Foundational Models for State-of-the-Art Speech and Text Translation
kaldi-asr/kaldi is the official location of the Kaldi project.
Multilingual large voice generation model, providing full-stack inference, training, and deployment capabilities.
Multilingual Voice Understanding Model