Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
Updated Oct 23, 2025 · Jupyter Notebook
A Python package to build AI-powered real-time audio applications
Frontier CoreML audio models in your apps — text-to-speech, speech-to-text, voice activity detection, and speaker diarization. In Swift, powered by SOTA open source.
Speaker embedding (d-vector) trained with GE2E loss
Keras implementation of "Deep Speaker: an End-to-End Neural Speaker Embedding System" (speaker recognition)
A pipeline that reads lips and generates speech for the read content, i.e., lip-to-speech synthesis.
PyTorch implementation of Densely Connected Time Delay Neural Network
Companion repository for the paper "A Comparison of Metric Learning Loss Functions for End-to-End Speaker Verification" published at SLSP 2020
A curated list of speaker-embedding, speaker-verification, and speaker-identification resources.
VoxCeleb1 i-vector based speaker recognition system
Luigi pipeline to download VoxCeleb(2) audio from YouTube and extract speaker segments
On-device speaker recognition engine powered by deep learning
Trained speaker embedding deep learning models and evaluation pipelines in PyTorch and TensorFlow for speaker recognition.
PyTorch implementation of the 1D-Triplet-CNN neural network model described in "Fusing MFCC and LPC Features using 1D Triplet CNN for Speaker Recognition in Severely Degraded Audio Signals" by A. Chowdhury and A. Ross.
Speaker embedding for VI-SVC and VI-SVS, also for VITS; use this to replace the ID to implement voice cloning.
DropClass and DropAdapt - repository for the paper accepted to Speaker Odyssey 2020
Awesome Speech Dataset, including download links and a brief explanation for each resource. These datasets provide diverse and high-quality speech data covering various domains such as conversational, academic, political, and more.
Official implementation of the ICASSP 2024 paper: Emphasized Non-Target Speaker Knowledge in Knowledge Distillation for Speaker Verification
A curated list of awesome speaker recognition/verification papers, projects, datasets, and competitions.
Create speaker voiceprints from a few seconds of audio, and identify individuals in real-time streaming or recorded conversations.