Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit
Updated Dec 26, 2024 - Python
Kaldi-based open-source project for Korean automatic speech recognition (ASR)
Time delay neural network (TDNN) implementation in PyTorch using the unfold method
PyTorch implementation of the Factorized TDNN (TDNN-F) from "Semi-Orthogonal Low-Rank Matrix Factorization for Deep Neural Networks" and Kaldi
Audio classification implemented with PaddlePaddle, supporting EcapaTdnn, PANNS, TDNN, Res2Net, ResNetSE, and other models, along with a variety of preprocessing methods
Time delay neural network (TDNN) implemented in PyTorch
Deep Learning using Neural Network Toolbox + Finance Portfolio Selection with MorningStar
TensorFlow implementation of a TDNN (time delay neural network)
This project partially embodies the state-of-the-art practices in speaker verification technology up until 2020, while attaining the state-of-the-art performance on the VoxCeleb1 test sets.
Developed a speech recognition system using a TDNN: preprocessing audio, extracting MFCC features, and training the model. Fine-tuning with augmented data (19,000 rows) improved accuracy from 9% to 80% on the training set and 40% on the validation set. Data augmentation proved crucial for enhancing model performance and generalization. Work to increase accuracy further is ongoing.
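Several of the repositories listed above implement a TDNN layer by "unfolding" the input, that is, splicing each frame together with its neighbours at fixed context offsets and then applying one shared affine transform. The following is a minimal, framework-free sketch of that idea in plain Python. It is not taken from any of the listed projects; the names `splice_frames` and `tdnn_layer`, the toy input, and the weights are all hypothetical and chosen only for illustration. PyTorch-based implementations would typically rely on `torch.nn.functional.unfold` or `Tensor.unfold` for the splicing step.

```python
from typing import List, Sequence

Frame = List[float]


def splice_frames(frames: Sequence[Frame], context: Sequence[int]) -> List[Frame]:
    """Unfold step: for each valid time t, concatenate frames[t + c] for c in context."""
    left = max(0, -min(context))   # frames needed before t
    right = max(0, max(context))   # frames needed after t
    spliced: List[Frame] = []
    for t in range(left, len(frames) - right):
        window: Frame = []
        for c in context:
            window.extend(frames[t + c])
        spliced.append(window)
    return spliced


def tdnn_layer(
    frames: Sequence[Frame],
    context: Sequence[int],
    weights: Sequence[Sequence[float]],
    bias: Sequence[float],
) -> List[Frame]:
    """Apply the same affine transform followed by ReLU to every spliced window."""
    outputs: List[Frame] = []
    for window in splice_frames(frames, context):
        row = []
        for w_row, b in zip(weights, bias):
            activation = sum(w * x for w, x in zip(w_row, window)) + b
            row.append(max(0.0, activation))
        outputs.append(row)
    return outputs


# Toy input (assumed): 5 frames with 2 features each.
frames = [[float(t), float(10 * t)] for t in range(5)]
context = (-1, 0, 1)          # previous, current, and next frame
weights = [[1.0] * 6]         # one output unit summing all 6 spliced inputs
bias = [0.0]

out = tdnn_layer(frames, context, weights, bias)
print(out)  # [[33.0], [66.0], [99.0]]
```

With a context of `(-1, 0, 1)` the first and last frames have no complete window, so five input frames produce three output frames. A wider or sub-sampled context such as `(-2, 0)` shortens the output in the same way.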