NI-MVI Semester project 2022

Goal: Compare the quality of audio recordings of Spanish speakers enhanced by the CMGAN model. I will compare the outputs inferred by:

Model pretrained on English speakers, as provided by the authors of CMGAN.
Pretrained model fine-tuned on Spanish speakers.
Model trained from scratch on custom data.
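
The difference between the fine-tuned and the from-scratch variants comes down to how the weights are initialized before training. The following is a minimal PyTorch sketch of that idea, not the project's actual training script: `build_model` is a hypothetical stand-in for the CMGAN generator, and the base learning rate and the tenfold reduction for fine-tuning are assumptions chosen only for illustration.

```python
from typing import Optional, Tuple

import torch
from torch import nn


def build_model() -> nn.Module:
    # Hypothetical stand-in; the real project would construct the CMGAN generator here.
    return nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 8))


def prepare_training(checkpoint=None, base_lr: float = 5e-4) -> Tuple[nn.Module, torch.optim.Optimizer]:
    """Return a model and optimizer for training from scratch or fine-tuning.

    checkpoint: path or file-like object holding a saved state_dict, or None.
    """
    model = build_model()  # randomly initialized weights
    lr = base_lr
    if checkpoint is not None:
        # Fine-tuning: start from the pretrained weights instead of random ones.
        state = torch.load(checkpoint, map_location="cpu")
        model.load_state_dict(state)
        # Assumption: use a smaller learning rate so the pretrained weights
        # are adjusted gently rather than overwritten.
        lr = base_lr * 0.1
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    return model, optimizer
```

Calling `prepare_training()` with no argument gives the from-scratch setup; passing a checkpoint gives the fine-tuning setup. The pretrained model used as-is needs neither, since it is only run for inference.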

Model: CMGAN: Conformer-Based Metric-GAN for Monaural Speech Enhancement, Sherif Abdulatif, Ruizhe Cao, Bin Yang

Paper: ArXiv
Implementation: GitHub
My training and testing environment: Google Colab
Custom training script: GitLab
Framework: PyTorch

Data: High-quality (podcast-quality) audio recordings of Spanish speech, downloaded from YouTube, with both male and female speakers. After preprocessing, the dataset contains 50 minutes of audio, with a train:evaluation ratio of 40:10. Preprocessing consists of:

Tokenization: split the data into short audio files, 3-8 seconds long.
Downsampling: convert to 16 kHz and 16 bits per sample, as recommended in the paper.
Adding noise: mix in noise from the DEMAND dataset, as recommended in the paper.

The list of data sources is in the sources.txt file. The preprocessed data are available on my university Google Drive; the link is in sources.txt.
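
The preprocessing steps could be sketched roughly as follows. This is an illustration under stated assumptions, not the code from preprocessing.ipynb: it assumes the audio is already loaded as a mono NumPy float array resampled to 16 kHz (resampling itself is left to an audio library), it cuts at fixed 8-second boundaries rather than at pauses, and the SNR value passed to `mix_at_snr` is a free parameter because the README does not state which levels were used. All function names are hypothetical.

```python
import numpy as np

SAMPLE_RATE = 16_000  # Hz, the rate named in the preprocessing steps


def split_into_segments(audio, min_s=3.0, max_s=8.0, sample_rate=SAMPLE_RATE):
    """Cut a 1-D signal into consecutive chunks of at most max_s seconds.

    A trailing chunk shorter than min_s seconds is dropped.
    """
    max_len = int(max_s * sample_rate)
    min_len = int(min_s * sample_rate)
    chunks = [audio[i:i + max_len] for i in range(0, len(audio), max_len)]
    return [c for c in chunks if len(c) >= min_len]


def mix_at_snr(clean, noise, snr_db):
    """Add noise to clean speech so the mixture has the requested SNR in dB."""
    # Loop the noise if it is shorter than the speech, then trim to length.
    reps = int(np.ceil(len(clean) / len(noise)))
    noise = np.tile(noise, reps)[:len(clean)]
    clean_power = np.mean(clean ** 2)
    noise_power = np.mean(noise ** 2)
    # Choose scale so that clean_power / (scale**2 * noise_power) == 10**(snr_db / 10).
    scale = np.sqrt(clean_power / (noise_power * 10 ** (snr_db / 10)))
    return clean + scale * noise


def to_int16(audio):
    """Quantize a float signal in [-1, 1] to 16-bit PCM samples."""
    return (np.clip(audio, -1.0, 1.0) * 32767).astype(np.int16)
```

A 20-second recording would yield three segments here (8 s, 8 s, and 4 s), while a 17-second one would yield two, because the 1-second remainder falls below the 3-second minimum. `mix_at_snr` assumes the noise clip is not silent; a zero-power clip would cause a division by zero.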

Research: CMGAN is close to the state of the art. Its successor, SCP-CMGAN, uses a different metrics system that I did not understand, so I chose the closest solution with an available pretrained model and a public dataset (see paperswithcode.com).

GAN training

Approach:

Preprocess custom data.
Train model from scratch using custom data.
Fine-tune existing pretrained model using custom data.
Evaluate models with PESQ and STOI metrics:
- Scratch.
- Fine-tuned.
- Pre-trained.
Compare results.
Prepare samples:
- Clean audio.
- Noisy audio.
- Enhanced with different models.
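
The evaluation and comparison steps above amount to averaging each metric over pairs of clean and enhanced clips, once per model. Below is a small sketch of that bookkeeping, assuming the third-party `pesq` and `pystoi` packages for the metrics themselves; the function names are hypothetical, and wideband PESQ at 16 kHz is an assumption that matches the sample rate used in preprocessing. The metric packages are imported lazily, so the aggregation also works with any other scoring function.

```python
from statistics import mean
from typing import Callable, Dict, Iterable, Sequence, Tuple

Signal = Sequence[float]
Metric = Callable[[Signal, Signal], float]  # (clean, enhanced) -> score


def library_metrics(sample_rate: int = 16_000) -> Dict[str, Metric]:
    """PESQ and STOI from the `pesq` and `pystoi` packages.

    Both expect equal-length NumPy arrays for the clean and enhanced signals.
    """
    from pesq import pesq
    from pystoi import stoi
    return {
        "PESQ": lambda clean, enhanced: pesq(sample_rate, clean, enhanced, "wb"),
        "STOI": lambda clean, enhanced: stoi(clean, enhanced, sample_rate, extended=False),
    }


def average_scores(pairs: Iterable[Tuple[Signal, Signal]],
                   metrics: Dict[str, Metric]) -> Dict[str, float]:
    """Average every metric over all (clean, enhanced) pairs of one model."""
    pairs = list(pairs)
    return {name: mean(fn(c, e) for c, e in pairs) for name, fn in metrics.items()}


def compare_models(outputs: Dict[str, Iterable[Tuple[Signal, Signal]]],
                   metrics: Dict[str, Metric]) -> Dict[str, Dict[str, float]]:
    """Map each model name (e.g. scratch, fine-tuned, pre-trained) to its averaged scores."""
    return {model: average_scores(pairs, metrics) for model, pairs in outputs.items()}
```

With the real metrics, a call would look like `compare_models(outputs, library_metrics())`, where `outputs` maps each model name to its list of clean and enhanced clip pairs. For both PESQ and STOI, higher scores indicate better quality.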

Final Report
