This repository contains the code to run the experiments and evaluations described in the thesis Investigating Effects of Data Quantity and Augmentation Methods for Speech-To-Text.
The experiments depend on ESPnet. To run them, first follow the ESPnet installation instructions. I recommend using the Docker container provided by ESPnet.
After installing ESPnet, copy the files from the asr1 folder to the commonvoice ASR recipe folder.
cp -r asr1/. <espnet-dir>/egs/commonvoice/asr1/
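If the ESPnet path is mistyped, cp -r will silently create a new folder instead of copying into the existing recipe. A minimal sketch of a guarded copy follows; copy_recipe is a hypothetical helper name, not part of this repository or ESPnet, and the paths in the usage comment are the same placeholders as above.

```shell
# Hypothetical helper (not part of this repository): copy the recipe files
# only if the target recipe folder already exists.
copy_recipe() {
    src="$1"   # source folder, e.g. asr1
    dst="$2"   # target folder, e.g. <espnet-dir>/egs/commonvoice/asr1
    if [ ! -d "$dst" ]; then
        echo "recipe folder not found: $dst" >&2
        return 1
    fi
    # "src/." copies the folder contents, including hidden files.
    cp -r "$src"/. "$dst"/
}

# Usage (replace the placeholder with your ESPnet checkout):
#   copy_recipe asr1 "<espnet-dir>/egs/commonvoice/asr1"
```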
To run the experiments, execute run.sh.
cd <espnet-dir>/egs/commonvoice/asr1/
./run.sh > log 2> err
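Because both output streams are redirected to files, a failed run produces no message on the terminal. The sketch below wraps the same redirection and prints the tail of err when the command exits with a non-zero status. run_logged is a hypothetical helper name, and the 20-line tail is an arbitrary choice.

```shell
# Hypothetical helper: run a command with stdout in ./log and stderr in ./err,
# and show the end of ./err if the command fails.
run_logged() {
    status=0
    "$@" > log 2> err || status=$?
    if [ "$status" -ne 0 ]; then
        echo "command failed with exit status $status; last lines of err:" >&2
        tail -n 20 err >&2
    fi
    return "$status"
}

# Usage, from inside the recipe folder:
#   run_logged ./run.sh
```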
Note: You may need to activate the conda environment provided with the Docker container before starting the experiments.
source <espnet-dir>/tools/venv/bin/activate
Results will be collected in <asr1>/tensorboard and <asr1>/results.
Results of the experiments are also included in this repository.
If you modify the experiments and want to generate the same outputs, place your results in the repository directory under results/raw/ and run run.py.
Generating the outputs requires the Seaborn and Librosa packages.
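Before running run.py, it can help to confirm that both packages are importable. The sketch below assumes that python3 is the interpreter used for run.py, which this README does not state. check_modules is a hypothetical helper name; seaborn and librosa are the import names of the two packages.

```shell
# Hypothetical helper: report which Python modules cannot be imported.
# Assumes python3 is the interpreter that will run run.py.
check_modules() {
    missing=0
    for module in "$@"; do
        if ! python3 -c "import $module" 2>/dev/null; then
            echo "missing: $module"
            missing=1
        fi
    done
    return "$missing"
}

# Usage: install the packages only if one of them is missing.
#   check_modules seaborn librosa || python3 -m pip install seaborn librosa
```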