A PyTorch implementation of QuartzNet, an end-to-end ASR model, trained on the LJSpeech dataset.
Set your preferred configuration in config.py
and run ./run_docker.sh
(don't forget to set the correct volume option).
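As a rough sketch, config.py could expose fields like the ones below; the field names here are illustrative assumptions (only path_to_file and from_pretrained are mentioned in this README), not the project's actual settings.

```python
# Hypothetical sketch of config.py; names other than path_to_file and
# from_pretrained are assumptions for illustration only.
from dataclasses import dataclass

@dataclass
class Config:
    batch_size: int = 32            # training batch size (assumed)
    num_epochs: int = 100           # total training epochs (assumed)
    learning_rate: float = 3e-4     # optimizer step size (assumed)
    from_pretrained: bool = False   # load a pretrained checkpoint for inference
    path_to_file: str = ""          # .wav file to transcribe during inference

config = Config()
```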
wget https://data.keithito.com/data/speech/LJSpeech-1.1.tar.bz2 # download data
tar xjf LJSpeech-1.1.tar.bz2 # extract data
python train.py
You will need to log in to your wandb.ai account
to monitor training logs.
Every 10th checkpoint after the 40th epoch is saved as model{epoch}.pth.
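The saving schedule above can be sketched as follows; this assumes "every 10th checkpoint after the 40th epoch" means epochs 40, 50, 60, ..., which is one plausible reading, not a statement of the actual training loop.

```python
# Sketch of the assumed checkpoint schedule: save at epochs 40, 50, 60, ...
def should_save_checkpoint(epoch: int) -> bool:
    return epoch >= 40 and epoch % 10 == 0

# Epochs that would produce a model{epoch}.pth file over a 100-epoch run
saved_epochs = [e for e in range(1, 101) if should_save_checkpoint(e)]
```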
To run inference, set path_to_file in config to a .wav file,
set from_pretrained=True,
then run
python inference.py
The result will be saved to path_to_file.txt.
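One plausible reading of the output path, sketched below, is that the transcription is written next to the input audio with the .wav suffix replaced by .txt; this is an assumption about how path_to_file.txt is derived, not confirmed by the repo.

```python
# Assumed mapping from the input .wav path to the transcription .txt path.
from pathlib import Path

def output_path(path_to_file: str) -> Path:
    # Replace the audio suffix with .txt (assumption for illustration)
    return Path(path_to_file).with_suffix(".txt")
```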