Model

Methodology

We fine-tune Whisper for audio-to-phoneme transcription, training on the TIMIT dataset.

Because the model must output phonemes rather than text, we construct a custom tokeniser; the pre-trained feature extractor is reused unchanged. We start from the pre-trained tiny.en model, freeze the encoder layers, fine-tune the decoder layers, and change the output layer dimension to match the phoneme vocabulary.
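To make the idea of a custom phoneme tokeniser concrete, here is a minimal sketch in plain Python. It is not the tokeniser shipped in this repository: the class name `PhonemeTokenizer`, the special tokens, and the short phoneme list are all assumptions chosen for illustration, and the list is only a small subset of TIMIT-style phone labels.

```python
# Hypothetical special tokens and phoneme labels, for illustration only.
# The real vocabulary would be derived from the TIMIT phone set.
SPECIAL_TOKENS = ["<pad>", "<bos>", "<eos>", "<unk>"]
PHONEMES = ["sh", "iy", "hh", "ae", "d", "s", "uw", "t"]


class PhonemeTokenizer:
    """Maps space-separated phoneme strings to integer ids and back."""

    def __init__(self, phonemes, special_tokens=SPECIAL_TOKENS):
        # Special tokens take the lowest ids, phonemes follow in order.
        self.id_to_token = list(special_tokens) + list(phonemes)
        self.token_to_id = {tok: i for i, tok in enumerate(self.id_to_token)}
        self.bos_id = self.token_to_id["<bos>"]
        self.eos_id = self.token_to_id["<eos>"]
        self.unk_id = self.token_to_id["<unk>"]
        self.special_ids = {self.token_to_id[t] for t in special_tokens}

    @property
    def vocab_size(self):
        return len(self.id_to_token)

    def encode(self, transcription):
        # Unknown labels fall back to <unk>; the sequence is wrapped in
        # <bos> ... <eos> so a decoder knows where it starts and ends.
        ids = [self.token_to_id.get(p, self.unk_id) for p in transcription.split()]
        return [self.bos_id] + ids + [self.eos_id]

    def decode(self, ids, skip_special_tokens=True):
        tokens = [
            self.id_to_token[i]
            for i in ids
            if not (skip_special_tokens and i in self.special_ids)
        ]
        return " ".join(tokens)


tokenizer = PhonemeTokenizer(PHONEMES)
ids = tokenizer.encode("sh iy hh ae d")
print(ids)                    # [1, 4, 5, 6, 7, 8, 2]
print(tokenizer.decode(ids))  # sh iy hh ae d
print(tokenizer.vocab_size)   # 12
```

In a setup like the one described above, `vocab_size` is the number the decoder's output layer would be resized to, since each output logit corresponds to one phoneme or special token rather than one text token.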

