Code used alongside an implementation of a Seq2Seq LSTM TTS frontend to process and evaluate Google Research's Wikipedia Homograph Dataset (WHD) and LibriSpeech data, with the aim of improving the frontend's homograph disambiguation.
The data was processed to add supplementary POS tags (from Festival and spaCy) as input to the model on a per-character basis, and also as targets in a multi-task learning paradigm. For this, the WHD was also cleaned so that it could be passed to Festival without out-of-dictionary words.
Data was in the form (each token's POS tag repeated once per character, with # marking word boundaries):
with added POS tags: VBD VBD VBD VBD VBD VBD # VBP VBP VBP # DT DT DT # JJ JJ JJ JJ JJ JJ # NN NN NN NN NN NN #
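For illustration, a minimal sketch of how such per-character tags can be produced with spaCy (this is not the repo's exact script; the model name "en_core_web_sm" and the helper name are assumptions):

```python
import spacy

# Hypothetical example; the repo may use a different spaCy model/pipeline.
nlp = spacy.load("en_core_web_sm")

def per_char_pos_tags(sentence: str) -> str:
    """Repeat each token's Penn Treebank tag once per character,
    appending a '#' word-boundary marker after every token."""
    doc = nlp(sentence)
    tags = []
    for token in doc:
        if token.is_space:
            continue
        tags.extend([token.tag_] * len(token.text))  # one tag per character
        tags.append("#")  # word-boundary marker, as in the format above
    return " ".join(tags)

print(per_char_pos_tags("He walked away"))
# e.g. "PRP PRP # VBD VBD VBD VBD VBD VBD # RB RB RB RB #"
```

Aligning one tag per input character keeps the POS feature sequence the same length as the character sequence fed to the Seq2Seq model, so the two can be concatenated per timestep.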