RPT-TTS

RPT-TTS is a web platform for collecting error annotation data online for speech synthesis evaluation, based on the LMEDS platform by Tim Mahrt.

LMEDS was developed for conducting Rapid Prosody Transcription (RPT) annotations over the internet. RPT-TTS adapts LMEDS for text-to-speech (TTS) evaluation.

RPT-TTS implements the experimental procedure that was used in: Gutierrez, E., Oplustil-Gallegos, P., Lai, C. (2021) Location, Location: Enhancing the Evaluation of Text-to-Speech synthesis using the Rapid Prosody Transcription Paradigm. Proc. 11th ISCA Speech Synthesis Workshop (SSW 11), 25-30, doi: 10.21437/SSW.2021-5.

RPT-TTS builds upon the original LMEDS platform by:

Embedding multiple tasks in one page, namely the RPT and MOS tasks;
Adding support for Latin square design experiments;
Showing the MOS slider value to the user.

Contents

1 Requirements
2 Usage
3 Running create_experiment.py
4 Running bulk_post_process.py
5 Installation
6 Contact
7 Citing RPT-TTS/LMEDS

1 Requirements

Python 3.3.* or above

Bash (if on Windows use Git Bash)

2 Usage

Before using RPT-TTS, you will need:

Audio samples of synthetic speech from different systems in either mp3 or wav format;
A stimulus file. This is a txt file containing all the stimuli used for the experiment, each on its own line with no quotes.
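The stimulus file format can be checked before building an experiment. The following is a minimal sketch, not part of RPT-TTS: the function name check_stimulus_file is hypothetical, UTF-8 encoding is an assumption, and "no quotes" is interpreted here as "the line is not wrapped in a pair of quote characters".

```python
def check_stimulus_file(path):
    """Return (line_number, problem) pairs found in a stimulus file.

    Hypothetical helper, not part of RPT-TTS. Assumes UTF-8 text.
    """
    problems = []
    with open(path, encoding="utf-8") as handle:
        lines = handle.read().splitlines()
    for number, line in enumerate(lines, start=1):
        text = line.strip()
        if not text:
            # Every stimulus should sit on its own line, so blank lines are flagged.
            problems.append((number, "empty line"))
        elif len(text) >= 2 and text[0] == text[-1] and text[0] in "\"'":
            # Assumption: "no quotes" means the sentence is not wrapped in quotes.
            problems.append((number, "wrapped in quotes"))
    return problems
```

An empty list means that no problems were found under these assumptions.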

NB: The audio samples need to conform to the following naming convention:

[STIMULUS FILE NAME]_[ID]_[SYSTEM]

Stimulus file name: the stem of the stimulus file, i.e. the file name without its extension.
ID: the position of the stimulus in the stimulus file (e.g. the 15th entry has an ID of 15).
System: a shorthand for the system used to synthesise the stimulus (e.g. "tac" for Tacotron or "fast" for FastPitch).

For example, suppose your stimulus file is called libri_isolated.txt and you have three systems to evaluate: slt (Festival slt), oph (Ophelia) and tac (Tacotron). Suppose also that the first sentence in the stimulus file is: Then Anders felt brave again. The slt audio sample for this stimulus should then be named libri_isolated_1_slt, the tac sample libri_isolated_1_tac, and so on.
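The naming convention can be expressed in a few lines of Python. This is an illustrative sketch only: sample_name and missing_samples are hypothetical helpers, not part of RPT-TTS, and appending the audio extension (".mp3" or ".wav") to the sample name is an assumption about how the files are stored on disk.

```python
import os


def sample_name(stimulus_file, stimulus_id, system):
    """Build [STIMULUS FILE NAME]_[ID]_[SYSTEM], without an audio extension."""
    stem = os.path.splitext(os.path.basename(stimulus_file))[0]
    return "{}_{}_{}".format(stem, stimulus_id, system)


def missing_samples(audio_dir, stimulus_file, n_stimuli, systems, extension):
    """List expected audio file names that are absent from audio_dir.

    Assumption: each file is named <sample name>.<extension>.
    """
    missing = []
    for stimulus_id in range(1, n_stimuli + 1):  # IDs start at 1
        for system in systems:
            filename = "{}.{}".format(
                sample_name(stimulus_file, stimulus_id, system), extension
            )
            if not os.path.isfile(os.path.join(audio_dir, filename)):
                missing.append(filename)
    return missing


print(sample_name("libri_isolated.txt", 1, "slt"))  # libri_isolated_1_slt
```

Calling missing_samples on your audio folder before creating an experiment gives a quick list of samples that still need to be added or renamed.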

Once all the audio samples conform to this naming convention, an experiment can be created. If a Latin square design is not required, replace the stimulus file in rpt_tts_demo/stimuli and the audio samples in rpt_tts_demo/audio_and_video with your own. Then copy the ./tests/rpt_tts_demo directory and refer to the LMEDS manual to customise specific parts of the experiment. The lmeds_demo directory provided may also help in understanding how LMEDS works.

If a Latin square design is necessary, follow these steps:

1. Change the HOME path in setup.sh to the path of the cloned directory
2. Run source setup.sh
3. Place the audio samples in the master_audio_and_video folder
4. Place the stimulus file in the master_stimuli folder
5. Customise the consent form in english.txt. english.txt is the dictionary file and can be found in ./tests/rpt_tts_demo. For more details on the dictionary file, refer to the LMEDS manual. The relevant fields to modify are consent_title and consent_form
6. (Optional): customise pmos_question.txt in the ./tests/rpt_tts_demo folder. This is the question that will be shown to participants, e.g. How natural is the intonation of the speaker?
7. Run create_experiment.py (see Section 3)
8. Run lmeds_local_server.py to test the experiments on your local machine. Refer to the LMEDS manual for more details
9. Once all the data has been collected, run bulk_post_process.py (see Section 4)
You are now ready to analyse the processed data generated in ./lmeds/user_scripts/csvs_and_xlsx!

3 Running create_experiment.py

python3 create_experiment.py [EXPERIMENT_NAME] [EXPERIMENT_ALIAS] [STIMULUS_FILE_NAME] [AUDIO_EXTENSION] [SYSTEM_1] [SYSTEM_2] [SYSTEM_3] ...

Example run: python3 create_experiment.py rpt_tts_demo DEMO_experiment libri_isolated mp3 slt oph tac

This script builds an experiment based on the stimulus file, the audio samples provided, and the systems to evaluate. The samples and systems are arranged in a Latin square design.

The experiment alias is an alternate name for the experiment. It is used for the first line of the sequence file and as the name of various output folders (see the LMEDS manual for details). The convention is a short version of the experiment name in uppercase followed by _experiment, e.g. DEMO_experiment. The audio extension argument can be either mp3 or wav, depending on the format of the audio samples. The final arguments are the shorthand system names. These must match the shorthand system names that were used when naming the audio samples.
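To illustrate what a Latin square arrangement of stimuli and systems looks like, here is a minimal sketch using a cyclic rotation. It is not the code in create_experiment.py, whose exact ordering may differ: the function latin_square_groups is hypothetical, and having one listener group per system is an assumption.

```python
def latin_square_groups(n_stimuli, systems):
    """Assign one system to every stimulus for each listener group.

    Cyclic Latin square (an assumption, not necessarily what
    create_experiment.py does): group g hears stimulus i synthesised by
    systems[(i - 1 + g) % len(systems)].
    """
    n_systems = len(systems)
    groups = []
    for group in range(n_systems):  # assumed: one listener group per system
        assignment = []
        for stimulus_id in range(1, n_stimuli + 1):
            system = systems[(stimulus_id - 1 + group) % n_systems]
            assignment.append((stimulus_id, system))
        groups.append(assignment)
    return groups


for group in latin_square_groups(3, ["slt", "oph", "tac"]):
    print(group)
# [(1, 'slt'), (2, 'oph'), (3, 'tac')]
# [(1, 'oph'), (2, 'tac'), (3, 'slt')]
# [(1, 'tac'), (2, 'slt'), (3, 'oph')]
```

In this arrangement each listener group hears every stimulus exactly once, and across groups every stimulus is heard once with each system.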

4 Running bulk_post_process.py

python3 bulk_post_process.py [EXPERIMENT_NAME]

Example run: python3 bulk_post_process.py rpt_tts_demo

This script processes the data from all listener groups in the specified experiment and compiles the data in both csv and xlsx formats. The outputs of the script can be found in the ./lmeds/user_scripts/csvs_and_xlsx directory.
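As a starting point for analysis, the csv outputs can be read with the Python standard library. This sketch is not part of RPT-TTS: load_csv_outputs is a hypothetical helper, UTF-8 encoding is an assumption, and no column layout is assumed, so rows are returned as plain lists of strings.

```python
import csv
import glob
import os


def load_csv_outputs(output_dir):
    """Map each csv file name in output_dir to its rows (lists of strings).

    Hypothetical helper; assumes the files are UTF-8 encoded.
    """
    tables = {}
    for path in sorted(glob.glob(os.path.join(output_dir, "*.csv"))):
        with open(path, newline="", encoding="utf-8") as handle:
            tables[os.path.basename(path)] = list(csv.reader(handle))
    return tables


# Directory named in this README; the result is empty if it does not exist yet.
tables = load_csv_outputs("./lmeds/user_scripts/csvs_and_xlsx")
for name, rows in tables.items():
    print(name, len(rows), "rows")
```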

Mahrt, T. (2016) LMEDS: Language markup and experimental design software. https://github.com/timmahrt/LMEDS.
