AutoPreaspiration

A software package for automatic extraction of pre-aspiration from speech segments in audio files, using a trainable algorithm.

The package is based on the AutoVot project - automatic extraction of voice onset time (VOT) from audio files (2014, Joseph Keshet, Morgan Sonderegger, Thea Knowles). The core implementation of AutoVot was used to train a new model for the task of pre-aspiration detection. Some of the C++ files were changed to fit the new task; in addition, some of the wrapper Python files were changed and many new scripts and tests were added (mainly Python and bash scripts).

It works as follows:

  • The user provides wav files containing one (or more) obstruents, and corresponding Praat TextGrids with some information about roughly where the pre-aspiration should be (e.g. a time index within the preceding phoneme and a time index within the obstruent).
  • A classifier finds the pre-aspiration for each coded obstruent and adds a new tier to each TextGrid containing these measurements.
  • The user can either use a pre-existing classifier, or (recommended) train a new one using manually labeled pre-aspirations from their own data.

Note: A newer project for extracting pre-aspiration using a Recurrent Neural Network based method can be found here: BiRNN_AutoPreaspiration

Dependencies

In order to use AutoPreaspiration you'll need the following installed in addition to the source code provided here:

  • GCC, the GNU Compiler Collection

  • Python (Version 2.7 or earlier)

  • If you're using Python version 2.6 or earlier, you will need to install the argparse module (which is installed by default in Python 2.7), e.g. by running easy_install argparse on the command line.

  • If you're using Mac OS X you'll need to download GCC, as it isn't installed by default. You can either:

    • Install Xcode, then install Command Line Tools using the Components tab of the Downloads preferences panel.
    • Download the Command Line Tools for Xcode as a stand-alone package.

    You will need a registered Apple ID to download either package.

Installation

The command to clone AutoPreaspiration is:

$ git clone https://github.com/Yanivs24/AutoPreaspiration.git

Alternatively, you can download the current version of AutoPreaspiration as a zip file by clicking "Clone or download" -> "Download ZIP".

Compiling

Clean and compile from the code directory:

$ cd AutoPA/AutoPA/code
$ make clean

Then, run:

$ make

The final line of the output should be:

[make] Compiling completed

Finally, from the experiments directory, add the path to the compiled binaries to your PATH:

$ cd ../../experiments
$ export PATH=$PATH:/[YOUR PATH HERE]/AutoPA/AutoPA/bin

For example:
$ export PATH=$PATH:/home/yaniv/projects/AutoPA/AutoPA/bin

If you are not working out of the provided experiments directory, add the path from your intended working directory instead. IMPORTANT: you must add the path every time you open a new terminal window.
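
To avoid repeating this step in every new terminal, you can append the export line to your shell startup file instead (a sketch assuming a bash shell; replace the placeholder with your own path, and on macOS the file may be ~/.bash_profile):

$ echo 'export PATH=$PATH:/[YOUR PATH HERE]/AutoPA/AutoPA/bin' >> ~/.bashrc
$ source ~/.bashrc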

Setup

First, make sure your working directory is the experiments directory.

To work with new data, the wav files should be converted to 16 kHz mono (if this is already the case, skip this step). This can be done by typing:

$ python python_scripts/format_wav_files.py DIRECTORY_PATH

A prerequisite for this script is the SoX utility.

Here DIRECTORY_PATH is the path of the directory containing the data (wav files and TextGrids). The script places all the formatted wav files and the corresponding TextGrids in: DIRECTORY_PATH/formated
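
If you prefer to convert a single file by hand, a roughly equivalent SoX command looks like this (a sketch; the file names are placeholders and the script itself may differ in details):

$ sox input.wav output_16k_mono.wav rate 16000 channels 1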

Usage

You can skip this section and go to Simple Usage Mode below for a simpler interface (but with fewer options).

AutoPreaspiration allows for two modes of feature extraction:

  • Mode 1 - Covert feature extraction: The handling of feature extraction is hidden from the user. When training a classifier with these features, a cross-validation set can be specified; otherwise a random 20% of the training data is used. The output consists of modified TextGrids with a tier containing the pre-aspiration prediction intervals.
  • Mode 2 - Features extracted to a known directory: Training and decoding are done after feature extraction. Features are extracted once to a known directory, after which training and decoding can be run as needed. The output consists of a summary of the prediction performance. Mode 2 is recommended if you have a large quantity of data.

Note: All the options of the following Python scripts can be viewed from the command line using the -h flag

For example:
auto_pa_train.py -h

Some important optional arguments used by most of the scripts:

--pa_tier PA_TIER     Name of the tier to extract pre-aspiration from
                      (default: pa)
--pa_mark PA_MARK     Only intervals on the pa_tier with this mark value
                      (e.g. "pre") are used for training, or "*" for any
                      string (this is the default)
--window_min WINDOW_MIN
                      Left boundary of the window (in msec) relative to the
                      pre-aspiration interval's left boundary. Usually should
                      be negative, that is, before the pre-aspiration
                      interval's left boundary. (default: -50)
--window_max WINDOW_MAX
                      Right boundary of the window (in msec) relative to the
                      pre-aspiration interval's right boundary. Usually should
                      be positive, that is, after the pre-aspiration
                      interval's right boundary. (default: 60)

Feature extraction and training

Mode 1:

Train a classifier to automatically measure pre-aspiration, using manually annotated pre-aspirations in a set of TextGrids and corresponding wav files.
Usage: auto_pa_train.py [OPTIONS] wav_list textgrid_list model_filename
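
For example, a Mode 1 training call combining the optional flags described above might look like this (a sketch: the list files, assumed here - as in AutoVot - to be plain-text lists of wav/TextGrid paths, and the model file name are hypothetical placeholders):

$ auto_pa_train.py --pa_tier pa --pa_mark pre --window_min -50 --window_max 60 \
      my_wav_list.txt my_textgrid_list.txt my_model.classifier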

Mode 2:

Extract acoustic features for AutoPreaspiration. This should be run before auto_pa_train_after_fe.py or auto_pa_decode_after_fe.py.
Usage: auto_pa_extract_features.py [OPTIONS] textgrid_list wav_list input_filename features_filename labels_filename features_dir
Then, a classifier can be trained to automatically measure pre-aspiration, using manually annotated pre-aspirations whose features have already been extracted with auto_pa_extract_features.py, i.e. a set of feature files and labels.
Usage: auto_pa_train_after_fe.py [OPTIONS] features_filename labels_filename model_filename
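
For example, a Mode 2 run might first extract the features and then train on them (a sketch; all file and directory names below are hypothetical placeholders):

$ auto_pa_extract_features.py my_textgrid_list.txt my_wav_list.txt my_input.txt \
      my_features.txt my_labels.txt my_features_dir/
$ auto_pa_train_after_fe.py my_features.txt my_labels.txt my_model.classifier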

Decoding

Mode 1

Use an existing classifier to measure pre-aspiration for stops in a set of TextGrids and corresponding wav files.
Usage: auto_pa_decode.py [OPTIONS] wav_filename textgrid_filename model_filename

Mode 2

Decoding when features have already been extracted with auto_pa_extract_features.py.
Usage: auto_pa_decode_after_fe.py [OPTIONS] features_filename labels_filename model_filename
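
For example (a sketch; the file names are hypothetical placeholders, and my_model.classifier stands for a model produced by one of the training scripts above):

$ auto_pa_decode.py my_file.wav my_file.TextGrid my_model.classifier             # Mode 1
$ auto_pa_decode_after_fe.py my_features.txt my_labels.txt my_model.classifier   # Mode 2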

Check Performance

Compute various measures of performance given a set of labeled pre-aspirations and predicted pre-aspirations for the same stops, optionally writing per-stop information to a CSV file.
Usage: auto_pa_performance.py [OPTIONS] labeled_textgrid_list predicted_textgrid_list labeled_pa_tier predicted_pa_tier
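
For example (a sketch; the two list files are hypothetical placeholders, 'pa' is the default name of the manually labeled tier, and PREDICTED_TIER stands for whatever name the decoder gave the prediction tier):

$ auto_pa_performance.py labeled_textgrids.txt predicted_textgrids.txt pa PREDICTED_TIER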

Simple Usage Mode

The bash scripts in the 'experiments/' directory wrap the logic of the project in a very simple user interface.

Mini Setup

For this mode, the examples should be filtered using:

$ python python_scripts/filter_examples.py DIRECTORY_PATH/formated/

This script is responsible for two things:

  1. It deletes all the examples that lack a 'pre' mark in the last tier of their TextGrid, or that have a 'pre' mark but windows too small for the algorithm.
  2. If the last tier contains 'pre', it renames that tier to 'bell' for uniformity.

Training

To train a model using all the files (wav and corresponding TextGrid) in a directory, type:

$ ./train_model.sh DIRECTORY_PATH

For example:

$ ./train_model.sh data/abe24_abe18_plosives/formated/

Decoding

To decode all the examples in a directory using a pre-trained model, type:

$ ./decode_dir.sh DIRECTORY_PATH

Note: This can be used only after training a model
