This work builds on earlier work by Zhang et al. (2019).

Generating Input Cochleagrams

Download the TIMIT corpus.
To generate cochleagram feature vectors:
1. To generate features for all files at once:
  1. util.generate_mat_for_all_data_in_dir(data_path, output_path)
    1. data_path is the path to the data directory in the TIMIT corpus.
    2. output_path is the path where the cochleagram feature vectors will be saved.
    3. file_type can be set to "npy" if the Python SHMAX implementation is used. The default is "mat".
2. To generate features for a single file:
  1. util.save_features_as_mat(wav_path)
    1. wav_path is the path to the wav file.
    2. file_type can be set to "npy" if the Python SHMAX implementation is used. The default is "mat".
    3. A second argument can be supplied to specify the output path.
    4. The feature vector is returned as a NumPy array.
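
As a rough illustration of the single-file route, the sketch below walks a corpus directory and lists every wav file, so that each path could then be passed to util.save_features_as_mat one at a time. The helper name find_wav_files is hypothetical and not part of this repository, and the case-insensitive extension match is an assumption, since file-name casing can differ between copies of the corpus.

```python
import os


def find_wav_files(data_path):
    """Return a sorted list of paths to all wav files under data_path.

    Hypothetical helper, not part of this repository.
    """
    wav_paths = []
    for root, _dirs, files in os.walk(data_path):
        for name in files:
            # Assumption: both ".wav" and ".WAV" may occur, so compare
            # the extension case-insensitively.
            if name.lower().endswith(".wav"):
                wav_paths.append(os.path.join(root, name))
    return sorted(wav_paths)


# Intended use with the repository function documented above:
# for wav_path in find_wav_files(data_path):
#     util.save_features_as_mat(wav_path)
```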

Note: Generating cochleagram feature vectors with these functions may result in a poorly trained model. You may want to use the MATLAB Auditory Toolbox instead.

Training the SHMAX Model

SHMAX.train_SHMAX(train_path, output_path)
1. train_path is the path to the directory containing the cochleagram feature vectors.
2. output_path is the path where the model output will be saved.

ABX Testing on Input Cochleagrams

cochABX.generate_phoneme_matrices(corpus_data_path, phoneme_feature_path, result_path)
1. corpus_data_path is the path to the data directory in the TIMIT corpus.
2. phoneme_feature_path is the path to the directory where the phoneme feature vectors are saved.
3. result_path is the path where the resulting phoneme matrices will be saved.
4. Note: Files will be of type .mat
To create the categories for ABX testing:
1. Create a 2D list in which each sublist contains the phoneme names of one category, e.g. [['aa', 'ae', 'ah'], ['ao', 'aw', 'ax'], ...]
2. Pass this list to cochABX.convert_category_list_to_dict(cat_list) and store the resulting dictionary.
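
The README does not show what the resulting dictionary looks like. A minimal sketch, assuming it maps each phoneme name to the index of its category, could look like the following. The function category_list_to_dict is an illustrative stand-in, and the layout actually returned by cochABX.convert_category_list_to_dict may differ.

```python
def category_list_to_dict(cat_list):
    """Map each phoneme name to the index of the sublist it appears in.

    Illustrative stand-in only; the layout returned by
    cochABX.convert_category_list_to_dict is an assumption here.
    """
    cat_dict = {}
    for cat_index, phonemes in enumerate(cat_list):
        for phoneme in phonemes:
            if phoneme in cat_dict:
                raise ValueError(f"{phoneme!r} appears in more than one category")
            cat_dict[phoneme] = cat_index
    return cat_dict


cat_list = [["aa", "ae", "ah"], ["ao", "aw", "ax"]]
categories = category_list_to_dict(cat_list)
num_categories = len(cat_list)
```

Keeping num_categories next to the dictionary is convenient because the ABX functions take both as arguments.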

Perform Pair-Wise Machine ABX Testing on Input Cochleagrams

cochABX.abx_testing(phoneme_mat_path, categories, num_categories)
1. phoneme_mat_path is the path to the directory containing the phoneme matrices.
2. categories is the dictionary containing the phoneme names of the categories.
3. num_categories is the number of categories.
4. Returns a confusion matrix in which the rows represent the true category and the columns represent the predicted category.
5. If result_path is supplied, the confusion matrix will be saved to the specified path.
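
In an ABX trial, a probe X drawn from the same category as A is compared against A and against a sample B from another category; the trial counts as correct when X is closer to A. The sketch below shows that decision rule and how the outcomes fill a confusion matrix with true categories as rows and predicted categories as columns. It is an illustration under stated assumptions, not the code in cochABX: the Euclidean distance, the function names, and the toy data are all hypothetical.

```python
import math


def abx_predict(a, b, x, cat_a, cat_b):
    """Return cat_a if x is at least as close to a as to b, else cat_b."""
    # Euclidean distance is an assumption; the repository may use another metric.
    return cat_a if math.dist(a, x) <= math.dist(b, x) else cat_b


def abx_confusion(samples, num_categories):
    """Run every valid (A, B, X) trial and count true-vs-predicted categories.

    samples maps a category index to a list of feature vectors.
    """
    confusion = [[0] * num_categories for _ in range(num_categories)]
    for cat_a, vecs_a in samples.items():
        for cat_b, vecs_b in samples.items():
            if cat_a == cat_b:
                continue
            for i, a in enumerate(vecs_a):
                for j, x in enumerate(vecs_a):
                    if i == j:
                        continue  # X must be a different token than A
                    for b in vecs_b:
                        predicted = abx_predict(a, b, x, cat_a, cat_b)
                        # Row = true category of X, column = predicted category.
                        confusion[cat_a][predicted] += 1
    return confusion


# Toy data: two well-separated categories, two tokens each.
samples = {0: [[0.0, 0.0], [0.1, 0.0]], 1: [[5.0, 5.0], [5.1, 5.0]]}
confusion = abx_confusion(samples, num_categories=2)
```

With well-separated toy categories like these, all counts land on the diagonal; overlapping categories would push counts into the off-diagonal cells.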

Perform General Machine ABX Classification on Input Cochleagrams

cochABX.general_classification_abx_testing(phoneme_mat_path, categories, num_categories)
1. phoneme_mat_path is the path to the directory containing the phoneme matrices.
2. categories is the dictionary containing the phoneme names of the categories.
3. num_categories is the number of categories.
4. Returns a confusion matrix in which the rows represent the true category and the columns represent the predicted category.
5. If result_path is supplied, the confusion matrix will be saved to the specified path.
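
Because rows hold the true category and columns the predicted category, dividing each row by its sum gives, for each true category, the fraction of trials assigned to every predicted category; the diagonal is then the per-category accuracy. A small sketch with made-up counts follows. The helper name and the numbers are illustrative only and do not come from this repository.

```python
def row_normalize(confusion):
    """Convert counts to per-row proportions; all-zero rows stay zero."""
    normalized = []
    for row in confusion:
        total = sum(row)
        normalized.append([count / total if total else 0.0 for count in row])
    return normalized


# Made-up counts for three categories (rows: true, columns: predicted).
confusion = [[8, 1, 1], [2, 6, 2], [0, 0, 0]]
rates = row_normalize(confusion)
per_category_accuracy = [rates[i][i] for i in range(len(rates))]
```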

Get SHMAX Phoneme/Category Response Matrix

getPhonemeResponse.get_phoneme_response(corpus_data_path, SHMAX_data_path)
1. corpus_data_path is the path to the data directory in the TIMIT corpus.
2. SHMAX_data_path is the path to the directory containing the SHMAX model output.
3. Returns an m x n matrix where m is the number of phonemes and n is the number of computational units. At index (i, j) a list of the responses of the jth computational unit to the ith phoneme is stored.
4. If result_path is supplied, the phoneme response matrix will be saved to the specified path.
5. The categories argument can be used to calculate responses for categories rather than individual phonemes. The correct form for this argument can be generated by following step 2 of the category-creation steps in ABX Testing on Input Cochleagrams.
  1. If this argument is supplied, num_categories must also be supplied.
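
The layout described in item 3 can be pictured as a nested list in which cell (i, j) accumulates every response of unit j to phoneme i. The sketch below builds such a structure from toy observations. The function name, the input format, and the numbers are hypothetical and do not reflect how getPhonemeResponse reads the model output.

```python
def build_response_matrix(observations, num_phonemes, num_units):
    """Collect responses into an m x n grid of lists.

    observations is an iterable of (phoneme_index, unit_responses) pairs,
    where unit_responses holds one value per computational unit.
    """
    matrix = [[[] for _ in range(num_units)] for _ in range(num_phonemes)]
    for phoneme_index, unit_responses in observations:
        for unit_index, response in enumerate(unit_responses):
            matrix[phoneme_index][unit_index].append(response)
    return matrix


# Toy data: 2 phonemes, 3 units, three phoneme occurrences in total.
observations = [(0, [0.1, 0.5, 0.0]), (1, [0.9, 0.2, 0.3]), (0, [0.2, 0.4, 0.1])]
responses = build_response_matrix(observations, num_phonemes=2, num_units=3)
```

Here responses[0][1] holds the two responses of unit 1 to phoneme 0, one per occurrence of that phoneme.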

Get PSI/CSI Matrix

Note: CSI refers to the specificity towards some arbitrary category assignment rather than individual phonemes. If a CSI is desired, simply calculate the response matrix for the desired category assignment and pass it in place of the phoneme response matrix.

getPSI.get_psi(responses)
1. responses is the phoneme response matrix.
2. Returns an m x n matrix where m is the number of phonemes and n is the number of computational units. At index (i, j) the PSI/CSI of the jth computational unit for the ith phoneme is stored.
3. If result_path is supplied, the PSI/CSI matrix will be saved to the specified path.


AndoniSanguesa/SHMAX
