- Place your own `phone_ctm.txt` file in the project root directory, or use the default one generated from https://github.com/tjysdsg/aidatatang_force_align on AISHELL-3 data
- Run `python feature_extration.py` to collect the required statistics (phone start time, duration, tones, etc.). Results are saved to `utt2tones.json`
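The exact schema of `utt2tones.json` is not documented here; as a rough illustration of consuming it, the sketch below assumes it maps each utterance ID to a list of `[phone, tone, start, duration]` records (a hypothetical structure, not confirmed by the source):

```python
import json
from collections import Counter

# Hypothetical utt2tones.json content: utterance ID -> list of
# [phone, tone, start_time, duration] records (assumed schema).
sample = {
    "utt001": [["sh", 4, 0.00, 0.12], ["i", 4, 0.12, 0.20]],
    "utt002": [["n", 3, 0.00, 0.08], ["i", 3, 0.08, 0.25]],
}
with open("utt2tones.json", "w") as f:
    json.dump(sample, f)

# Load the statistics back and count tone occurrences across all utterances
with open("utt2tones.json") as f:
    utt2tones = json.load(f)

tone_counts = Counter(tone for phones in utt2tones.values()
                      for _, tone, _, _ in phones)
print(tone_counts)  # e.g. Counter({4: 2, 3: 2})
```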
- Run `python train/embedding/split_wavs.py` to split the train, test, and validation datasets for embedding model training. The test utterances used in the paper are listed in `test_utts.json`
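The actual splitting logic lives in `split_wavs.py`; purely as an illustration, a random utterance-level split might look like the sketch below (the 80/10/10 ratio and the fixed seed are assumptions, not the script's real settings):

```python
import random

def split_utts(utt_ids, train_frac=0.8, test_frac=0.1, seed=0):
    """Randomly partition utterance IDs into train/test/validation sets."""
    utts = sorted(utt_ids)            # sort first so the split is reproducible
    random.Random(seed).shuffle(utts)
    n_train = int(len(utts) * train_frac)
    n_test = int(len(utts) * test_frac)
    train = utts[:n_train]
    test = utts[n_train:n_train + n_test]
    val = utts[n_train + n_test:]
    return train, test, val

train, test, val = split_utts([f"utt{i:03d}" for i in range(100)])
print(len(train), len(test), len(val))  # 80 10 10
```

Splitting at the utterance level (rather than per-frame) keeps all frames of a recording in the same subset, which avoids leakage between train and test.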
- Run `python train/train_embedding.py` to train the embedding model; the results are saved in `exp/`. A mel-spectrogram cache is generated at `exp/cache/spectro/wav.scp` and `exp/cache/spectro/*.npy`
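The cache layout suggests one `.npy` spectrogram per utterance plus a `wav.scp`-style index mapping utterance IDs to files. A minimal sketch of that caching pattern, assuming this layout (the array shape and the `compute_mel` stand-in are illustrative, not the project's actual code):

```python
import numpy as np
from pathlib import Path

cache_dir = Path("exp/cache/spectro")
cache_dir.mkdir(parents=True, exist_ok=True)

def compute_mel(utt_id):
    # Stand-in for a real mel-spectrogram computation (e.g. via librosa);
    # returns a dummy (n_mels, n_frames) array here.
    rng = np.random.default_rng(abs(hash(utt_id)) % (2**32))
    return rng.standard_normal((80, 120)).astype(np.float32)

# Write one .npy file per utterance and index them all in wav.scp
with open(cache_dir / "wav.scp", "w") as scp:
    for utt_id in ["utt001", "utt002"]:
        npy_path = cache_dir / f"{utt_id}.npy"
        np.save(npy_path, compute_mel(utt_id))
        scp.write(f"{utt_id} {npy_path}\n")

# Reading the cache back: parse wav.scp, then load arrays on demand
index = dict(line.split() for line in open(cache_dir / "wav.scp"))
mel = np.load(index["utt001"])
print(mel.shape)  # (80, 120)
```

Caching spectrograms this way means later training runs can skip the feature extraction step entirely and memory-map or load arrays per utterance.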
- After step 3, run `./run.sh` in `e2e_tone_recog/`