tjysdsg/tone_classifier

  1. Place your own phone_ctm.txt file in the project root directory, or use the default one generated by https://github.com/tjysdsg/aidatatang_force_align on AISHELL-3 data.
  2. Run
  python feature_extration.py

to collect the required statistics (phone start time, duration, tones, etc.). The results are saved to utt2tones.json.
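To illustrate the kind of input step 2 works from, here is a minimal sketch that groups phone alignments by utterance. It assumes phone_ctm.txt follows the common Kaldi CTM layout (`utterance channel start duration label`), which this README does not confirm; the names `PhoneSegment` and `parse_ctm_lines` and the example phone labels are hypothetical and not part of this repository.

```python
from collections import defaultdict
from typing import NamedTuple


class PhoneSegment(NamedTuple):
    start: float     # start time in seconds
    duration: float  # duration in seconds
    phone: str       # phone label (may carry a tone digit, depending on the lexicon)


def parse_ctm_lines(lines):
    """Group CTM entries by utterance ID.

    Assumes Kaldi-style CTM lines: <utt> <channel> <start> <duration> <label>.
    """
    utts = defaultdict(list)
    for line in lines:
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        utt, _channel, start, duration, phone = fields[:5]
        utts[utt].append(PhoneSegment(float(start), float(duration), phone))
    return dict(utts)


# Hypothetical example lines, not taken from the real phone_ctm.txt
example = [
    "utt1 1 0.00 0.12 sil",
    "utt1 1 0.12 0.08 n",
    "utt1 1 0.20 0.15 i3",
    "utt2 1 0.00 0.10 h",
]
segments = parse_ctm_lines(example)
```

To apply it to a real file, pass an open file handle instead of the example list, e.g. `parse_ctm_lines(open("phone_ctm.txt", encoding="utf-8"))`.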

  3. Run
  python train/embedding/split_wavs.py

to split the data into training, test, and validation sets for embedding model training.

The test utterances used in the paper are listed in test_utts.json
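The internal structure of the JSON files mentioned in this README (utt2tones.json, test_utts.json) is not documented here, so the following is only a generic sketch for a first look at such a file: it reports the top-level JSON type and the number of entries. The helper name `summarize_json` and the sample string are hypothetical.

```python
import json


def summarize_json(text):
    """Return (top-level type name, number of entries) for a JSON document."""
    data = json.loads(text)
    kind = type(data).__name__
    # Only containers have a meaningful entry count; scalars count as 1.
    size = len(data) if isinstance(data, (dict, list)) else 1
    return kind, size


# Hypothetical content, for illustration only
summary = summarize_json('{"utt1": [1, 3], "utt2": [4]}')
```

For a real file, read it first, e.g. `summarize_json(open("test_utts.json", encoding="utf-8").read())`.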

  4. Run
  python train/train_embedding.py

to train the embedding model. The results are written to exp/.

The mel-spectrogram cache is generated at exp/cache/spectro/wav.scp and exp/cache/spectro/*.npy.
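A minimal sketch for reading the cache index, assuming wav.scp uses the usual Kaldi-style `key value` layout (one entry per line, key and path separated by whitespace). The README does not state the exact format, so treat this as an assumption; `parse_scp_lines` and the example entries are hypothetical.

```python
def parse_scp_lines(lines):
    """Map each key to its path, assuming one '<key> <path>' pair per line."""
    table = {}
    for line in lines:
        parts = line.strip().split(maxsplit=1)
        if len(parts) != 2:
            continue  # skip blank lines and lines without a path
        key, path = parts
        table[key] = path
    return table


# Hypothetical entries, for illustration only
scp = parse_scp_lines([
    "utt1 exp/cache/spectro/utt1.npy",
    "utt2 exp/cache/spectro/utt2.npy",
    "",
])
```

Each path can then be loaded with `numpy.load(path)` to get the cached array.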

[Optional] Train an end-to-end tone recognizer

After step 3,

  • Run the following in e2e_tone_recog/
  ./run.sh