PyTorch implementation of DNN-HSMM for TTS

License

Notifications You must be signed in to change notification settings

sp-nitech/DNN-HSMM


  • This software is distributed under the BSD 3-Clause license. Please see LICENSE for more details.
  • Paper: Keiichi Tokuda, Kei Hashimoto, Keiichiro Oura, and Yoshihiko Nankaku, "Temporal modeling in neural network based statistical parametric speech synthesis," 9th ISCA Speech Synthesis Workshop, pp. 113-118, September 2016. http://ssw9.talp.cat/papers/ssw9_OS2-2_Tokuda.pdf

Requirements

Usage

  • Running 00_data.sh creates serialized training and test data (npz) from pre-prepared linguistic and acoustic features (lab, lf0, mgc, bap). Before running it, edit the directory names (dnames) and dimensions (dims) in 00_data.sh to match your data.
  • Running 01_run.py trains a model and generates acoustic features. Before running it, edit featdims in Config.py to match your data. The generated features are written to the 'gen' directory.
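To illustrate the kind of serialization the data step performs, here is a minimal sketch that packs per-stream acoustic features into a single npz file. It is not the logic of 00_data.sh: the function names (load_float32, pack_utterance), the npz key "acoustic", and the assumption that each feature file is headerless float32 are all hypothetical, and linguistic features (lab) are left out for brevity.

```python
import numpy as np


def load_float32(path, dim):
    """Read a headerless float32 file and reshape it to (frames, dim).

    Assumption: features are stored as raw float32 values, frame by frame.
    """
    data = np.fromfile(path, dtype=np.float32)
    if data.size % dim != 0:
        raise ValueError(f"{path}: {data.size} values is not a multiple of dim={dim}")
    return data.reshape(-1, dim)


def pack_utterance(paths, dims, out_path):
    """Stack the streams column-wise and save them as one npz file.

    paths and dims are hypothetical dicts keyed by stream name (e.g. "mgc").
    """
    streams = [load_float32(paths[name], dims[name]) for name in dims]
    n_frames = min(s.shape[0] for s in streams)  # trim to the shortest stream
    acoustic = np.hstack([s[:n_frames] for s in streams])
    np.savez(out_path, acoustic=acoustic)  # "acoustic" is an illustrative key
    return acoustic.shape
```

The values in dims here play the same role as the dims setting mentioned above: if a dimension does not match the file contents, the reshape fails, which is why the settings must be adjusted to the data before running.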

Demo

  • Japanese (m001)
$ cd DNN-HSMM
$ wget http://hts.sp.nitech.ac.jp/archives/DNN-HSMM_demo-data/demo_data_Japanese.tar.gz
$ tar -zxvf demo_data_Japanese.tar.gz
$ cp demo_data_Japanese/00_data.sh .
$ cp demo_data_Japanese/Config.py .
$ bash 00_data.sh
$ python 01_run.py
  • English (slt)
$ cd DNN-HSMM
$ wget http://hts.sp.nitech.ac.jp/archives/DNN-HSMM_demo-data/demo_data_English.tar.gz
$ tar -zxvf demo_data_English.tar.gz
$ cp demo_data_English/00_data.sh .
$ cp demo_data_English/Config.py .
$ bash 00_data.sh
$ python 01_run.py

demo samples

The acoustic features included in the demo data were extracted using STRAIGHT. Please see the HTS demo script (http://hts.sp.nitech.ac.jp/) for how to synthesize waveforms from acoustic features.

Who we are
