huiw39/ExtensibleTTS-PyTorch

ExtensibleTTS-PyTorch

An extensible speech synthesis system built with PyTorch. The original code is from r9y9's https://github.com/r9y9/nnmnkwii_gallery . It makes it easy to train an acoustic model with popular architectures such as Tacotron's encoder, Deep Voice's encoder, the Transformer encoder, or any other model you create.

Quick Start

Dependencies

Prepare Dataset

Note: the repo requires wav files with aligned HTS-style full-context label files.
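To make the label requirement concrete, here is a minimal sketch of reading one line of an aligned HTS-style label file. It assumes the usual layout of `start end context`, with times in units of 100 ns, and it assumes state-aligned labels carry a bracketed state index at the end of the line. The names `LabelEntry` and `parse_aligned_label_line` are hypothetical and are not part of this repo.

```python
import re
from typing import NamedTuple, Optional

# HTS label files store times in units of 100 nanoseconds.
HTS_TIME_UNIT_SEC = 1e-7


class LabelEntry(NamedTuple):
    start_sec: float
    end_sec: float
    context: str           # full-context label string
    state: Optional[int]   # state index for state-aligned labels, else None


def parse_aligned_label_line(line: str) -> LabelEntry:
    """Parse one 'start end context' line (hypothetical helper)."""
    start, end, context = line.split()
    # Assumption: state-aligned labels end with a bracketed index, e.g. "[2]".
    match = re.search(r"\[(\d+)\]$", context)
    state = None
    if match:
        state = int(match.group(1))
        context = context[: match.start()]
    return LabelEntry(
        int(start) * HTS_TIME_UNIT_SEC,
        int(end) * HTS_TIME_UNIT_SEC,
        context,
        state,
    )
```

A phone-aligned line simply has no bracketed suffix, so `state` comes back as `None`.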

  1. Download a dataset

    cmu_slt_arctic

  2. Unpack the dataset into ~/ExtensibleTTS-PyTorch/datasets

    After unpacking, your tree should look like this for cmu_slt_arctic:

    ExtensibleTTS-PyTorch   
      |- datasets    
          |- slt_arctic_full_data
              |- label_phone_align
              |- label_state_align
              |- wav
              |- file_id_list_full.scp
              |- questions-radio_dnn_416.hed
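
Before preprocessing, it can help to confirm that the unpacked tree matches the layout above. The following is a minimal sketch only; the helper name `missing_dataset_entries` is hypothetical and not part of this repo, and the entry names are copied from the tree shown above.

```python
from pathlib import Path

# Entries expected inside slt_arctic_full_data, copied from the tree above.
EXPECTED_ENTRIES = [
    "label_phone_align",
    "label_state_align",
    "wav",
    "file_id_list_full.scp",
    "questions-radio_dnn_416.hed",
]


def missing_dataset_entries(data_root):
    """Return the expected entries that do not exist under data_root."""
    root = Path(data_root)
    return [name for name in EXPECTED_ENTRIES if not (root / name).exists()]
```

For example, `missing_dataset_entries(Path.home() / "ExtensibleTTS-PyTorch" / "datasets" / "slt_arctic_full_data")` returns an empty list when the dataset is unpacked as shown.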
    

Training

  1. Preprocess the data to extract linguistic/duration/acoustic features

    python preprocess.py --label state_align

    • Use --label phone_align to preprocess the phone-aligned labels instead

  2. Compute the min/max/mean/var/scale values of the data for input/output feature normalization

    python norm_params.py

  3. Train a model

    python train_dnn.py --train_model duration

    • Use --train_model acoustic to train an acoustic model

  4. Synthesize speech waveforms from labels using a duration checkpoint and an acoustic checkpoint

    python synthesis.py --label state_align --duration_checkpint * --acoustic_checkpint *

  5. Resume training from a checkpoint

    python train.py --restore_step *
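
The normalization step above computes min/max/mean/var/scale values per feature dimension. As a rough illustration of what such statistics are and how they are typically applied, here is a dependency-free sketch. It is not the code in `norm_params.py`. It assumes that `scale` means the standard deviation, and that min-max scaling is applied to inputs and mean-variance scaling to outputs, as is common in nnmnkwii-style recipes. All function names and the 0.01 to 0.99 range are assumptions.

```python
import math


def compute_norm_stats(frames):
    """Per-dimension statistics over a non-empty list of equal-length frames."""
    n = len(frames)
    dims = range(len(frames[0]))
    mins = [min(f[d] for f in frames) for d in dims]
    maxs = [max(f[d] for f in frames) for d in dims]
    means = [sum(f[d] for f in frames) / n for d in dims]
    variances = [sum((f[d] - means[d]) ** 2 for f in frames) / n for d in dims]
    # Assumption: scale is the standard deviation; constant dimensions get 1.0
    # so that later division is safe.
    scales = [math.sqrt(v) if v > 0 else 1.0 for v in variances]
    return {"min": mins, "max": maxs, "mean": means, "var": variances, "scale": scales}


def minmax_normalize(frame, stats, lo=0.01, hi=0.99):
    """Map each dimension from [min, max] to [lo, hi] (assumed input scaling)."""
    out = []
    for x, mn, mx in zip(frame, stats["min"], stats["max"]):
        span = mx - mn
        out.append(lo + (x - mn) * (hi - lo) / span if span > 0 else lo)
    return out


def meanvar_normalize(frame, stats):
    """Subtract the mean and divide by the scale (assumed output scaling)."""
    return [(x - m) / s for x, m, s in zip(frame, stats["mean"], stats["scale"])]
```

In practice these statistics would be accumulated over every frame of the training set and stored, so that the same values are reused at training time and when denormalizing predictions at synthesis time.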

WIP

  • combined with MTTS, the Mandarin frontend
  • batch inference for synthesis speedup
  • scheduled sampling
  • model pruning

Reference
