This is the PyTorch re-implementation of NSF models.
Two resources to mention:
Please visit the NSF home page. It includes:
- Audio samples for each NSF model
- Reference list
Detailed hands-on tutorials on NSF models are available in ../../tutorials/b1_neural_vocoder/. These tutorials are highly recommended. There is no need to set up the environment, and the notebooks can run on Google Colab!
Note that the tutorial chapter chapter_a3_pretrained_vocoders.ipynb includes NSF models pre-trained on VoxCeleb2 dev and other speech datasets.
Not all the models are re-implemented.
| - DATA: folder to store data
| - cyc-noise-nsf-4: cyclic-noise hn-sinc-NSF
| - hn-nsf: harmonic-plus-noise NSF
| - hn-sinc-nsf-9: harmonic-plus-noise NSF with a trainable sinc filter
| - hn-sinc-nsf-10: hn-sinc-nsf-9 with the BLSTM in condition module replaced by CNNs
| - hn-sinc-nsf-hifigan: hn-sinc-nsf-9 + HiFi-GAN discriminator
step.1 choose one project
cd hn-nsf
step.2 load dependency and PYTHONPATH
source ../../../
step.3 run script
This script will:
- download the CMU database and pre-extracted features,
- generate audio using a pre-trained model and the pre-extracted features,
- train a new model on the CMU data.
Pre-trained models are either included in __pre-trained
or downloaded through
. Training may take a few days or more. You may run the script
in the background:
bash >log_batch 2>&1 &
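The background launch with log redirection can also be sketched from Python. This is a minimal illustration, not part of the project scripts: the helper name run_in_background is hypothetical, and the command in the example is a placeholder standing in for the actual run script. Only the log file name log_batch is taken from the command above.

```python
import subprocess
import sys


def run_in_background(cmd, log_path):
    """Start cmd without waiting for it; write stdout and stderr to log_path.

    Roughly mirrors the shell form `cmd >log_path 2>&1 &`.
    """
    log_file = open(log_path, "w")
    # stderr=subprocess.STDOUT merges stderr into the same log file (2>&1).
    proc = subprocess.Popen(cmd, stdout=log_file, stderr=subprocess.STDOUT)
    # The child process keeps its own handle, so the parent can close its copy.
    log_file.close()
    return proc


if __name__ == "__main__":
    # Placeholder command (an assumption), used here instead of the run script.
    placeholder_cmd = [sys.executable, "-c", "print('training started')"]
    proc = run_in_background(placeholder_cmd, "log_batch")
    print("started process", proc.pid)
```

Progress can then be checked by reading log_batch while the job runs.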
- To accelerate training: the default script uses
torch.backends.cudnn.deterministic = True
and torch.backends.cudnn.benchmark = False
for reproducibility. If you want to accelerate training, add options to the command line in
python --num-workers 10 --cudnn-deterministic-toggle --cudnn-benchmark-toggle
This will set torch.backends.cudnn.deterministic = False
and torch.backends.cudnn.benchmark = True.
- To use a batch-size > 1:
python --num-workers 10 --batch-size N
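How the toggle options map onto the cuDNN settings can be sketched as below. This is an illustration under stated assumptions, not the project's actual argument parser: the function names build_parser and apply_cudnn_flags are hypothetical, the default of 0 for --num-workers is assumed, and the default batch size of 1 is inferred from the text above. The option names and the resulting values of torch.backends.cudnn.deterministic and torch.backends.cudnn.benchmark follow the description above.

```python
import argparse


def build_parser():
    """Hypothetical parser covering the options mentioned above."""
    parser = argparse.ArgumentParser(description="Sketch of training options")
    parser.add_argument("--num-workers", type=int, default=0)  # assumed default
    parser.add_argument("--batch-size", type=int, default=1)
    # Each toggle flips its default when it appears on the command line.
    parser.add_argument("--cudnn-deterministic-toggle",
                        action="store_false", default=True)
    parser.add_argument("--cudnn-benchmark-toggle",
                        action="store_true", default=False)
    return parser


def apply_cudnn_flags(args):
    """Copy the parsed values into PyTorch; return False if torch is missing."""
    try:
        import torch
    except ImportError:
        return False
    torch.backends.cudnn.deterministic = args.cudnn_deterministic_toggle
    torch.backends.cudnn.benchmark = args.cudnn_benchmark_toggle
    return True


if __name__ == "__main__":
    args = build_parser().parse_args()
    apply_cudnn_flags(args)
    print("deterministic:", args.cudnn_deterministic_toggle)
    print("benchmark:", args.cudnn_benchmark_toggle)
```

With no options, the sketch keeps the reproducible defaults (deterministic on, benchmark off). Passing both toggles switches to the faster, non-deterministic configuration.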
If you have any questions, please contact Xin.
That's all