This is the PyTorch re-implementation of NSF models.
Two resources to mention:
- Please visit the NSF home page https://nii-yamagishilab.github.io/samples-nsf/. It includes
  - Audio samples for each NSF model
  - Reference list
- Detailed hands-on tutorials on NSF models are available at ../../tutorials/b1_neural_vocoder/README.md. These tutorials are highly recommended. There is no need to set up the environment, and the notebooks can run on Google Colab!
- Note that the tutorial chapter chapter_a3_pretrained_vocoders.ipynb includes NSF models pre-trained on VoxCeleb2 dev and other speech datasets.
Not all the models are re-implemented.
| - DATA: folder to store data
|
| - cyc-noise-nsf-4: cyclic-noise hn-sinc-NSF
|
| - hn-nsf: harmonic-plus-noise NSF
|
| - hn-sinc-nsf-9: harmonic-plus-noise NSF with a trainable sinc filter
|
| - hn-sinc-nsf-10: hn-sinc-nsf-9 with the BLSTM in condition module replaced by CNNs
|
| - hn-sinc-nsf-hifigan: hn-sinc-nsf-9 + HiFi-GAN discriminator
Step 1. Choose one project
cd hn-nsf
Step 2. Load dependencies and set PYTHONPATH
source ../../../env.sh
Step 3. Run the script
bash 00_demo.sh
This script will
- download the CMU database and pre-extracted features.
- generate audio using a pre-trained model and the pre-extracted features.
- train a new model on the CMU data.
Pre-trained models are either included in __pre-trained or downloaded through 00_demo.sh. Training may take a few days or more. You may run 00_demo.sh in the background:
bash 00_demo.sh >log_batch 2>&1 &
- To accelerate training: the default script uses torch.backends.cudnn.deterministic = True and torch.backends.cudnn.benchmark = False for reproducibility (https://pytorch.org/docs/stable/notes/randomness.html). If you want to accelerate training, add these options to the command line in 00_demo.sh:
python main.py --num-workers 10 --cudnn-deterministic-toggle --cudnn-benchmark-toggle
This will set torch.backends.cudnn.deterministic = False and torch.backends.cudnn.benchmark = True, which trades exact run-to-run reproducibility for speed.
- To use a batch-size > 1:
python main.py --num-workers 10 --batch-size N
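The effect of the two toggle options can be pictured with a small sketch. This is not the parsing code in main.py; it is a minimal illustration that assumes each toggle simply flips the reproducibility-oriented default. The helper names resolve_cudnn_flags and apply_cudnn_flags are hypothetical, and the default of 0 for --num-workers is an assumption.

```python
import argparse


def resolve_cudnn_flags(argv):
    """Map the command-line options to cuDNN settings (illustrative only)."""
    parser = argparse.ArgumentParser()
    # Assumed defaults: 0 worker processes, batch size 1.
    parser.add_argument("--num-workers", type=int, default=0)
    parser.add_argument("--batch-size", type=int, default=1)
    # Each toggle flips a default chosen for reproducibility.
    parser.add_argument("--cudnn-deterministic-toggle", action="store_true")
    parser.add_argument("--cudnn-benchmark-toggle", action="store_true")
    args = parser.parse_args(argv)
    return {
        # Default True; becomes False when the toggle is given.
        "deterministic": not args.cudnn_deterministic_toggle,
        # Default False; becomes True when the toggle is given.
        "benchmark": args.cudnn_benchmark_toggle,
        "num_workers": args.num_workers,
        "batch_size": args.batch_size,
    }


def apply_cudnn_flags(flags):
    """Apply the resolved flags if PyTorch is installed; return whether it was."""
    try:
        import torch
    except ImportError:
        return False
    torch.backends.cudnn.deterministic = flags["deterministic"]
    torch.backends.cudnn.benchmark = flags["benchmark"]
    return True


if __name__ == "__main__":
    fast = resolve_cudnn_flags(
        ["--num-workers", "10",
         "--cudnn-deterministic-toggle", "--cudnn-benchmark-toggle"]
    )
    print(fast)
```

Calling resolve_cudnn_flags([]) yields the reproducible defaults (deterministic on, benchmark off), while passing both toggles yields the faster, non-deterministic configuration.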
If you have any questions, please contact Xin.
That's all