The vocoders' training data is provided and used with permission from the following organizations, societies, or individuals:
The following public datasets are used:
| Dataset | Link |
|---|---|
| Opencpop | https://wenet.org.cn/opencpop/ |
| CCMUSIC | https://ccmusic-database.github.io/index.html |
| SingingVoiceDataset | http://isophonics.net/SingingVoiceDataset |
The model weights are licensed under CC BY-NC-SA 4.0. Anyone who distributes the model weights should include a copy of the license, a notice stating that the models are provided by the OpenVPI Community (or DiffSinger Community), and a link to this page (or a complete contribution list).
| Model | Date | Specifications | Dataset | Iterations | Link |
|---|---|---|---|---|---|
| NSF-HiFiGAN | 2022-12-11 | 44.1 kHz sampling rate, hop size 512, 128 mel bins, input frequency 40-16000 Hz | ~93h singing | >= 1M | link |
| NSF-HiFiGAN | 2024-02-19 | 44.1 kHz sampling rate, hop size 512, 128 mel bins, input frequency 40-16000 Hz | ~72h singing (for fine-tuning) | 110K | link |
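
The specifications in the table fix the time resolution of the mel input, which matters when preparing features for these vocoders. The following is a minimal sketch of that arithmetic, not part of any released code. The `VocoderSpec` class and its field names are hypothetical. The frequency range is assumed to be in Hz. The frame-count rule (one frame per hop, with a padded final hop) is an assumption, since the table does not state the FFT or window size.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class VocoderSpec:
    """Hypothetical container for the values listed in the table."""
    sample_rate: int = 44100   # 44.1 kHz sampling rate
    hop_size: int = 512        # samples between consecutive mel frames
    num_mels: int = 128        # mel bins per frame
    fmin: float = 40.0         # lower input frequency, assumed Hz
    fmax: float = 16000.0      # upper input frequency, assumed Hz

    @property
    def frames_per_second(self) -> float:
        # Each frame advances by hop_size samples.
        return self.sample_rate / self.hop_size

    @property
    def frame_duration_ms(self) -> float:
        return 1000.0 * self.hop_size / self.sample_rate

    def num_frames(self, num_samples: int) -> int:
        # Assumption: one frame per hop, final partial hop is padded.
        return math.ceil(num_samples / self.hop_size)

    def num_samples(self, num_frames: int) -> int:
        # Length of the waveform covered by num_frames mel frames.
        return num_frames * self.hop_size


spec = VocoderSpec()
print(f"{spec.frames_per_second:.2f} frames/s, "
      f"{spec.frame_duration_ms:.2f} ms per frame")
print("frames for 1 s of audio:", spec.num_frames(spec.sample_rate))
```

Under these assumptions each mel frame covers roughly 11.6 ms of audio, and the upper frequency bound of 16000 Hz sits below the 22050 Hz Nyquist limit of a 44.1 kHz signal.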