This repository has been archived by the owner on Nov 1, 2024. It is now read-only.

Discrete Resynthesis example

In resynth.py we give a simple demonstration of audio resynthesis using HuBERT-based discrete pseudo-units. The code closely follows the unit2speech module of GSLM.

How to run

Below is an example of running the script:

python resynth.py --input test_input.wav --output=test_output.wav --vocab_size=100 --decoder_steps=500

resynth.py supports the following command-line arguments:

  • --dense_model_name: name of the dense representation model to be used (supported: hubert-base-ls960 and cpc-big-ll6k);
  • --input: the input audio file (must have a sample rate of 16 kHz);
  • --output: the output file name;
  • --vocab_size: the size of the quantization vocabulary to be used (one of 50, 100, 200);
  • --decoder_steps: determines the maximum duration of the produced audio.
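
Since the input must be sampled at 16 kHz and --vocab_size accepts only 50, 100, or 200, it can help to check both before launching the script. The following is a minimal sketch, not part of the repository: the helper names wav_sample_rate and build_resynth_command are hypothetical, the default values are simply taken from the example command above (they are not necessarily the script's own defaults), and the standard-library wave module only reads PCM WAV files.

```python
import wave

REQUIRED_RATE = 16_000            # the README states that the input must be 16 kHz
ALLOWED_VOCAB_SIZES = (50, 100, 200)


def wav_sample_rate(path):
    """Return the sample rate in Hz of a PCM WAV file."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def build_resynth_command(input_path, output_path, vocab_size=100, decoder_steps=500):
    """Validate the arguments and return the command line as a list of strings.

    Hypothetical helper: the defaults mirror the example command in this README,
    not necessarily the defaults defined inside resynth.py.
    """
    if vocab_size not in ALLOWED_VOCAB_SIZES:
        raise ValueError(f"vocab_size must be one of {ALLOWED_VOCAB_SIZES}, got {vocab_size}")

    rate = wav_sample_rate(input_path)
    if rate != REQUIRED_RATE:
        raise ValueError(f"{input_path} has a sample rate of {rate} Hz, expected {REQUIRED_RATE} Hz")

    return [
        "python", "resynth.py",
        "--input", input_path,
        "--output", output_path,
        "--vocab_size", str(vocab_size),
        "--decoder_steps", str(decoder_steps),
    ]

# Example usage (not executed here):
#   import subprocess
#   subprocess.run(build_resynth_command("test_input.wav", "test_output.wav"), check=True)
```

Returning the command as a list rather than a single string lets it be passed directly to subprocess.run without shell quoting concerns.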