# Generative Spoken Language Modeling pipeline

## Retrieve a language model

Suppose you want to experiment with a pre-trained language model trained on HuBERT representations, quantized with a codebook of size 100. First, download and unpack the model:

```bash
mkdir LM/
wget https://dl.fbaipublicfiles.com/textless_nlp/gslm/hubert/lm_km100/hubert100_lm.tgz -O LM/hubert100_lm.tgz
cd LM/ && tar -xvf hubert100_lm.tgz
```

(Other checkpoints can be found in the Textless NLP GSLM release.)
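
If you prefer to fetch the checkpoint from Python instead of the shell, a minimal equivalent sketch (same URL and target directory as above; not part of the released tooling) could look like this:

```python
# Download and unpack the pre-trained HuBERT-100 language model into LM/.
import tarfile
import urllib.request
from pathlib import Path

lm_dir = Path("LM")
lm_dir.mkdir(exist_ok=True)
archive = lm_dir / "hubert100_lm.tgz"

# Same checkpoint URL as in the wget command above.
urllib.request.urlretrieve(
    "https://dl.fbaipublicfiles.com/textless_nlp/gslm/hubert/lm_km100/hubert100_lm.tgz",
    archive,
)

# Extract the archive next to it, mirroring `tar -xvf`.
with tarfile.open(archive) as tar:
    tar.extractall(lm_dir)
```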

## Run speech continuation on a file

To run the speech continuation pipeline with the model downloaded above, use the following command:

```bash
python sample.py \
	--language-model-data-dir=LM/hubert100_lm \
	--input-file 174-84280-0004.flac \
	--output-file output_new.wav \
	--prompt-duration-sec=3 \
	--temperature=0.7 \
	--vocab-size=100
```
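
Here `--prompt-duration-sec` presumably controls how many seconds of the input audio are used as the prompt, and `--temperature` sets the sampling temperature of the language model. If you want to continue several prompts with the same settings, a simple sketch is to call `sample.py` in a loop; the `prompts/` and `continuations/` directories below are assumptions for illustration, and the flags are copied unchanged from the command above:

```python
# Run the speech continuation command for every .flac prompt in prompts/,
# writing one continuation per prompt into continuations/.
import subprocess
from pathlib import Path

out_dir = Path("continuations")
out_dir.mkdir(exist_ok=True)

for prompt in sorted(Path("prompts").glob("*.flac")):
    subprocess.run(
        [
            "python", "sample.py",
            "--language-model-data-dir=LM/hubert100_lm",
            "--input-file", str(prompt),
            "--output-file", str(out_dir / f"{prompt.stem}_continuation.wav"),
            "--prompt-duration-sec=3",
            "--temperature=0.7",
            "--vocab-size=100",
        ],
        check=True,  # stop on the first failing invocation
    )
```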