TTS_dataset_creator

Create a TTS dataset from a list of YouTube links.

Setup repo

git clone https://github.com/m-bain/whisperX
git clone https://github.com/GregorR/rnnoise-models.git

Install ffmpeg > 5.0

ffmpeg newer than 5.0 is needed for RNNoise support. To use a different RNNoise model, change the RNNoise model path in the bash script.

conda install -c conda-forge ffmpeg==5.1.0
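To confirm that the ffmpeg on your PATH meets the version requirement, a small check like the following may help. This is a minimal sketch and not part of the repository: the function names are made up for illustration, and it treats 5.0 itself as acceptable, which is an assumption.

```python
import re
import shutil
import subprocess


def parse_ffmpeg_version(version_output):
    """Return (major, minor) parsed from `ffmpeg -version` output, or None."""
    # Release builds print e.g. "ffmpeg version 5.1.2 ..." or "ffmpeg version n5.1.2 ...".
    match = re.search(r"ffmpeg version n?(\d+)\.(\d+)", version_output)
    if match is None:
        return None  # e.g. snapshot builds with a non-numeric version string
    return int(match.group(1)), int(match.group(2))


def installed_ffmpeg_version():
    """Return (major, minor) of the ffmpeg found on PATH, or None if unavailable."""
    if shutil.which("ffmpeg") is None:
        return None
    result = subprocess.run(
        ["ffmpeg", "-version"], capture_output=True, text=True, check=False
    )
    return parse_ffmpeg_version(result.stdout)


def is_new_enough(version, minimum=(5, 0)):
    """True if a (major, minor) version is at least `minimum`."""
    return version is not None and version >= minimum
```

Calling is_new_enough(installed_ffmpeg_version()) returns False when ffmpeg is missing, too old, or reports a version string the parser does not recognise.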

Setup whisperx

$ git clone https://github.com/m-bain/whisperX.git
$ cd whisperX
$ pip install -e .

Speaker Diarization

To enable speaker diarization, pass your Hugging Face access token, which you can generate in your Hugging Face account settings, after the --hf_token argument, and accept the user agreement for the following models: Segmentation, Voice Activity Detection (VAD), and Speaker Diarization.

Set this hf_token in the youtube_to_vctk.sh script.
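For reference, the token ends up on the whisperx command line. The sketch below shows one way such a command could be assembled; it is an illustration only and does not reproduce what youtube_to_vctk.sh does. The HF_TOKEN environment variable and the function name are assumptions made for this example; --diarize and --hf_token are whisperx command-line options.

```python
import os


def build_whisperx_command(audio_path, hf_token=None):
    """Build an argv list for a whisperx run with speaker diarization enabled."""
    # Fall back to the HF_TOKEN environment variable (a name chosen for this sketch).
    token = hf_token or os.environ.get("HF_TOKEN")
    if not token:
        raise ValueError("a Hugging Face access token is required for diarization")
    return ["whisperx", audio_path, "--diarize", "--hf_token", token]
```

Reading the token from the environment keeps it out of files that might be committed by accident; the resulting list can be passed to subprocess.run.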

Run crawling script

Add all the links to a single .txt file, then run youtube_to_vctk.sh links.txt out_folder_name
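Before a long crawl it can be worth cleaning the links file. The following is a minimal sketch that assumes one URL per line (the format is not specified here); the function name and the host list are illustrative and not part of the repository.

```python
from urllib.parse import urlparse

# Hosts accepted as YouTube links in this sketch (an assumption, extend as needed).
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com", "youtu.be"}


def clean_links(lines):
    """Return unique YouTube URLs from an iterable of lines, in first-seen order."""
    seen = set()
    links = []
    for raw in lines:
        url = raw.strip()
        if not url:
            continue  # skip blank lines
        host = urlparse(url).netloc.lower()
        if host not in YOUTUBE_HOSTS:
            continue  # skip non-YouTube links and lines without a scheme
        if url not in seen:
            seen.add(url)
            links.append(url)
    return links
```

A typical use would be to read links.txt, pass its lines through clean_links, and write the result back before running the script.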

The VCTK-format dataset will be written to the output folder.
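A quick way to sanity-check the result is to count audio clips per folder. The sketch below assumes the output folder holds .wav files grouped into per-speaker subfolders, as in the original VCTK corpus; the exact layout produced by the script is not documented here, so treat the grouping as an assumption.

```python
from collections import Counter
from pathlib import Path


def count_wavs_per_folder(out_dir):
    """Count .wav files under out_dir, keyed by the name of their parent folder."""
    counts = Counter()
    for wav in Path(out_dir).rglob("*.wav"):
        counts[wav.parent.name] += 1
    return dict(counts)
```

Folders with very few clips may point to links that failed to download or to speakers that were rarely detected.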
