thetobysiu/witcher-3-data-pre-processing: Process the raw Witcher 3 script and exported audio from a single folder into separate folders named after each character

Intro

This is used for pre-processing Witcher 3 audio and dialog data.

The repository contains five scripts, one for each step:

1_convert_csv.py converts w3dialog_id.txt into audio.csv.
2_move_audio.py moves audio from a single wav folder into folders named after each character.
3_verify_files.py adds a column to audio.csv indicating whether each audio file exists in the audio folder.
4_separate_csv.py splits audio.csv into one CSV per character, containing only that character's dialog.
5_create_script.py generates a dialog.txt suitable for GPT-2 fine-tuning.
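
Steps 2 to 4 can be sketched roughly as follows. This is not the code from the repository, only an illustration of the idea. It assumes each row of audio.csv has `id`, `character` and `text` columns and that each audio file is named `<id>.wav`. Those column names, the file naming and all function names are hypothetical, since the README does not describe the actual CSV layout.

```python
import csv
import shutil
from collections import defaultdict
from pathlib import Path

# NOTE: the "id" and "character" column names and the "<id>.wav" naming
# are assumptions for illustration; the real audio.csv may differ.


def move_audio(rows, wav_dir, out_dir):
    """Step 2 sketch: move <wav_dir>/<id>.wav into <out_dir>/<character>/."""
    wav_dir, out_dir = Path(wav_dir), Path(out_dir)
    for row in rows:
        src = wav_dir / f"{row['id']}.wav"
        if not src.is_file():
            continue  # missing files are flagged later by verify_files
        dest_dir = out_dir / row["character"]
        dest_dir.mkdir(parents=True, exist_ok=True)
        shutil.move(str(src), str(dest_dir / src.name))


def verify_files(rows, out_dir):
    """Step 3 sketch: return copies of the rows with an added 'exists' column."""
    out_dir = Path(out_dir)
    return [
        {**row, "exists": (out_dir / row["character"] / f"{row['id']}.wav").is_file()}
        for row in rows
    ]


def separate_by_character(rows):
    """Step 4 sketch: group rows by character name."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["character"]].append(row)
    return dict(groups)


def write_character_csvs(groups, out_dir):
    """Write one CSV per character into that character's folder."""
    out_dir = Path(out_dir)
    for character, rows in groups.items():
        char_dir = out_dir / character
        char_dir.mkdir(parents=True, exist_ok=True)
        csv_path = char_dir / f"{character}.csv"
        with open(csv_path, "w", newline="", encoding="utf-8") as fh:
            writer = csv.DictWriter(fh, fieldnames=list(rows[0].keys()))
            writer.writeheader()
            writer.writerows(rows)
```

The rows could be loaded with `csv.DictReader` from audio.csv and passed through the functions in order. Keeping each step as a separate function mirrors the one-script-per-step layout described above, so a step can be rerun without repeating the earlier ones.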

.gitignore
1_convert_csv.py
2_move_audio.py
3_verify_files.py
4_separate_csv.py
5_create_script.py
README.md
audio.csv
convert.sh
dialog.txt
w3dialog_id.txt