ChatterChop is a package for word-level audio segmentation. It combines two techniques: Whisper transcription and PyTorch forced alignment. It grew out of the utilities I needed for my research.
ToDo: Install using pip
pip install chatterchop
Alternatively,
- Clone the repo:
git clone https://github.com/najdamikolaj00/ChatterChop.git
- Navigate to the repo:
cd ChatterChop
- Install the repo:
pip3 install .
Example: the input test_audio_eng.mp4 is segmented into one clip per word:
- THE_0.mp4
- BLUE_1.mp4
- SPOT_2.mp4
- and so on...
Audio source: Walden, Patrick R (2020), “Perceptual Voice Qualities Database (PVQD)”, Mendeley Data, V1, doi: 10.17632/9dz247gnyb.1
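The clip names above suggest one file per word, named with the upper-cased word and its position in the utterance. Below is a minimal sketch of that cutting step, assuming the alignment stage yields `(word, start_seconds, end_seconds)` tuples. The function name `slice_words` and the tuple layout are hypothetical and are not taken from ChatterChop's code.

```python
def slice_words(samples, sample_rate, alignments):
    """Cut a mono sample sequence into per-word clips.

    alignments: iterable of (word, start_s, end_s) tuples (assumed layout).
    Returns a list of (clip_name, clip_samples) in input order.
    """
    clips = []
    for index, (word, start_s, end_s) in enumerate(alignments):
        # Convert times to sample offsets and clamp them to the signal bounds.
        start = max(0, int(round(start_s * sample_rate)))
        end = min(len(samples), int(round(end_s * sample_rate)))
        if end <= start:
            continue  # skip empty or inverted spans
        clips.append((f"{word.upper()}_{index}", samples[start:end]))
    return clips


# Toy signal: 1 second of audio at 10 samples per second.
signal = list(range(10))
words = [("the", 0.0, 0.2), ("blue", 0.2, 0.6), ("spot", 0.6, 1.0)]
clips = slice_words(signal, 10, words)
```

Here `clips` holds three entries named `THE_0`, `BLUE_1` and `SPOT_2`. Real audio would use a tensor or array instead of a list, but the slicing logic is the same.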
|-- chatterchop/
| |-- tutorial/
| | |-- ChatterChop.py
from chatterchop.ChatterChopper import ChatterChopper
# Polish example: Create an object with a path to audio, and provide a path to the output directory to save segmented samples.
test_audio_path_pl = 'data_to_test/test_pl/test_audio_pl.wav'
output_dir_pl = 'data_to_test/test_pl/test_split_pl'
test_obj_pl = ChatterChopper(test_audio_path_pl)
test_obj_pl.chop_chatter()
test_obj_pl.save_speech_segments(output_dir_pl)
# English example: Create an object with a path to audio, and provide a path to the output directory to save segmented samples.
test_audio_path_eng = 'data_to_test/test_eng/test_audio_eng.wav'
output_dir_eng = 'data_to_test/test_eng/test_split_eng'
test_obj_eng = ChatterChopper(test_audio_path_eng)
test_obj_eng.chop_chatter()
test_obj_eng.save_speech_segments(output_dir_eng)
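The two examples above repeat the same three calls, so a folder of recordings can be processed in a loop. This is a rough sketch that relies only on the calls shown above (`ChatterChopper(path)`, `chop_chatter()`, `save_speech_segments(output_dir)`). The helper `chop_directory`, its `make_chopper` parameter and the `<name>_split` folder naming are my own assumptions, not part of the package.

```python
from pathlib import Path


def chop_directory(input_dir, output_root, make_chopper):
    """Segment every .wav file in input_dir into its own output folder.

    make_chopper: callable that takes an audio path and returns an object
    with chop_chatter() and save_speech_segments(output_dir), for example
    the ChatterChopper class itself.
    """
    processed = []
    for audio_path in sorted(Path(input_dir).glob("*.wav")):
        # One output folder per recording, e.g. test_audio_eng_split.
        output_dir = Path(output_root) / f"{audio_path.stem}_split"
        output_dir.mkdir(parents=True, exist_ok=True)
        chopper = make_chopper(str(audio_path))
        chopper.chop_chatter()
        chopper.save_speech_segments(str(output_dir))
        processed.append(audio_path.name)
    return processed
```

With the package installed, a call would look like `chop_directory('data_to_test/test_eng', 'data_to_test/out', ChatterChopper)`. Passing the class in as an argument keeps the helper easy to try out with a stand-in object.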
from chatterchop.ChatterChopper import ChatterChopper
test_audio_path = 'data_to_test/test_audio_pl_shorter.wav'
test_ground_truth_path = 'data_to_test/test_transcription_ground_truth.txt'
# Option 1: Create an object with just a path to audio, and provide a ground truth transcript as a path to a file, get transcription accuracy and save transcription to a text file.
test_obj_1 = ChatterChopper(test_audio_path)
transcription_result = test_obj_1.get_transcription_accuracy(test_ground_truth_path)
print(transcription_result)
test_obj_1.save_transcription('data_to_test/saved_trans.txt')
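# Option 2: Create an object with a path to audio and transcription, and provide a ground truth transcript as a path to a file to get transcription accuracy.
# Assumption: the original snippet never defines test_transcription_file; here it points at the transcript saved in Option 1.
test_transcription_file = 'data_to_test/saved_trans.txt'
test_obj_2 = ChatterChopper(test_audio_path, test_transcription_file)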
transcription_result = test_obj_2.get_transcription_accuracy(test_ground_truth_path)
print(transcription_result)
# Option 3: Create an object with a path to audio and transcription, and provide a ground truth transcript as a string to get transcription accuracy.
test_obj_3 = ChatterChopper(test_audio_path, test_transcription_file)
test_ground_truth = 'Warszawa jest pełnym sprzeczności przez wielu niezniszczalnym.'
transcription_result = test_obj_3.get_transcription_accuracy(test_ground_truth)
print(transcription_result)
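The examples above do not say which metric `get_transcription_accuracy` reports. A common way to compare a transcript with a ground truth is the word error rate (WER), so here is a small sketch of that idea for reference. The function `word_error_rate` is hypothetical and may differ from what the package actually computes; it also ignores punctuation handling.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")
    # previous[j]: edit distance between the reference words seen so far
    # and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(min(
                previous[j] + 1,         # reference word missing from hypothesis
                current[j - 1] + 1,      # extra word in hypothesis
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1] / len(ref)


wer = word_error_rate("the blue spot is on the key", "the blue spot on a key")
```

In this toy case the hypothesis drops one word and swaps another, so `wer` is 2/7. An accuracy score could then be reported as `1 - wer`, assuming that is the convention you want.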