ChatterChop Logo

ChatterChop is a package for word-level audio segmentation. It combines two techniques: Whisper transcription and PyTorch forced alignment. It grew out of utilities I needed for my research.

Workflow

ChatterChop Workflow
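To make the last step of the workflow concrete, here is a minimal sketch of cutting a signal into per-word pieces once word boundaries are known. It is not ChatterChop's internal code: the function name `chop_words`, the `(word, start_s, end_s)` span format, and the toy signal are all assumptions made for illustration. In practice the boundaries would come from the transcription and forced-alignment stages.

```python
def chop_words(samples, sample_rate, word_spans):
    """Return one (word, slice_of_samples) pair per (word, start_s, end_s) span."""
    segments = []
    for word, start_s, end_s in word_spans:
        # Convert seconds to sample indices and clamp them to the signal length.
        start = max(0, int(round(start_s * sample_rate)))
        end = min(len(samples), int(round(end_s * sample_rate)))
        segments.append((word, samples[start:end]))
    return segments


# Toy signal: one second of "audio" at 10 samples per second (hypothetical values).
samples = list(range(10))
spans = [("the", 0.0, 0.2), ("blue", 0.2, 0.6), ("spot", 0.6, 1.0)]

segments = chop_words(samples, 10, spans)
for word, chunk in segments:
    print(word, chunk)
```

The same slicing works on a NumPy array or a 1-D tensor, since only `len()` and slice indexing are used.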

🎤 Installation 🎤

ToDo: Install using pip

pip install chatterchop

Alternatively,

  • Clone the repo: git clone https://github.com/najdamikolaj00/ChatterChop.git
  • Navigate to the repo: cd ChatterChop
  • Install the package: pip3 install .

Showcase (short words such as "the" still need a bit of work). Turn audio on 📢

test_audio_eng.mp4
THE_0.mp4
BLUE_1.mp4
SPOT_2.mp4

and so on.

Audio source: Walden, Patrick R. (2020), “Perceptual Voice Qualities Database (PVQD)”, Mendeley Data, V1, doi: 10.17632/9dz247gnyb.1
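The showcase clips follow a WORD_index naming pattern (THE_0, BLUE_1, SPOT_2). A small sketch of how such names could be generated is shown here. The helper `segment_filenames` is hypothetical, and the `.wav` extension is an assumption (the showcase clips are `.mp4` files, but the tutorial inputs are `.wav`).

```python
from pathlib import Path


def segment_filenames(words, output_dir, extension=".wav"):
    """Build one path per word, named <WORD>_<position> inside output_dir."""
    return [
        Path(output_dir) / f"{word.upper()}_{index}{extension}"
        for index, word in enumerate(words)
    ]


names = segment_filenames(["the", "blue", "spot"], "test_split_eng")
print([path.name for path in names])
```

Including the position in the name keeps repeated words (for example two occurrences of "the") from overwriting each other.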

🐊 Tutorial 🐊

|-- chatterchop/
|   |-- tutorial/
|   |   |-- ChatterChop.py

Example 1 (chopping chatter;))

from chatterchop.ChatterChopper import ChatterChopper

# Polish example: Create an object with a path to audio, and provide a path to the output directory to save segmented samples.

test_audio_path_pl = 'data_to_test/test_pl/test_audio_pl.wav'
output_dir_pl = 'data_to_test/test_pl/test_split_pl'

test_obj_pl = ChatterChopper(test_audio_path_pl)

test_obj_pl.chop_chatter()

test_obj_pl.save_speech_segments(output_dir_pl)

# English example: Create an object with a path to audio, and provide a path to the output directory to save segmented samples.

test_audio_path_eng = 'data_to_test/test_eng/test_audio_eng.wav'
output_dir_eng = 'data_to_test/test_eng/test_split_eng'

test_obj_eng = ChatterChopper(test_audio_path_eng)

test_obj_eng.chop_chatter()

test_obj_eng.save_speech_segments(output_dir_eng)

Example 2 (getting transcription and metrics)

from chatterchop.ChatterChopper import ChatterChopper

test_audio_path = 'data_to_test/test_audio_pl_shorter.wav'
test_ground_truth_path = 'data_to_test/test_transcription_ground_truth.txt'

# Option 1: Create an object with just a path to audio, and provide a ground truth transcript as a path to a file, get transcription accuracy and save transcription to a text file.
test_obj_1 = ChatterChopper(test_audio_path)

transcription_result = test_obj_1.get_transcription_accuracy(test_ground_truth_path)
print(transcription_result)

test_obj_1.save_transcription('data_to_test/saved_trans.txt')

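# Option 2: Create an object with a path to audio and a path to an existing transcription file, and provide a ground truth transcript as a path to a file to get transcription accuracy.
# Note: test_transcription_file was not defined in the original example; the file saved in Option 1 is assumed here.
test_transcription_file = 'data_to_test/saved_trans.txt'
test_obj_2 = ChatterChopper(test_audio_path, test_transcription_file)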

transcription_result = test_obj_2.get_transcription_accuracy(test_ground_truth_path)
print(transcription_result)

# Option 3: Create an object with a path to audio and transcription, and provide a ground truth transcript as a string to get transcription accuracy.
test_obj_3 = ChatterChopper(test_audio_path, test_transcription_file)

test_ground_truth = 'Warszawa jest pełnym sprzeczności przez wielu niezniszczalnym.'
transcription_result = test_obj_3.get_transcription_accuracy(test_ground_truth)
print(transcription_result)
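This README does not say which metric get_transcription_accuracy reports. As an illustration only, here is a sketch of one common choice, word error rate (WER), together with a helper that accepts the ground truth either as a file path or as a string, mirroring Options 1 to 3. Both `load_text` and `word_error_rate` are hypothetical names, not part of the ChatterChop API. Matching is case-insensitive and punctuation is left in place, which is a simplification.

```python
import os


def load_text(path_or_text):
    """Return the file contents if the argument is an existing file, else the argument itself."""
    if os.path.isfile(path_or_text):
        with open(path_or_text, encoding="utf-8") as handle:
            return handle.read()
    return path_or_text


def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        return 0.0 if not hyp else float("inf")
    # Levenshtein distance over words, keeping only the previous row.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1] / len(ref)


ground_truth = load_text("the blue spot is on the key again")
transcript = "the blue spot is on key again"
print(word_error_rate(ground_truth, transcript))
```

A lower value means a closer match; 0.0 means the transcript and the ground truth contain the same word sequence.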

TO DO:

  • Tests
  • Other languages
  • Converting numbers to appropriate words
  • Different use cases, etc.
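For the "converting numbers to appropriate words" item, a possible starting point is sketched here. It is only an assumption about what that step might look like: it handles English and the range 0 to 99, and the function names `number_to_words` and `spell_out_numbers` are made up for this example. Other languages (such as Polish) would need their own rules.

```python
import re

ONES = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine",
    "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen", "sixteen",
    "seventeen", "eighteen", "nineteen",
]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy", "eighty", "ninety"]


def number_to_words(n):
    """Spell out an integer between 0 and 99 in English."""
    if not 0 <= n <= 99:
        raise ValueError("only 0..99 is supported in this sketch")
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] if ones == 0 else f"{TENS[tens]} {ONES[ones]}"


def spell_out_numbers(text):
    """Replace standalone one- or two-digit numbers in text with words."""
    return re.sub(r"\b\d{1,2}\b", lambda match: number_to_words(int(match.group())), text)


print(spell_out_numbers("I have 2 cats and 21 dogs"))
```

Numbers with three or more digits are left untouched by this sketch, so they would still need separate handling.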