A PyTorch Implementation of Guiding Frame-Level CTC Alignments Using Self-knowledge Distillation
You should prepare the training, development, and test datasets by following the structure provided here. Each file should contain two columns:
- Path – The path to the audio file.
- Transcript – The corresponding transcript for the audio. The transcript should be normalized (for example, by removing all punctuation and converting to lowercase); otherwise, you may need to modify the vocabulary.
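As a rough illustration of the normalization described above, the following sketch lowercases a transcript, strips ASCII punctuation, and writes path/transcript pairs as two columns. The tab delimiter, the absence of a header row, the example path `audio/a.wav`, and the helper names `normalize_transcript` and `write_manifest` are all assumptions made for this example; they are not taken from the repository, so adapt them to the structure the training script actually expects.

```python
import csv
import io
import string


def normalize_transcript(text: str) -> str:
    """Lowercase, remove ASCII punctuation, and collapse repeated whitespace.

    Note: this also removes apostrophes ("it's" becomes "its"). Keep them
    if the vocabulary contains an apostrophe token.
    """
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    return " ".join(text.split())


def write_manifest(rows, out) -> None:
    """Write (path, transcript) pairs as two columns.

    Assumption: tab-separated columns with no header row.
    """
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for path, transcript in rows:
        writer.writerow([path, normalize_transcript(transcript)])


# Hypothetical example: write one row to an in-memory buffer.
buffer = io.StringIO()
write_manifest([("audio/a.wav", "Hello, World!")], buffer)
manifest_text = buffer.getvalue()
```

To produce a real file, pass an open file handle (for example `open("train.tsv", "w", newline="", encoding="utf-8")`, where `train.tsv` is a placeholder name) instead of the in-memory buffer.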
pip install -r requirements.txt
python3 train.py
@article{SKD-CTC,
title={Guiding Frame-Level CTC Alignments Using Self-knowledge Distillation},
author={Eungbeom Kim and Hantae Kim and Kyogu Lee},
journal={INTERSPEECH 2024},
year={2024},
}