This repository contains the scripts used to pre-process YouTube transcripts before carrying out a multi-dimensional analysis.
More information is available in the paper 'The pre-processing of YouTube transcripts for corpus-based spoken language analysis' (pp. 1-6 of the conference proceedings): https://jaecs.com/conf_48/images/Proceedings48.pdf
Cooper, C.R. (2022). The pre-processing of YouTube transcripts for corpus-based spoken language analysis. Proceedings of the JAECS Conference 2022, 1-6.