CHiME-6 Forced Alignment Annotation for CHiME-7 DASR Challenge

CHiME-7 DASR Challenge

Distant Automatic Speech Transcription with Multiple Devices in Diverse Scenarios

The baseline and the dataset generation scripts are in the ESPnet2 chime7_dasr recipe.

This repository contains forced alignment segmentation for the CHiME-6 dataset, produced as in https://github.com/nateanl/chime6_rttm (details are in the CHiME-6 challenge description paper [1]).

Important
Here we extend the forced alignment annotation to the training set as well, and we also provide it in the form of JSON files.

The JSON forced alignment segmentation has a dummy value for the words entry, because licensing prevents us from releasing the transcriptions here. The words field is kept rather than removed because it makes the files more convenient to use in the baseline scripts, e.g. in https://github.com/espnet/espnet/blob/master/egs2/chime7_task1/asr1/local/get_lhotse_manifests.py, if one wants to use this segmentation in place of the manual one for GSS and ASR.
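As an illustration of how the JSON segmentation could be consumed, the sketch below converts a few segments into RTTM lines while ignoring the dummy words entry. The field names (`session_id`, `speaker`, `start_time`, `end_time`, `words`), the timestamps given as strings in seconds, and the sample values are assumptions made for this example and are not taken from the files in this repository; check an actual JSON file before relying on them.

```python
import json

# Hypothetical sample entries. The field names and the time format
# (strings holding seconds) are assumptions for illustration only.
SAMPLE = """[
  {"session_id": "S02", "speaker": "P05", "start_time": "40.60",
   "end_time": "43.10", "words": "placeholder"},
  {"session_id": "S02", "speaker": "P06", "start_time": "12.25",
   "end_time": "14.00", "words": "placeholder"},
  {"session_id": "S02", "speaker": "P05", "start_time": "50.00",
   "end_time": "50.00", "words": "placeholder"}
]"""


def segments_to_rttm(segments):
    """Turn segment dicts into RTTM SPEAKER lines, sorted by onset."""
    lines = []
    for seg in sorted(segments, key=lambda s: float(s["start_time"])):
        onset = float(seg["start_time"])
        duration = float(seg["end_time"]) - onset
        if duration <= 0:
            # Skip degenerate segments with no positive duration.
            continue
        # The dummy "words" value is deliberately not used here.
        lines.append(
            f"SPEAKER {seg['session_id']} 1 {onset:.3f} {duration:.3f} "
            f"<NA> <NA> {seg['speaker']} <NA> <NA>"
        )
    return lines


rttm_lines = segments_to_rttm(json.loads(SAMPLE))
print("\n".join(rttm_lines))
```

To apply the same function to a real file, replace `json.loads(SAMPLE)` with `json.load(f)` on an opened JSON file, after adjusting the keys to whatever the file actually contains.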

[1] Watanabe, S., Mandel, M., Barker, J., Vincent, E., Arora, A., Chang, X., et al. CHiME-6 challenge: Tackling multispeaker speech recognition for unsegmented recordings. https://arxiv.org/abs/2004.09249
