Bangla Speech Corpora

A cleaned Bangla speech corpus, developed specifically for Bangla text-to-speech in 2009. It was originally hosted on SourceForge.

Characteristics of the corpus

This dataset consists of three corpora, each developed for a different purpose.

“Corpus for acoustic analysis” was developed for acoustic analysis of Bangla phoneme inventory.
“Diphone corpus” was developed for diphone-concatenation-based speech synthesis.
“Continuous speech corpus” was developed for intonation modeling and unit-selection-based speech synthesis.

Other characteristics include:

Area of speech corpora: Speech synthesis, phonetic research and speech recognition.
Spoken content: Two approaches were considered, namely domain and phonological distribution.
Professional recording studio: Needed to obtain a clear acoustic signal from which clear acoustic information can be extracted.
Speaking style: Continuous read speech.
Manual segmentation: Although this requires a significant amount of effort, it also ensures the accuracy of the labeling.
Recording setup: Supervised onsite recording.

Download

Due to the size of the corpora (4.4 GB), we uploaded the data to Mendeley and also kept a copy on SourceForge.

Option 1: Please follow the Mendeley page.

Option 2: SourceForge.

Please cite this paper:

Firoj Alam, S. M. Murtoza Habib, Dil Afroza Sultana and Mumit Khan, Development of Annotated Bangla Speech Corpora, Spoken Languages Technologies for Under-Resourced Languages (SLTU’10), vol. 1, pp. 35-41, Penang, Malaysia, May 3-5, 2010.

@inproceedings{alam2010development,
  title={Development of annotated Bangla speech corpora},
  author={Alam, Firoj and Habib, SM Murtoza and Sultana, Dil Afroza and Khan, Mumit},
  booktitle={Spoken Languages Technologies for Under-Resourced Languages},
  year={2010}
}
