97-Hours-German-Children-Spontaneous-Speech-Data

Description

This dataset contains 97 hours of spontaneous speech from German-speaking children, manually screened and processed. Annotations include the transcription text, speaker identification, gender, and other information. The dataset can be applied to speech recognition (acoustic model or language model training), caption generation, voice content moderation, and other AI algorithm research.

For more details, please refer to the link: https://www.nexdata.ai/datasets/speechrecog/1299?source=Github

Specifications

Format

16 kHz, 16-bit, WAV, mono channel;
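
As a quick sanity check, the format stated above can be verified per file with Python's standard `wave` module. This is only a minimal sketch: the function name `check_wav_format` and the file path in the usage note are hypothetical, and nothing here is taken from the dataset's own tooling.

```python
import wave

# Expected values, taken from the Format specification above.
EXPECTED_SAMPLE_RATE = 16000  # 16 kHz
EXPECTED_SAMPLE_WIDTH = 2     # 16 bit = 2 bytes per sample
EXPECTED_CHANNELS = 1         # mono


def check_wav_format(path):
    """Return a list of mismatches between a WAV file and the expected format.

    An empty list means the file matches the specification.
    (Hypothetical helper, not part of the dataset.)
    """
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != EXPECTED_SAMPLE_RATE:
            problems.append(f"sample rate is {wav.getframerate()} Hz")
        if wav.getsampwidth() != EXPECTED_SAMPLE_WIDTH:
            problems.append(f"sample width is {wav.getsampwidth() * 8} bit")
        if wav.getnchannels() != EXPECTED_CHANNELS:
            problems.append(f"channel count is {wav.getnchannels()}")
    return problems
```

For example, `check_wav_format("sample.wav")` (with a placeholder path) would return an empty list for a conforming file and a list of human-readable mismatches otherwise.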

Age

Children aged 12 and younger;

Content category

Self-media, conversation, live broadcast, lecture, and variety show;

Language

German;

Annotation

Transcription text, speaker identification, and gender;

Accuracy

Word Accuracy Rate (WAR) of at least 98%.
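
The README does not define how WAR is computed. The sketch below assumes the common definition, WAR = 1 - (substitutions + deletions + insertions) / reference word count, which is one minus the word error rate. The function names are hypothetical, and the dataset's actual scoring procedure (text normalization, casing, punctuation handling) may differ.

```python
def word_edit_distance(ref_words, hyp_words):
    """Levenshtein distance between two word lists (unit cost per edit)."""
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of an extra word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def word_accuracy_rate(reference, hypothesis):
    """WAR under the ASSUMED definition 1 - edit_distance / reference length.

    Words are split on whitespace with no further normalization.
    """
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return 1.0 - word_edit_distance(ref_words, hyp_words) / len(ref_words)
```

Under this definition, a four-word reference transcribed with one substituted word scores 0.75, so a threshold of 98% corresponds to at most two word errors per hundred reference words.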

Licensing Information

Commercial License
