22 People - Chinese Mandarin Multi-emotional Synthesis Corpus. The corpus is recorded by native Chinese speakers of different ages and genders. The script covers six emotional text types, and the syllables, phonemes, and tones are balanced. Professional phoneticians participated in the annotation. The corpus closely matches the research and development needs of speech synthesis.
For more details, please refer to the link: https://www.nexdata.ai/datasets/tts/1214?source=Github
Format: 48,000 Hz, 24-bit, uncompressed WAV, mono channel
Recording environment: professional recording studio
Recording content: seven emotions (happiness, anger, sadness, surprise, fear, disgust)
Speakers: 22 people of different age groups and genders
Device: microphone
Language: Mandarin
Annotation: word and pinyin transcription, prosodic boundary annotation
Application scenario: speech synthesis
Data volume: 140 minutes per speaker, 20 minutes per emotion
License: commercial license
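
As a rough sanity check on the figures listed above (48,000 Hz, 24-bit, mono, 22 speakers, 140 minutes each, 20 minutes per emotion), the sketch below works out the total duration and the approximate uncompressed audio size, and shows one way to confirm that a delivered file matches the stated format. The function names `corpus_totals` and `matches_spec` are hypothetical helpers written for this illustration, not part of any Nexdata tooling, and the size estimate ignores WAV headers and annotation files.

```python
import wave

# Values taken from the specification above.
SAMPLE_RATE_HZ = 48_000
BITS_PER_SAMPLE = 24
CHANNELS = 1
SPEAKERS = 22
MINUTES_PER_SPEAKER = 140
MINUTES_PER_EMOTION = 20


def corpus_totals():
    """Derive corpus-wide totals from the per-speaker figures (hypothetical helper)."""
    total_minutes = SPEAKERS * MINUTES_PER_SPEAKER
    bytes_per_second = SAMPLE_RATE_HZ * (BITS_PER_SAMPLE // 8) * CHANNELS
    return {
        # 140 minutes split into 20-minute blocks
        "emotion_blocks_per_speaker": MINUTES_PER_SPEAKER // MINUTES_PER_EMOTION,
        "total_minutes": total_minutes,
        "total_hours": total_minutes / 60,
        # Raw PCM payload only; headers and annotations are not counted.
        "pcm_bytes": total_minutes * 60 * bytes_per_second,
    }


def matches_spec(path):
    """Return True if the WAV header agrees with the stated format (hypothetical helper).

    Assumption: the file is plain PCM. Older Python versions raise wave.Error
    for WAVE_FORMAT_EXTENSIBLE headers, which some 24-bit files use.
    """
    with wave.open(path, "rb") as wav:
        return (
            wav.getframerate() == SAMPLE_RATE_HZ
            and wav.getsampwidth() * 8 == BITS_PER_SAMPLE
            and wav.getnchannels() == CHANNELS
        )


if __name__ == "__main__":
    totals = corpus_totals()
    print(f"{totals['total_minutes']} minutes, about {totals['total_hours']:.1f} hours")
    print(f"about {totals['pcm_bytes'] / 1e9:.1f} GB of raw PCM audio")
```

Under these figures the corpus comes to 3,080 minutes (roughly 51.3 hours) and about 26.6 GB of raw PCM audio, with seven 20-minute blocks per speaker.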