

Speech-Emotion-Recognition

This research project presents a comprehensive analysis of speech emotion recognition. Harmonizing the datasets involves reconciling disparate naming conventions and emotion labels into a standardized format for cohesive analysis. With the exception of surprise, calm, and neutral, the emotion classes are well balanced.

The subsequent focus is feature extraction: raw audio waveforms, frequency spectra (FFT), short-time Fourier transform (STFT) spectrograms, mel spectrograms, and MFCCs. Three convolutional neural network (CNN) models are developed and evaluated: a Mel Spectrogram CNN, an MFCC CNN, and a Mel Spectrogram CRNN.

Results show 72% accuracy in emotion classification, with mel spectrogram and MFCC features displaying complementary strengths. The study concludes by suggesting avenues for improvement: feature fusion, specialized deep learning architectures, and handling of data imbalance. Future work involves real-world integration, applying sentiment analysis to emotion-laden communications in S&P 500 earnings calls to predict effects on the stock market.
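As a concrete illustration of the STFT and mel spectrogram steps in the pipeline above, the following is a minimal NumPy sketch. It is not the project's actual code: the 440 Hz sine wave stands in for a speech clip, and the `n_fft`, `hop`, and `n_mels` values are arbitrary illustrative choices.

```python
import numpy as np

def stft_magnitude(signal, n_fft=512, hop=256):
    # Frame the signal, apply a Hann window, take the magnitude FFT per frame.
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[i * hop : i * hop + n_fft] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1)).T  # shape (n_fft//2+1, n_frames)

def mel_filterbank(sr, n_fft, n_mels=64):
    # Triangular filters spaced evenly on the mel scale, mapped to FFT bins.
    def hz_to_mel(f): return 2595.0 * np.log10(1.0 + f / 700.0)
    def mel_to_hz(m): return 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for j in range(left, center):            # rising edge of triangle
            fb[i - 1, j] = (j - left) / max(center - left, 1)
        for j in range(center, right):           # falling edge of triangle
            fb[i - 1, j] = (right - j) / max(right - center, 1)
    return fb

sr = 16000
t = np.linspace(0, 1, sr, endpoint=False)
signal = np.sin(2 * np.pi * 440 * t)  # stand-in for a 1-second speech clip

spec = stft_magnitude(signal)                             # STFT spectrogram
log_mel = np.log(mel_filterbank(sr, 512) @ spec + 1e-8)   # log-mel spectrogram
print(log_mel.shape)  # (n_mels, n_frames) = (64, 61)
```

In practice a library such as librosa would replace both helpers, but the sketch shows the two transforms the CNN inputs are built from: a windowed STFT followed by a mel-scale filterbank and a log compression.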

About

I harmonized diverse speech emotion datasets and developed convolutional neural network (CNN) models, including a Mel Spectrogram CNN, an MFCC CNN, and a Mel Spectrogram CRNN, achieving 72% accuracy in emotion classification.
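To make the classification step concrete, here is a toy forward pass of a spectrogram CNN in plain NumPy. This is a sketch, not any of the models above: the single convolutional layer, the 64x61 input, the eight filters, and the seven-way emotion output are all illustrative assumptions, and the weights are random rather than trained.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv2d(x, w):
    # Valid 2-D cross-correlation: one (kh, kw) filter per output feature map.
    kh, kw = w.shape[1:]
    out = np.zeros((w.shape[0], x.shape[0] - kh + 1, x.shape[1] - kw + 1))
    for f in range(w.shape[0]):
        for i in range(out.shape[1]):
            for j in range(out.shape[2]):
                out[f, i, j] = np.sum(x[i:i + kh, j:j + kw] * w[f])
    return out

# Hypothetical shapes: a 64x61 log-mel spectrogram, 7 emotion classes.
x = rng.standard_normal((64, 61))          # stand-in for one input spectrogram
filters = rng.standard_normal((8, 3, 3)) * 0.1   # 8 random 3x3 conv filters
dense = rng.standard_normal((7, 8)) * 0.1        # random classifier weights

h = np.maximum(conv2d(x, filters), 0.0)    # convolution + ReLU
pooled = h.mean(axis=(1, 2))               # global average pooling -> (8,)
logits = dense @ pooled                    # linear classifier -> (7,)
probs = np.exp(logits) / np.exp(logits).sum()    # softmax over emotion classes
print(probs.shape)  # (7,)
```

A CRNN variant would feed the convolutional feature maps, frame by frame, into a recurrent layer before classification, letting the model use the temporal structure of the spectrogram.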
