A deep learning project that enables real-time facial emotion recognition and responds with matching emoji reactions. Built using CNNs, attention mechanisms (SE blocks), and Vision Transformers (ViTs), the project demonstrates the strengths of modern AI for human-computer interaction through facial expressions.
Facial emotion recognition is critical in applications like surveillance, healthcare, driver safety, and entertainment. This project implements and compares three architectures:
- A baseline CNN
- An SE-augmented attention CNN
- A hybrid CNN+Vision Transformer (ViT)
These models were trained and evaluated on FER2013 and a subset of AffectNet using techniques such as focal loss, data augmentation, and class weighting. Real-time inference uses OpenCV to overlay the detected emotion on live webcam input.
- 35,887 grayscale images (48x48 px)
- 7 emotions: Angry, Disgust, Fear, Happy, Neutral, Sad, Surprise
⚠️ Important:
Download the FER2013 dataset from this Kaggle link
Once downloaded, extract and place it inside your working directory like so:

    emojify/
    ┗ data/
      ┣ train/
      ┗ test/
- 12,815 RGB images
- Same 7 emotions (excluding “contempt”)
⚠️ Important:
Download the AffectNet dataset (subset) from this Kaggle link
Once downloaded, extract and place it inside your working directory like below, then delete the `contempt` folder from both the `train/` and `test/` subfolders:

    emojify/
    ┗ affdata/
      ┣ train/
      ┗ test/
All images were resized to 48x48, normalized, and augmented to improve training efficiency and model generalization.
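The normalization step can be sketched in NumPy (a minimal illustration, assuming the face crop has already been resized to 48x48 with OpenCV; `preprocess` is a hypothetical helper, not a function from this repo):

```python
import numpy as np

def preprocess(face: np.ndarray) -> np.ndarray:
    """Turn a 48x48 uint8 grayscale crop into a model-ready batch."""
    x = face.astype("float32") / 255.0   # scale pixel values to [0, 1]
    return x.reshape(1, 48, 48, 1)       # add batch and channel dimensions

# Dummy grayscale image standing in for a detected face crop
face = np.random.randint(0, 256, size=(48, 48), dtype=np.uint8)
batch = preprocess(face)
print(batch.shape)  # (1, 48, 48, 1)
```

Augmentation (flips, shifts, rotations) is applied on top of this during training.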
    pip install -r requirements.txt
This section outlines the functionality of each Python script in the project and the dataset it is based on.
- `gui_base_cnn.py` → Real-time facial emotion detection using the Base CNN model trained on FER2013.
- `gui_attn_cnn.py` → Real-time detection using the Attention-enhanced CNN (SE blocks) model trained on FER2013.
- `gui_cnn_vit.py` → Real-time detection using the CNN + Vision Transformer hybrid model trained on FER2013.
- `train_base_cnn.py` → Trains the Base CNN model on the FER2013 dataset.
- `train_attn_cnn.py` → Trains a CNN with Squeeze-and-Excitation attention on FER2013.
- `train_cnn_vit.py` → Trains the CNN + Vision Transformer hybrid model on FER2013.
- `train2_base_cnn.py` → Trains the Base CNN model on the AffectNet dataset.
- `train2_cnn_attn.py` → Trains a CNN with attention layers (multi-head attention) on AffectNet.
- `train2_cnn_vit.py` → Trains the CNN + Vision Transformer hybrid model on AffectNet.
- 3 convolutional layers + max pooling
- Dense layer (1024 units) + Softmax
- ~5M parameters
- Adds Squeeze-and-Excitation (SE) blocks
- Emphasizes important facial features
- ~6.2M parameters
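The mechanism behind an SE block can be illustrated with a small NumPy sketch (illustrative only; the actual model uses trainable Keras layers, and the dense weights below are random stand-ins):

```python
import numpy as np

def se_block(feature_map: np.ndarray, reduction: int = 16) -> np.ndarray:
    """Squeeze-and-Excitation: reweight channels by global context.

    feature_map: (H, W, C) activations from a conv layer.
    """
    h, w, c = feature_map.shape
    rng = np.random.default_rng(0)
    # Random stand-ins for the two learned dense layers
    w1 = rng.standard_normal((c, c // reduction))
    w2 = rng.standard_normal((c // reduction, c))

    squeeze = feature_map.mean(axis=(0, 1))        # global average pool -> (C,)
    hidden = np.maximum(squeeze @ w1, 0.0)         # ReLU bottleneck
    scale = 1.0 / (1.0 + np.exp(-(hidden @ w2)))   # sigmoid weights in (0, 1)
    return feature_map * scale                     # rescale each channel

fmap = np.random.default_rng(1).standard_normal((12, 12, 64))
out = se_block(fmap)
print(out.shape)  # (12, 12, 64)
```

Channels that the excitation step scores low are suppressed, which is what lets the network emphasize informative facial regions.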
- CNN extracts local features
- Transformer captures global context
- ~9M parameters
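How the hybrid hands CNN features to the Transformer can be sketched as follows: the feature map is cut into flattened patch tokens before self-attention (the patch size and feature-map shape here are illustrative assumptions, not the repo's actual values):

```python
import numpy as np

def to_patch_tokens(feature_map: np.ndarray, patch: int = 2) -> np.ndarray:
    """Flatten a (H, W, C) CNN feature map into a (num_patches, patch*patch*C)
    token sequence for a Transformer encoder."""
    h, w, c = feature_map.shape
    tokens = []
    for i in range(0, h, patch):
        for j in range(0, w, patch):
            tokens.append(feature_map[i:i + patch, j:j + patch].reshape(-1))
    return np.stack(tokens)

fmap = np.zeros((12, 12, 64))     # e.g. output of the CNN backbone
tokens = to_patch_tokens(fmap)
print(tokens.shape)               # (36, 256): 6x6 patches of 2*2*64 features
```

Self-attention over these tokens is what gives the hybrid its global receptive field on top of the CNN's local features.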
- Optimizer: Adam (`lr=0.0001`, `decay=1e-6`)
- Epochs: 75
- Batch Size: 64
- Class weights: Based on inverse class frequencies
- Loss Function: Focal loss (to handle class imbalance)
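Focal loss down-weights well-classified examples so training focuses on hard, under-represented classes. A NumPy sketch of the standard formulation (`gamma=2` is a common default; the project's exact hyperparameters may differ):

```python
import numpy as np

def focal_loss(y_true: np.ndarray, y_pred: np.ndarray, gamma: float = 2.0) -> float:
    """Mean focal loss: FL = -(1 - p_t)^gamma * log(p_t),
    where p_t is the softmax probability of the true class."""
    eps = 1e-7
    p_t = np.clip((y_true * y_pred).sum(axis=1), eps, 1.0)
    return float(np.mean(-((1.0 - p_t) ** gamma) * np.log(p_t)))

# Two samples over the 7 emotion classes: one easy, one hard
y_true = np.eye(7)[[3, 0]]                   # true classes: Happy, Angry
y_pred = np.full((2, 7), 0.05)
y_pred[0, 3], y_pred[1, 0] = 0.9, 0.4        # confident vs. uncertain prediction
y_pred /= y_pred.sum(axis=1, keepdims=True)  # make rows valid distributions
print(focal_loss(y_true, y_pred))            # the hard sample dominates the mean
```

The `(1 - p_t)^gamma` factor shrinks the contribution of confident predictions, which is why it helps with FER2013's class imbalance.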
- Face Detection: OpenCV Haar Cascade / DNN
- Inference Pipeline:
  1. Capture a webcam frame
  2. Detect the face
  3. Preprocess (resize to 48x48 grayscale)
  4. Predict the emotion
  5. Overlay the corresponding emoji on the frame
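The overlay step can be sketched in NumPy as an alpha-blend of an RGBA emoji sprite into a frame region (a simplified stand-in for the drawing done in the GUI scripts; the array shapes and `overlay_emoji` helper are illustrative assumptions):

```python
import numpy as np

def overlay_emoji(frame: np.ndarray, emoji: np.ndarray, x: int, y: int) -> np.ndarray:
    """Alpha-blend an RGBA emoji sprite onto a BGR frame at (x, y)."""
    h, w = emoji.shape[:2]
    alpha = emoji[:, :, 3:4].astype("float32") / 255.0   # (h, w, 1) opacity
    roi = frame[y:y + h, x:x + w].astype("float32")
    blended = alpha * emoji[:, :, :3] + (1.0 - alpha) * roi
    frame[y:y + h, x:x + w] = blended.astype("uint8")
    return frame

frame = np.zeros((480, 640, 3), dtype=np.uint8)    # dummy webcam frame
emoji = np.full((64, 64, 4), 255, dtype=np.uint8)  # fully opaque white sprite
out = overlay_emoji(frame, emoji, x=100, y=50)
print(out[50, 100])  # [255 255 255]
```

In the real pipeline, `(x, y)` comes from the face bounding box returned by the detector.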
- Performance:
  - The Baseline CNN and SE-CNN run smoothly in real time.
  - The CNN+ViT performs well but with slight lag.
| Dataset | Model | Accuracy | Validation Accuracy |
|---|---|---|---|
| FER2013 | Base CNN | 52.66% | 58.87% |
| FER2013 | Attention CNN | 51.35% | 57.97% |
| FER2013 | CNN+ViT | 52.41% | 55.60% |
| AffectNet | Base CNN | 44.99% | 48.29% |
| AffectNet | Attention CNN | 34.67% | 39.83% |
| AffectNet | CNN+ViT | 40.22% | 38.85% |
- The Baseline CNN provided the best trade-off between accuracy and efficiency for real-time deployment.
- The SE-CNN added interpretability by focusing on key facial regions.
- The CNN+ViT hybrid showed robustness but was computationally more intensive.
- Explore lightweight models like MobileNet and EfficientNet for edge deployment.
- Implement multimodal emotion recognition combining facial expressions with voice or body language.
- Expand dataset diversity to improve cross-population generalization.
- Nithish Gowda H N - Btech(Hons.) CSE, AI & ML Major
- Prajna - Btech(Hons.) CSE, Cloud & Full Stack Major
- Pratham Rajesh Vernekar - Btech(Hons.) CSE, Cloud & Full Stack Major
- Nandan Kumar - Btech(Hons.) CSE, Cloud & Full Stack Major
Real-time inference showing the model identifying facial expressions
