Deep Learning for Human Action Recognition

Abstract

The study leverages the Human Action Recognition (HAR) dataset, which can be accessed here. Alongside this documentation, the trained models and associated Jupyter Notebook (.ipynb) files are included in the repository for reproducibility and further exploration.

Objective

The goal of this project is to develop a Convolutional Neural Network (CNN) model to classify human activities from images. The model is trained to recognize and label activities from 15 predefined categories based on visual content.

Dataset

The dataset comprises over 12,000 labeled images categorized into 15 human activity classes:

  • Calling
  • Clapping
  • Cycling
  • Dancing
  • Drinking
  • Eating
  • Fighting
  • Hugging
  • Laughing
  • Listening to Music
  • Running
  • Sitting
  • Sleeping
  • Texting
  • Using Laptop

Each class contains 840 training images, and the dataset is balanced with no missing or duplicate values.

Sample images from the dataset are included in the images folder (images/sample_image.png).

Methodology

1. Data Exploration

  • Inspect dataset structure (e.g., number of classes, number of images per class).
  • Visualize sample images from each class.
  • Check image dimensions and label distributions.
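
A quick exploration sketch, assuming the training images are organized in per-class folders under train/ (the attached notebooks may organize the data differently):

```python
# Explore dataset structure and visualize one sample per class.
# Assumes images are organized as train/<class_name>/<image files>.
import os
import random

import matplotlib.pyplot as plt
from PIL import Image

TRAIN_DIR = "train"  # hypothetical path to the HAR training images

classes = sorted(os.listdir(TRAIN_DIR))
print(f"Found {len(classes)} classes")
for cls in classes:
    n_images = len(os.listdir(os.path.join(TRAIN_DIR, cls)))
    print(f"{cls}: {n_images} images")

# Plot one random sample per class (15 classes -> 3x5 grid).
fig, axes = plt.subplots(3, 5, figsize=(15, 9))
for ax, cls in zip(axes.ravel(), classes):
    fname = random.choice(os.listdir(os.path.join(TRAIN_DIR, cls)))
    img = Image.open(os.path.join(TRAIN_DIR, cls, fname))
    ax.imshow(img)
    ax.set_title(f"{cls} {img.size}")  # title shows class and (width, height)
    ax.axis("off")
plt.tight_layout()
plt.show()
```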

2. Data Preprocessing

  • Resize images to 128x128 pixels.
  • Normalize pixel values to [0,1].
  • Encode labels and split data into training (80%) and validation (20%) sets.
  • Apply data augmentation (e.g., rotations, zoom, flips).
  • Preprocess test data similarly.
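
These steps can be sketched with Keras generators as follows; the specific augmentation parameters are illustrative assumptions, and class_mode="sparse" matches the sparse categorical cross-entropy loss used later:

```python
# Resize, normalize, split 80/20, and augment the training subset only.
from tensorflow.keras.preprocessing.image import ImageDataGenerator

IMG_SIZE = (128, 128)

train_datagen = ImageDataGenerator(
    rescale=1.0 / 255,      # normalize pixel values to [0, 1]
    rotation_range=20,      # example augmentations: rotations, zoom, flips
    zoom_range=0.2,
    horizontal_flip=True,
    validation_split=0.2,   # 80% training / 20% validation
)
# Validation images are only rescaled, not augmented.
val_datagen = ImageDataGenerator(rescale=1.0 / 255, validation_split=0.2)

train_gen = train_datagen.flow_from_directory(
    "train", target_size=IMG_SIZE, batch_size=32,
    class_mode="sparse", subset="training", seed=42,
)
val_gen = val_datagen.flow_from_directory(
    "train", target_size=IMG_SIZE, batch_size=32,
    class_mode="sparse", subset="validation", seed=42,
)
```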

3. CNN Model Design

  • Input Layer: (128, 128, 3).
  • Convolutional Blocks:
    • 32, 64, 128, 256 filters with (3x3) kernels, ReLU activation, and max-pooling.
  • Dense Layers:
    • 512 and 256 neurons with ReLU and dropout (rate: 0.5).
  • Output Layer: Softmax activation for 15 classes.
  • Compilation: Adam optimizer (learning rate: 0.0001), sparse categorical cross-entropy loss, and accuracy metric.
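
A Keras sketch of this architecture, assuming one convolution/max-pooling pair per listed filter count:

```python
# CNN matching the description above: four conv blocks, two dense layers
# with dropout, and a 15-way softmax output.
from tensorflow.keras import layers, models, optimizers

model = models.Sequential([
    layers.Input(shape=(128, 128, 3)),
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(128, (3, 3), activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(256, (3, 3), activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(512, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(15, activation="softmax"),  # 15 activity classes
])

model.compile(
    optimizer=optimizers.Adam(learning_rate=1e-4),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
```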

4. VGG16-Based Model Design

  • Feature Extractor: Pre-trained VGG16 (ImageNet weights, frozen base).
  • Dense Layers:
    • Fully connected layers with 512 neurons and ReLU activation.
  • Output Layer: Softmax for classification.
  • Training: Trained the classification head on the dataset while the VGG16 base remained frozen.
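
A transfer-learning sketch along these lines (the head's exact composition, e.g. the dropout layer, is an assumption):

```python
# VGG16 feature extractor with a small trainable classification head.
from tensorflow.keras import layers, models, optimizers
from tensorflow.keras.applications import VGG16

base = VGG16(weights="imagenet", include_top=False, input_shape=(128, 128, 3))
base.trainable = False  # freeze the pre-trained feature extractor

vgg_model = models.Sequential([
    base,
    layers.Flatten(),
    layers.Dense(512, activation="relu"),
    layers.Dropout(0.5),  # assumed; not specified in the description
    layers.Dense(15, activation="softmax"),
])

vgg_model.compile(
    optimizer=optimizers.Adam(learning_rate=1e-4),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
```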

5. Model Training and Evaluation

  • Trained the CNN and VGG16 models on the processed dataset.
  • Monitored validation metrics (accuracy and loss) for signs of overfitting.
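
A training sketch reusing model, train_gen, and val_gen from the snippets above; the checkpoint filename mirrors the repository's saved model, while the callback settings and epoch count are assumptions:

```python
# Train with checkpointing on validation accuracy and early stopping
# on validation loss to limit overfitting.
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint

callbacks = [
    ModelCheckpoint("best_cnn_model.keras", monitor="val_accuracy",
                    save_best_only=True),
    EarlyStopping(monitor="val_loss", patience=5,
                  restore_best_weights=True),
]

history = model.fit(
    train_gen,
    validation_data=val_gen,
    epochs=30,  # illustrative epoch count
    callbacks=callbacks,
)
```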

Results

The performance of the models is summarized in the table below:

| Metric | CNN Model | VGG Model |
| --- | --- | --- |
| Training Accuracy | 79.44% | 81.17% |
| Validation Accuracy | 54.05% | 46.91% |
| Final Training Loss | 0.634 | 0.548 |
| Validation Loss | 1.835 | 2.560 |
| Test Accuracy (from Contest) | 56.53% | 48.29% |
| Practice Rank | 8th | N/A |

Usage

  1. Clone this repository.
  2. Open the .ipynb files in Jupyter Notebook or Google Colab.
  3. Ensure you have downloaded the HAR dataset from the link in the Abstract.
  4. Follow the notebook code to train or evaluate the models; a minimal inference sketch is shown below.
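
For example, a minimal inference sketch against one of the saved models (the test-image path is a placeholder):

```python
# Load a saved model and classify a single image.
import numpy as np
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing import image

model = load_model("best_cnn_model.keras")

img = image.load_img("path/to/test_image.jpg", target_size=(128, 128))
x = np.expand_dims(image.img_to_array(img) / 255.0, axis=0)

probs = model.predict(x)[0]
print("Predicted class index:", int(np.argmax(probs)))
```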

Project Structure

├── images
│   ├── sample_image.png
├── best_cnn_model.keras
├── best_vgg_model.keras
├── cnn_model_test.csv
├── vgg_model_test.csv
├── README.md
└── har-detection-testing-accuracy-56.ipynb

Conclusion

This project demonstrates the application of deep learning techniques in human activity recognition using image data, highlighting both challenges and potential improvements for practical deployment.
