Rupa-Veerala/Context-Based-Image-Captioning

CONTEXT-BASED IMAGE CAPTIONING

This project is a website for context-based image captioning. It combines computer vision and natural language processing to generate captions for uploaded images, and it uses contextual text supplied by the user to make those captions more specific and meaningful.

Features

User-Interactive Website:

Upload an image and provide contextual text. Captions are generated in real time by deep learning models.

Multi-Model Comparison:

Four models were evaluated as feature extractors: Xception, ResNet, VGG16, and EfficientNet. Xception performed best, achieving the highest BLEU score (~60%).
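For readers unfamiliar with the metric, BLEU compares the n-grams of a generated caption against one or more reference captions. The sketch below is a minimal, unsmoothed sentence-level version using only the standard library. It assumes captions are already tokenized into word lists, and the function names are illustrative rather than taken from this repository, whose evaluation code may differ (for example by using a library implementation or corpus-level scoring).

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(candidate, references, max_n=4):
    """Unsmoothed sentence-level BLEU with uniform n-gram weights.

    candidate: list of tokens; references: list of token lists.
    Returns 0.0 if any n-gram order has no match (no smoothing applied).
    """
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(candidate, n)
        total = sum(cand_counts.values())
        if total == 0:
            return 0.0
        # Clip each candidate n-gram count by its maximum count in any reference.
        max_ref = Counter()
        for ref in references:
            for gram, count in ngrams(ref, n).items():
                max_ref[gram] = max(max_ref[gram], count)
        clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
        if clipped == 0:
            return 0.0
        log_precisions.append(math.log(clipped / total))

    # Brevity penalty uses the reference length closest to the candidate length.
    c = len(candidate)
    r = min((abs(len(ref) - c), len(ref)) for ref in references)[1]
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)


if __name__ == "__main__":
    reference = "a dog runs on the green grass".split()
    candidate = "a dog runs on the grass".split()
    print(f"BLEU: {sentence_bleu(candidate, [reference]):.3f}")
```

A score of 1.0 means the candidate matches a reference exactly; a reported figure such as ~60% corresponds to roughly 0.60 on this scale.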

Context Integration:

Users can supply additional contextual information, which is used to enhance the caption generated for the uploaded image.
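This README does not describe how the context is fed to the model. As one possible approach, the sketch below seeds the caption decoder with the user's context words so that generation continues from them. Everything here is an assumption for illustration: the `startseq` marker, the `build_decoder_seed` helper, and the token limit are hypothetical and not taken from this repository.

```python
import re

START_TOKEN = "startseq"  # assumed start-of-caption marker, not confirmed by the project


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_decoder_seed(context, max_context_tokens=10):
    """Return the initial token sequence handed to a caption decoder.

    The context words are placed right after the start marker so the
    decoder conditions on them; long context is truncated (assumed limit).
    """
    context_tokens = tokenize(context)[:max_context_tokens]
    return [START_TOKEN] + context_tokens


if __name__ == "__main__":
    print(build_decoder_seed("Birthday party, 2023!"))
```

With an empty context the seed reduces to the start marker alone, so the same decoder path would also cover plain, context-free captioning.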

Website Demo

The website consists of the following components:

Home Page:

  1. Brief introduction to the project.
  2. Option to upload an image.

Image Upload & Captioning:

  1. Upload an image and provide additional contextual text.
  2. Click on "Generate Caption" to receive a caption based on the image and context.
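Before a caption is generated, the server would typically check the submitted image and context text. The following is a minimal sketch of such a check, assuming a Python backend; the allowed extensions, the length limit, and the `validate_caption_request` name are hypothetical and not taken from this repository.

```python
from pathlib import Path

ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png"}  # assumed set of accepted image types
MAX_CONTEXT_CHARS = 200  # assumed limit on the context text length


def validate_caption_request(filename, context):
    """Return a list of validation errors; an empty list means the request is acceptable."""
    errors = []
    if Path(filename).suffix.lower() not in ALLOWED_EXTENSIONS:
        errors.append("unsupported image type")
    if len(context.strip()) > MAX_CONTEXT_CHARS:
        errors.append("context text too long")
    return errors


if __name__ == "__main__":
    print(validate_caption_request("photo.JPG", "family trip to the beach"))
    print(validate_caption_request("notes.txt", ""))
```

Returning a list of messages rather than raising on the first problem lets the page show every issue to the user at once.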

Model Insights:

Compares the four models used in the project, showing their performance metrics alongside visual examples.

Future Work

Optimize the model further to improve BLEU scores.

Expand the system to support caption generation in multiple languages.

Explore additional datasets for more diverse captioning capabilities.