CaptainCaption: GPT-4-Vision Based Image Caption Generator

A gradio based image captioning tool that uses the GPT-4-Vision API to generate detailed descriptions of images.

Features

Prompt Engineering: Customize the prompt for image description to get the most accurate and relevant captions.
Batch Processing: Ability to process an entire folder of images with customized pre and post prompts.

Screenshot

Installation

Clone repository

git clone https://github.com/42lux/CaptainCaption

Install requirements

pip install -r requirements.txt

Usage

Setting Up API Key: Enter your OpenAI API key in the provided textbox.
Uploading Images: In the "Prompt Engineering" tab, upload the image for which you need a caption.
Customizing the Prompt: Customize the prompt, detail level, and max tokens according to your requirements.
Generating Captions: Click on "Generate Caption" to receive the image description.
Batch Processing: In the "GPT4-Vision Tagging" tab, you can process an entire folder of images. Set the folder path, prompt details, and the number of workers for processing.

Running the Application

Run the script and navigate to the provided URL (Standard http://127.0.0.1:7860) by Gradio to access the interface.

Limitations and Considerations

The accuracy of captions depends on the quality of the uploaded images and the clarity of the provided prompts.
The OpenAI API is rate-limited, so consider this when processing large batches of images.
Internet connectivity is required for API communication.

Name		Name	Last commit message	Last commit date
Latest commit History 16 Commits
LICENSE		LICENSE
main.py		main.py
rate_limiter.py		rate_limiter.py
readme.md		readme.md
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

CaptainCaption: GPT-4-Vision Based Image Caption Generator

Features

Screenshot

Installation

Usage

Running the Application

Limitations and Considerations

About

Contributors 2

Languages

License

42lux/CaptainCaption

Folders and files

Latest commit

History

Repository files navigation

CaptainCaption: GPT-4-Vision Based Image Caption Generator

Features

Screenshot

Installation

Usage

Running the Application

Limitations and Considerations

About

Topics

Resources

License

Stars

Watchers

Forks

Contributors 2

Languages