InsightLens 🖼️🔍


Overview

InsightLens is an AI-powered image analysis tool that returns quick, context-aware information about images. Built with Google Generative AI (Gemini) and Streamlit, it generates captions, offers detailed descriptions with emojis, and lets users ask questions about image content.

Whether you're using InsightLens to enhance content creation, explore visual storytelling, or analyze images for insights, this tool provides a seamless and interactive experience that’s both informative and engaging.

Live Demo

Experience InsightLens in action! 👉🏻 Try InsightLens! 🌟


Explore the story behind every image with InsightLens!

InsightLens Demo


Table of Contents

  1. Features
  2. How It Works
  3. Installation
  4. Usage
  5. Technologies Used
  6. Results
  7. Conclusion
  8. Future Enhancements
  9. License
  10. Contact

Features🌟

  • Automatic Captioning: Generates a brief, one-line caption for uploaded images.
  • Detailed Descriptions: Provides concise summaries that highlight the primary content and context of the image.
  • Image Q&A: Users can ask questions about the image's content, with responses powered by Gemini AI.
  • Interactive User Interface: InsightLens is designed with animations, style effects, and balloons for a lively experience.
  • Privacy by Design: InsightLens does not store any images or questions asked, ensuring a secure and private interaction every time.

How It Works🧠

  1. Upload an Image: The user uploads an image in .jpg, .jpeg, or .png format.
  2. Automatic Captioning: InsightLens auto-generates a caption using Gemini AI.
  3. Detailed Summaries: A structured, emoji-enhanced description provides a deeper understanding of the image.
  4. Interactive Q&A: Users can ask specific questions about the image, and the AI responds with insightful answers.
  5. Visual Enhancements: InsightLens offers a polished user experience with glowing titles, fade-in effects, and celebratory balloons upon successful interactions.
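The upload step above could be sketched roughly as follows. This is a minimal illustration, not the project's actual app.py: the names `is_supported_image` and `render_upload_step` are hypothetical, and the page layout is assumed. It uses Streamlit's `st.file_uploader`, `st.image`, and `st.balloons`, and opens the upload with Pillow directly from memory.

```python
from pathlib import Path

# Formats listed in step 1 of the walkthrough above.
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def is_supported_image(filename: str) -> bool:
    """Return True if the file name ends in a supported image extension."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


def render_upload_step():
    """Hypothetical Streamlit page body: upload, preview, celebrate."""
    # Imported here so the helper above stays usable without Streamlit or Pillow.
    import streamlit as st
    from PIL import Image

    # The `type` list makes the upload widget reject other extensions.
    uploaded = st.file_uploader("Upload an image", type=["jpg", "jpeg", "png"])
    if uploaded is None:
        return None
    if not is_supported_image(uploaded.name):
        st.error("Please upload a .jpg, .jpeg, or .png file.")
        return None

    # The upload is a file-like object, so Pillow reads it from memory
    # and nothing is written to disk.
    image = Image.open(uploaded)
    st.image(image, caption="Uploaded image")
    st.balloons()
    return image
```

The extension check is case-insensitive, so `is_supported_image("cat.JPG")` is accepted while `is_supported_image("notes.pdf")` is not.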

Installation🛠

  1. Clone the repository:

    git clone https://github.com/hk-kumawat/InsightLens.git
  2. Install dependencies:

    pip install -r requirements.txt
  3. Set up environment variables:

    • Obtain an API key from Google Generative AI.
    • Create a .env file in the root directory and add:
      GEMINI_API_KEY=your_gemini_api_key
    • Replace your_gemini_api_key with your actual Gemini API Key.
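For reference, one way the app might read this key at startup is sketched below. The function names `get_gemini_api_key` and `configure_gemini` are hypothetical and the project's own code may differ. The sketch relies on `load_dotenv` from python-dotenv and `genai.configure` from google-generativeai.

```python
import os


def get_gemini_api_key(env=os.environ) -> str:
    """Return GEMINI_API_KEY from the environment, or fail with a clear message."""
    key = env.get("GEMINI_API_KEY", "").strip()
    # Catch both a missing key and the unedited placeholder from the .env template.
    if not key or key == "your_gemini_api_key":
        raise RuntimeError("Set GEMINI_API_KEY in your .env file before running the app.")
    return key


def configure_gemini():
    """Load .env and hand the key to the Gemini client (hypothetical helper)."""
    # Third-party imports are local so the key check above works on its own.
    from dotenv import load_dotenv
    import google.generativeai as genai

    load_dotenv()  # reads key=value pairs from .env into os.environ
    genai.configure(api_key=get_gemini_api_key())
    return genai
```

Failing early on a missing or placeholder key gives a clearer error than waiting for the first API request to be rejected.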

Usage🚀

  1. Run the Streamlit App:

    streamlit run app.py
  2. Upload and Explore:

    • Upload an image to instantly receive a caption and a detailed description, then ask questions about the image.

Technologies Used💻

  • Programming Language: Python

  • Libraries:

    • streamlit — For creating the user interface.
    • Pillow — For image processing.
    • python-dotenv — Manages environment variables.
    • google-generativeai — For generating captions, descriptions, and answering questions.
  • API:

    • Gemini API by Google Generative AI — Powers the core captioning, description generation, and Q&A functionalities.
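As a rough illustration of how these pieces could fit together, the sketch below sends a prompt and an image to Gemini through google-generativeai. The prompt wording, the helper names, and the model name `gemini-1.5-flash` are all assumptions made for this example, not taken from the project. It also assumes `genai.configure(api_key=...)` has already been called.

```python
# Assumed prompt wording; the project's real prompts may differ.
CAPTION_PROMPT = "Write a brief, one-line caption for this image."
DESCRIPTION_PROMPT = (
    "Describe this image in a short, structured summary. "
    "Use emojis to highlight the main content and context."
)


def build_question_prompt(question: str) -> str:
    """Wrap a user's question in an instruction for the model."""
    question = question.strip()
    if not question:
        raise ValueError("Question must not be empty.")
    return f"Answer this question about the image: {question}"


def ask_gemini(prompt: str, image, model_name: str = "gemini-1.5-flash") -> str:
    """Send a text prompt plus a PIL image to Gemini and return the reply text.

    The model name is an assumption; use whichever Gemini model your key can access.
    """
    # Imported here so the prompt helpers work without the SDK installed.
    import google.generativeai as genai

    model = genai.GenerativeModel(model_name)
    response = model.generate_content([prompt, image])
    return response.text
```

With an image opened through Pillow, a caption would come from `ask_gemini(CAPTION_PROMPT, image)` and an answer from `ask_gemini(build_question_prompt("What is in the background?"), image)`.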

Results🏆

InsightLens successfully analyzes images, providing an insightful, one-line caption along with a structured, emoji-based description and engaging Q&A responses. This AI-powered analysis is useful in applications ranging from social media to education and creative design.

InsightLens Example

In the above example, InsightLens provides a brief caption, structured description, and accurate answers to user questions about the image content.


Conclusion📚

The InsightLens project exemplifies the potential of AI in image analysis, creating a valuable tool for visually rich applications. By integrating Google Generative AI (Gemini) and Streamlit, InsightLens enables users to explore image content in a meaningful and interactive way.

Real-world applications for InsightLens include content creation (captions and descriptions for social media), education (visual storytelling), and research (image-based inquiries). This makes InsightLens a versatile tool for both personal and professional use.


Future Enhancements🚀

  1. Multi-turn Conversation: Enable the assistant to maintain conversation context across multiple interactions.
  2. Advanced Emotion Detection: Add sentiment analysis that identifies a wider range of emotional tones in an image.
  3. Integration with External Services: Extend InsightLens’s functionality to connect with APIs for additional insights (e.g., related news or facts about image objects).
  4. Voice Interaction: Add voice input/output for a more dynamic user experience.

License📝

This project is licensed under the MIT License — see the LICENSE file for details.


Contact

📬 Get in Touch!

Feel free to reach out for collaborations or questions:

  • GitHub 💻 — Explore more projects by Harshal Kumawat.
  • LinkedIn 🌐 — Let's connect professionally.
  • Email 📧 — Reach out for inquiries or collaboration.

Where visuals meet intelligence—thank you for discovering what InsightLens can do! Let's keep exploring new horizons together. 🌍🔍

"Every image has a story; let InsightLens help you discover it, one image at a time." - Harshal Kumawat