🖼️ Pydantic-AI MultiModal Image Processor

A demonstration project showcasing how to use MultiModal capabilities with Pydantic-AI to process and analyze images with OpenAI's multimodal LLMs. This project focuses on resume analysis and information extraction from images; feel free to customize it for your needs.

✨ Features

  • 🔄 Image processing using OpenAI's multimodal LLMs
  • 📊 Structured data extraction using Pydantic models
  • 🎨 Support for multiple image formats
  • 📄 Resume information extraction including:
    • 🔗 LinkedIn profile
    • 💻 GitHub profile
    • 📧 Email
    • 💼 Work experience
    • 🎓 Education
    • 🛠️ Skills
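
The fields above map naturally onto a Pydantic model for structured extraction. The actual model in app.py may differ; this is a hedged sketch of what the resume schema could look like, with all field names hypothetical:

```python
from typing import List, Optional

from pydantic import BaseModel, Field


class ResumeInfo(BaseModel):
    """Hypothetical schema for structured resume extraction."""

    linkedin: Optional[str] = None  # 🔗 LinkedIn profile URL
    github: Optional[str] = None    # 💻 GitHub profile URL
    email: Optional[str] = None     # 📧 Contact email
    work_experience: List[str] = Field(default_factory=list)  # 💼 one entry per role
    education: List[str] = Field(default_factory=list)        # 🎓 degrees and institutions
    skills: List[str] = Field(default_factory=list)           # 🛠️ skill keywords


# The LLM's JSON output can then be validated directly:
info = ResumeInfo.model_validate(
    {"email": "jane@example.com", "skills": ["Python", "Pydantic"]}
)
print(info.email)
```

Using a schema like this lets Pydantic reject malformed model output instead of silently passing it downstream.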

📋 Prerequisites

  • 🐍 Python 3.8+
  • 🔑 OpenAI API key
  • 📦 Git

🚀 Installation

  1. Clone the repository:
git clone https://github.com/rawheel/Pydantic-ai-MultiModal-Example.git
cd Pydantic-ai-MultiModal-Example
  2. Set up a virtual environment and install dependencies:
python -m venv venv
source venv/bin/activate  # On Windows, use: venv\Scripts\activate
pip install -r requirements.txt
  3. Create a .env file with your environment variables:
LLM_MODEL=gpt-4o-mini
OPENAI_API_KEY=your_openai_api_key_here
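
config.py presumably reads these variables at startup. A minimal stdlib-only sketch of that pattern (the real file may instead use a helper such as python-dotenv to load the .env file):

```python
import os

# Hedged sketch: the actual config.py may differ.
# Fall back to a sensible default when LLM_MODEL is unset.
LLM_MODEL = os.getenv("LLM_MODEL", "gpt-4o-mini")
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY", "")

if not OPENAI_API_KEY:
    # Warn early with a clear message instead of a cryptic API error later.
    print("Warning: OPENAI_API_KEY is not set; add it to your .env file")
```

Centralizing configuration this way keeps secrets out of the source tree and makes the model name swappable without code changes.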

📝 Usage

Basic usage with web image URLs:

from app import ImageSummarizer
# Initialize summarizer
summarizer = ImageSummarizer()
# Example image URLs
image_urls = [
    'https://example.com/path/to/image.jpg',
    # Add more image URLs as needed
]
# Run analysis
summary = summarizer.summarize(image_urls, "summarize the resume")
print(summary)
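
summarize takes web URLs. If you want to try a local image instead, one common workaround (not part of this repo; a hedged, stdlib-only sketch with a hypothetical helper name) is to base64-encode the file as a data: URL, which OpenAI's vision endpoints accept:

```python
import base64
import mimetypes
from pathlib import Path


def to_data_url(path: str) -> str:
    """Encode a local image file as a data: URL (hypothetical helper)."""
    mime = mimetypes.guess_type(path)[0] or "image/png"
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:{mime};base64,{encoded}"


# Example: a few PNG header bytes stand in for a real image file here.
Path("sample.png").write_bytes(b"\x89PNG\r\n\x1a\n")
print(to_data_url("sample.png")[:30])
```

The resulting string can be passed in image_urls in place of an https:// URL.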

Run the example script:

python app.py

📁 Project Structure

├── app.py            # Main application file
├── config.py         # Configuration settings
├── requirements.txt  # Project dependencies
├── .env             # Environment variables (create this)
└── README.md        # Project documentation

⚙️ Configuration

The project uses environment variables for configuration. Available options:

  • LLM_MODEL: The OpenAI model to use (example: "gpt-4o-mini")
  • OPENAI_API_KEY: Your OpenAI API key

🤝 Contributing

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

👤 Author

Raheel Siddiqui


⭐️ If you find this project useful, please consider giving it a star!
