mutimodal

A private, local OCR solution using Meta's Llama 3.2 Vision model with a Streamlit interface. Processes images entirely offline, supporting formats like JPEG, PNG, and BMP.

open-source ocr streamlit mutimodal llm meta-ai ollama llama-3-2-vision local-ocr

Updated Nov 21, 2024
Python

anusha-chebolu / multimodal-rag

A multimodal RAG application using Qwen 2.5 VL, ColPali, and QdrantDB for text and image-based retrieval.

rag mutimodal qdrant-vector-database colpali qwen2-vl

Updated Mar 20, 2025
Jupyter Notebook

ashutoshkr45 / QD-RetNet

QD-RetNet: Efficient Retinal Disease Classification via Quantized Knowledge Distillation [MIUA-2025]

knowledge-distillation quantization-aware-training retinal-disease-detection mutimodal

Updated Jul 20, 2025
Python

gogotalk / furkids-ai-confounder-recruitment

Furkids AI is recruiting a reserve technical co-founder | Decode the silent language of pets, build the world’s leading multimodal intelligence system 🐾🚀

opencv computer-vision deep-learning pytorch health-tech emotion-recognition distributed-training aws-deployment mutimodal animal-behavior-modelling emotion-ai ai-startup pet-tech ai-cofounder startup-equtity founder-track taiwan-startup

Updated Feb 16, 2026

mutimodal

Here are 9 public repositories matching this topic...

duyu09 / MKTY-System

video-db / videodb-node

johnnyhank / MIRA-Multimodal-Intelligent-Robotic-Assistant

rekkles2 / Gaze-CIFAR-10

kingabzpro / Gemini-2-Pro-Chat

dwain-barnes / llama3.2-vision-ocr-streamlit

anusha-chebolu / multimodal-rag

ashutoshkr45 / QD-RetNet

gogotalk / furkids-ai-confounder-recruitment
