A simple open-source SigLIP model fine-tuned on Genshin Impact image-text pairs.
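A fine-tuned SigLIP checkpoint like this can typically be queried through Hugging Face transformers. A minimal sketch of scoring image-text pairs, using the stock google/siglip-base-patch16-224 checkpoint as a stand-in (the image file name is a placeholder, not part of this repo):

```python
# Minimal sketch of image-text scoring with a SigLIP model via Hugging Face
# transformers. The checkpoint and image path are illustrative placeholders.
import torch
from PIL import Image
from transformers import AutoModel, AutoProcessor

model = AutoModel.from_pretrained("google/siglip-base-patch16-224")
processor = AutoProcessor.from_pretrained("google/siglip-base-patch16-224")

image = Image.open("character.png").convert("RGB")
texts = ["a Genshin Impact character", "a landscape screenshot"]

inputs = processor(text=texts, images=image,
                   padding="max_length", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# SigLIP is trained with a sigmoid loss, so per-pair probabilities come
# from a sigmoid over the logits rather than a softmax across texts.
probs = torch.sigmoid(outputs.logits_per_image)
print(probs)
```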
Pocket-Sized Multimodal AI for content understanding and generation across multilingual texts, images, and 🔜 video, up to 5x faster than OpenAI CLIP and LLaVA 🖼️ & 🖋️
Clipora is a powerful toolkit for fine-tuning OpenCLIP models using Low Rank Adapters (LoRA).
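Clipora's own API isn't shown here, but the low-rank adapter idea it builds on is easy to sketch in plain PyTorch: freeze a pretrained linear weight and learn a small rank-r update on top of it. A generic illustration of the technique, not Clipora's actual code:

```python
# Illustrative LoRA layer in plain PyTorch: the frozen base weight is
# augmented with a trainable low-rank update (B @ A) * scale. This is a
# generic sketch of the technique, not Clipora's implementation.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # freeze the pretrained weights
        self.lora_a = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # y = base(x) + scale * x A^T B^T
        return self.base(x) + (x @ self.lora_a.T @ self.lora_b.T) * self.scale

# Example: wrap one projection layer; in practice each attention projection
# in the CLIP transformer would be wrapped the same way.
layer = LoRALinear(nn.Linear(512, 512))
out = layer(torch.randn(4, 512))
```

Because lora_b starts at zero, the wrapped layer initially behaves exactly like the frozen base model, and only the small A/B matrices receive gradients.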
Using Docker Compose to start up Triton with the OpenCLIP model, to encode text into vectors.
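Once such a Triton container is up, text can be encoded from Python with tritonclient. A rough sketch, where the model name and tensor names ("openclip_text", "TEXT", "EMBEDDING") are assumptions; the real values come from the deployed model's config.pbtxt:

```python
# Rough sketch of querying a Triton-served OpenCLIP text encoder over HTTP.
# The model name and tensor names below are assumptions; the actual names
# are defined by the model repository's config.pbtxt.
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

text = np.array([b"a photo of a dog"], dtype=object)
inp = httpclient.InferInput("TEXT", text.shape, "BYTES")
inp.set_data_from_numpy(text)

result = client.infer(model_name="openclip_text", inputs=[inp])
embedding = result.as_numpy("EMBEDDING")  # e.g. shape (1, 512)
print(embedding.shape)
```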
Text-to-image search with OpenCLIP, Docker, Flask, and Faiss, plus a basic front-end.
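The core of such a search service is small: embed the image collection once with OpenCLIP, index the normalized vectors in Faiss, and retrieve by an embedded text query. A minimal sketch with illustrative file paths:

```python
# Minimal text-to-image search core: embed images with OpenCLIP, index them
# in a flat Faiss index, retrieve by text. Paths and checkpoint are examples.
import faiss
import torch
import open_clip
from PIL import Image

model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="laion2b_s34b_b79k")
tokenizer = open_clip.get_tokenizer("ViT-B-32")
model.eval()

paths = ["img0.jpg", "img1.jpg"]  # your image collection
with torch.no_grad():
    feats = torch.cat([
        model.encode_image(preprocess(Image.open(p).convert("RGB")).unsqueeze(0))
        for p in paths])
feats = torch.nn.functional.normalize(feats, dim=-1).numpy()

# Inner product on unit vectors == cosine similarity.
index = faiss.IndexFlatIP(feats.shape[1])
index.add(feats)

with torch.no_grad():
    q = model.encode_text(tokenizer(["a red car at night"]))
q = torch.nn.functional.normalize(q, dim=-1).numpy()

scores, ids = index.search(q, 2)
print([paths[i] for i in ids[0]], scores[0])
```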
[Official] [IROS 2024] A goal-oriented planning method to lift VLN performance for closed-loop navigation: simple, yet effective.
Searching Images: From CLIP and Beyond
Reproducible scaling laws for contrastive language-image learning (https://arxiv.org/abs/2212.07143)
Group images by provided labels using OpenAI/CLIP
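The grouping itself reduces to zero-shot classification: embed the label prompts once, then assign each image to its highest-scoring label. A minimal sketch with the openai/CLIP package, with labels and paths as placeholders:

```python
# Minimal sketch of grouping images by text labels with openai/CLIP:
# each image is assigned to the label whose embedding it matches best.
import collections
import clip
import torch
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

labels = ["dog", "cat", "car"]  # your provided labels
text = clip.tokenize([f"a photo of a {l}" for l in labels]).to(device)

groups = collections.defaultdict(list)
with torch.no_grad():
    text_feats = model.encode_text(text)
    text_feats /= text_feats.norm(dim=-1, keepdim=True)
    for path in ["img0.jpg", "img1.jpg"]:  # your images
        img = preprocess(Image.open(path).convert("RGB")).unsqueeze(0).to(device)
        img_feats = model.encode_image(img)
        img_feats /= img_feats.norm(dim=-1, keepdim=True)
        best = (img_feats @ text_feats.T).argmax().item()
        groups[labels[best]].append(path)

print(dict(groups))
```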
CLIP-based zero-shot instance segmentation.
Use SAM and OpenCLIP to perform zero-shot object detection on the COCO 2017 val split.
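The usual recipe: SAM proposes class-agnostic masks, and OpenCLIP scores each cropped region against the class prompts. A rough sketch, where the SAM checkpoint path, image path, and the 0.25 acceptance threshold are assumptions:

```python
# Rough sketch of SAM-proposed regions classified by OpenCLIP. The SAM
# checkpoint path, image path, and 0.25 threshold are assumptions.
import numpy as np
import torch
import open_clip
from PIL import Image
from segment_anything import sam_model_registry, SamAutomaticMaskGenerator

sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b.pth")
mask_gen = SamAutomaticMaskGenerator(sam)

model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="laion2b_s34b_b79k")
tokenizer = open_clip.get_tokenizer("ViT-B-32")

classes = ["person", "dog", "bicycle"]
with torch.no_grad():
    tfeat = model.encode_text(tokenizer([f"a photo of a {c}" for c in classes]))
    tfeat /= tfeat.norm(dim=-1, keepdim=True)

image = Image.open("coco_val.jpg").convert("RGB")
for m in mask_gen.generate(np.array(image)):
    x, y, w, h = m["bbox"]  # SAM boxes are XYWH
    crop = preprocess(image.crop((x, y, x + w, y + h))).unsqueeze(0)
    with torch.no_grad():
        ifeat = model.encode_image(crop)
        ifeat /= ifeat.norm(dim=-1, keepdim=True)
        probs = (100.0 * ifeat @ tfeat.T).softmax(dim=-1)[0]
    if probs.max() > 0.25:
        print(classes[probs.argmax()], (x, y, w, h), float(probs.max()))
```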
Using Segment-Anything and CLIP to generate pixel-aligned semantic features.
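With per-mask CLIP embeddings computed as in the detection sketch above, "pixel-aligned" simply means scattering each mask's embedding onto its pixels. A minimal sketch of that last step (later masks overwrite earlier pixels; merging strategies vary by project):

```python
# Sketch of pixel-aligned features: paint each SAM mask's CLIP embedding
# into an H x W x D feature map. A simple overwrite policy is assumed.
import numpy as np

def paint_features(masks, embeddings, height, width, dim):
    # masks: list of boolean H x W arrays from SAM
    # embeddings: matching list of D-dim CLIP vectors, one per mask crop
    feat_map = np.zeros((height, width, dim), dtype=np.float32)
    for mask, emb in zip(masks, embeddings):
        feat_map[mask] = emb
    return feat_map
```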