Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.
Unified embedding generation and search engine. Also available on cloud - cloud.marqo.ai
Open-source push notification service with no app required; on iOS 14+ it works by scanning a QR code. Also supports Quick App, iOS and Mac clients, an Android client, and DIY devices.
Effortless data labeling with AI support from Segment Anything and other awesome models.
OpenMMLab Pre-training Toolbox and Benchmark
Chinese NLP solutions (large models, data, models, training, and inference)
Collection of AWESOME vision-language models for vision tasks
Easily compute clip embeddings and build a clip retrieval system with them
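The retrieval step these tools automate can be sketched in a few lines: once image embeddings are precomputed by a CLIP model, search reduces to cosine similarity between a text query embedding and the image embeddings. The toy vectors below are placeholders; in practice they would come from a CLIP encoder.

```python
import numpy as np

# Hypothetical precomputed CLIP image embeddings (one row per image) and a
# text query embedding; real values would come from a CLIP model's encoders.
image_embeddings = np.array([
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.5, 0.5, 0.7],
])
query = np.array([1.0, 0.0, 0.1])

def retrieve(query, embeddings, k=2):
    """Return indices of the top-k embeddings by cosine similarity."""
    q = query / np.linalg.norm(query)
    e = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    scores = e @ q            # cosine similarity of each image to the query
    return np.argsort(-scores)[:k]

print(retrieve(query, image_embeddings))  # indices of the nearest images
```

Normalizing both sides turns the dot product into cosine similarity, which is how CLIP-based retrieval systems typically rank results before any approximate-nearest-neighbor indexing is added for scale.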
Rapid Android UI development that tames the quirks of native widgets
Cambrian-1 is a family of multimodal LLMs with a vision-centric design.
Must-have resource for anyone who wants to experiment with and build on the OpenAI vision API 🔥
Open-source evaluation toolkit for large vision-language models (LVLMs), supporting 160+ VLMs and 50+ benchmarks
[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted for spatiotemporal video representation. We also introduce a rigorous 'Quantitative Evaluation Benchmarking' for video-based conversational models.
Awesome list for research on CLIP (Contrastive Language-Image Pre-Training).
Pocket-Sized Multimodal AI for content understanding and generation across multilingual texts, images, and 🔜 video, up to 5x faster than OpenAI CLIP and LLaVA 🖼️ & 🖋️
Search photos on Unsplash using natural language
Search inside YouTube videos using natural language
An official implementation for "CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval"