Code for ALBEF: a new vision-language pre-training method
Updated Sep 20, 2022 - Python
Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm
Data release for the ImageInWords (IIW) paper.
mPLUG: Effective and Efficient Vision-Language Learning by Cross-modal Skip-connections. (EMNLP 2022)
The largest multilingual image-text classification dataset. It contains fashion products.
Quality-Aware Image-Text Alignment for Real-World Image Quality Assessment
A client library for LAION's effort to filter CommonCrawl with CLIP, building a large scale image-text dataset.
Wrapper for PHP's GD Library for easy image manipulation. Support for scaling multi-line text, shapes, filters and smart resize.
WWDC22: Enabling Live Text interactions with images in SwiftUI
A server powering LAION's effort to filter CommonCrawl with CLIP, building a large scale image-text dataset.
Download the Flickr8k and Flickr30k image caption datasets
An Interactive Game-based Vision Planning benchmark
Contrastive Learning Representations for Images and Text Pairs. Colab implementation of ConVIRT for transfer learning with insufficient data volume.
This project is a FastAPI-based web application designed to analyze Cambridge IELTS PDFs (Books 1-18) for the most and least repeated words. It can handle both regular text-based PDFs and scanned image-based PDFs by converting them to images and extracting text using OCR (Optical Character Recognition).
Image Captioning With MobileNet-LLaMA 3
Caption generator using lavis and argostranslate