From 1caff9c7fbef7ca840c4100d3b82a2dd429eacc2 Mon Sep 17 00:00:00 2001
From: Junnan Li
Date: Tue, 16 May 2023 13:33:38 +0800
Subject: [PATCH] Update README.md

---
 README.md | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/README.md b/README.md
index 72e901113..874da99c1 100644
--- a/README.md
+++ b/README.md
@@ -27,6 +27,9 @@ # LAVIS - A Library for Language-Vision Intelligence
 ## What's New: 🎉
+* [Model Release] May 2023, released implementation of **InstructBLIP** <br>
+[Paper](https://arxiv.org/abs/2305.06500), [Project Page](https://github.com/salesforce/LAVIS/tree/main/projects/instructblip)
+> A new vision-language instruction-tuning framework built on BLIP-2 models, achieving state-of-the-art zero-shot generalization across a wide range of vision-language tasks.
 * [Model Release] Jan 2023, released implementation of **BLIP-2** <br>
 [Paper](https://arxiv.org/abs/2301.12597), [Project Page](https://github.com/salesforce/LAVIS/tree/main/projects/blip2), [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/salesforce/LAVIS/blob/main/examples/blip2_instructed_generation.ipynb)
 > A generic and efficient pre-training strategy that easily leverages pretrained vision models and large language models (LLMs) for vision-language pretraining. BLIP-2 beats Flamingo on zero-shot VQAv2 (**65.0** vs **56.3**) and establishes a new state of the art in zero-shot captioning (**121.6** CIDEr on NoCaps vs the previous best **113.2**). In addition, equipped with powerful LLMs (e.g., OPT, FlanT5), BLIP-2 unlocks new **zero-shot instructed vision-to-language generation** capabilities for various applications (see the usage sketches after the patch).
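
For readers who want to try the InstructBLIP release announced above, here is a minimal sketch using LAVIS's `load_model_and_preprocess` entry point. The model name `blip2_t5_instruct`, the model type `flant5xl`, the image path, and the instruction string are all assumptions for illustration, not taken from the patch; check the model zoo of your LAVIS version for the names it actually registers.

```python
import torch
from PIL import Image
from lavis.models import load_model_and_preprocess

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Assumed registry names for the InstructBLIP (FlanT5-XL) variant; run
# `from lavis.models import model_zoo; print(model_zoo)` to list available models.
model, vis_processors, _ = load_model_and_preprocess(
    name="blip2_t5_instruct", model_type="flant5xl", is_eval=True, device=device
)

# "example.jpg" is a placeholder path; substitute any RGB image.
raw_image = Image.open("example.jpg").convert("RGB")
image = vis_processors["eval"](raw_image).unsqueeze(0).to(device)

# Instruction-tuned generation: the prompt is a plain natural-language instruction.
output = model.generate({"image": image, "prompt": "Describe the image in detail."})
print(output)
```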
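The zero-shot instructed generation mentioned in the BLIP-2 item follows the same pattern; the linked Colab notebook walks through it in full. Below is a hedged sketch of that flow loading the OPT-2.7B variant (`blip2_opt` / `pretrain_opt2.7b`, per the LAVIS model zoo); the image path and prompt text are placeholder assumptions.

```python
import torch
from PIL import Image
from lavis.models import load_model_and_preprocess

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# BLIP-2 with a frozen OPT-2.7B language model; weights download on first use.
model, vis_processors, _ = load_model_and_preprocess(
    name="blip2_opt", model_type="pretrain_opt2.7b", is_eval=True, device=device
)

raw_image = Image.open("example.jpg").convert("RGB")  # placeholder path
image = vis_processors["eval"](raw_image).unsqueeze(0).to(device)

# Plain zero-shot captioning: no prompt, the model describes the image.
print(model.generate({"image": image}))

# Instructed generation: a text prompt steers the frozen LLM's output.
print(model.generate({"image": image, "prompt": "Question: which city is this? Answer:"}))
```

The prompting works because the LLM stays frozen during pre-training: the Q-Former only learns to emit soft visual tokens, so any prompt format the underlying LLM already understands can condition the generation.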