From 1caff9c7fbef7ca840c4100d3b82a2dd429eacc2 Mon Sep 17 00:00:00 2001
From: Junnan Li
Date: Tue, 16 May 2023 13:33:38 +0800
Subject: [PATCH] Update README.md

---
 README.md | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/README.md b/README.md
index 72e901113..874da99c1 100644
--- a/README.md
+++ b/README.md
@@ -27,6 +27,9 @@ # LAVIS - A Library for Language-Vision Intelligence
 ## What's New: 🎉
+* [Model Release] May 2023, released implementation of **InstructBLIP** <br>
+[Paper](https://arxiv.org/abs/2305.06500), [Project Page](https://github.com/salesforce/LAVIS/tree/main/projects/instructblip)
+> A new vision-language instruction-tuning framework built on BLIP-2 models, achieving state-of-the-art zero-shot generalization across a wide range of vision-language tasks.
 * [Model Release] Jan 2023, released implementation of **BLIP-2** <br>
 [Paper](https://arxiv.org/abs/2301.12597), [Project Page](https://github.com/salesforce/LAVIS/tree/main/projects/blip2), [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/salesforce/LAVIS/blob/main/examples/blip2_instructed_generation.ipynb)
 > A generic and efficient pre-training strategy that easily leverages pretrained vision models and large language models (LLMs) for vision-language pretraining. BLIP-2 beats Flamingo on zero-shot VQAv2 (**65.0** vs **56.3**) and establishes a new state of the art in zero-shot captioning (**121.6** CIDEr on NoCaps vs the previous best **113.2**). In addition, equipped with powerful LLMs (e.g., OPT, FlanT5), BLIP-2 unlocks new **zero-shot instructed vision-to-language generation** capabilities for various applications (see the usage sketches after the patch).
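
For readers who want to try the InstructBLIP release announced above, here is a minimal sketch using LAVIS's `load_model_and_preprocess` entry point. The model name `blip2_t5_instruct`, the model type `flant5xl`, the image path, and the instruction string are all assumptions for illustration, not taken from the patch; check the model zoo of your LAVIS version for the names it actually registers.

```python
import torch
from PIL import Image
from lavis.models import load_model_and_preprocess

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Assumed registry names for the InstructBLIP (FlanT5-XL) variant; run
# `from lavis.models import model_zoo; print(model_zoo)` to list available models.
model, vis_processors, _ = load_model_and_preprocess(
    name="blip2_t5_instruct", model_type="flant5xl", is_eval=True, device=device
)

# "example.jpg" is a placeholder path; substitute any RGB image.
raw_image = Image.open("example.jpg").convert("RGB")
image = vis_processors["eval"](raw_image).unsqueeze(0).to(device)

# Instruction-tuned generation: the prompt is a plain natural-language instruction.
output = model.generate({"image": image, "prompt": "Describe the image in detail."})
print(output)
```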
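The zero-shot instructed generation mentioned in the BLIP-2 item follows the same pattern; the linked Colab notebook walks through it in full. Below is a hedged sketch of that flow loading the OPT-2.7B variant (`blip2_opt` / `pretrain_opt2.7b`, per the LAVIS model zoo); the image path and prompt text are placeholder assumptions.

```python
import torch
from PIL import Image
from lavis.models import load_model_and_preprocess

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# BLIP-2 with a frozen OPT-2.7B language model; weights download on first use.
model, vis_processors, _ = load_model_and_preprocess(
    name="blip2_opt", model_type="pretrain_opt2.7b", is_eval=True, device=device
)

raw_image = Image.open("example.jpg").convert("RGB")  # placeholder path
image = vis_processors["eval"](raw_image).unsqueeze(0).to(device)

# Plain zero-shot captioning: no prompt, the model describes the image.
print(model.generate({"image": image}))

# Instructed generation: a text prompt steers the frozen LLM's output.
print(model.generate({"image": image, "prompt": "Question: which city is this? Answer:"}))
```

The prompting works because the LLM stays frozen during pre-training: the Q-Former only learns to emit soft visual tokens, so any prompt format the underlying LLM already understands can condition the generation.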