Hi! I'm a 3rd-year CS Ph.D. student at the University of Maryland, College Park, working with Abhinav Shrivastava and Yaser Yacoob.
Pinned
- LRV-Instruction (Public): [ICLR'24] Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning
- VisualNews-Repository (Public): [EMNLP'21] Visual News: Benchmark and Challenges in News Image Captioning
- NVlabs/EAGLE (Public): EAGLE: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders
- tianyi-lab/HallusionBench (Public): [CVPR'24] HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models
- DocumentCLIP (Public): [ICPRAI 2024] DocumentCLIP: Linking Figures and Main Body Text in Reflowed Documents
Python 17