
Vision-Guided Direct Preference Optimization

This repository contains code and analysis for the paper: V-DPO: Mitigating Hallucination in Large Vision Language Models via Vision-Guided Direct Preference Optimization. Below is the framework of our proposed method (including data collection and preference learning).

[Figure: model framework]

Environment Setup

conda env create --file llava_dpo.yaml
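
After creation, activate the environment before running any scripts (this assumes the environment name defined in llava_dpo.yaml is llava_dpo; adjust if the yaml names it differently):

conda activate llava_dpo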

Run V-DPO

Our main training code is in ./llava_dpo/train/dpo_train.py and ./llava_dpo/train/llava_trainer.py.

To run V-DPO on the RLHF-V dataset:

bash scripts/v1_5/vdpo.sh
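
For orientation, below is a minimal sketch of the vanilla DPO objective that preference-training code such as dpo_train.py typically builds on. The function and variable names are illustrative, not the repository's actual API, and the sketch omits the vision-guided conditioning that V-DPO adds on top of standard DPO.

import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    # Implicit rewards: log-ratio of the policy vs. the frozen reference model,
    # computed from per-example sequence log-probabilities.
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Maximize the margin between chosen and rejected implicit rewards.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()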

Citation

@article{xie2024vdpo,
  title={V-DPO: Mitigating Hallucination in Large Vision Language Models via Vision-Guided Direct Preference Optimization},
  author={Xie, Yuxi and Li, Guanzhen and Xu, Xiao and Kan, Min-Yen},
  year={2024}
}

This repository is adapted from the codebase of LLaVA.
