
Vision-Guided Direct Preference Optimization

This repository contains code and analysis for the paper: V-DPO: Mitigating Hallucination in Large Vision Language Models via Vision-Guided Direct Preference Optimization. Below is the framework of our proposed method (including data collection and preference learning).

[Figure: model framework]

Environment Setup

conda env create --file llava_dpo.yaml
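
After creation, activate the environment before running any scripts (this assumes the environment name defined in llava_dpo.yaml is llava_dpo; adjust if the yaml names it differently):

conda activate llava_dpo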

Run V-DPO

Our main training code is in ./llava_dpo/train/dpo_train.py and ./llava_dpo/train/llava_trainer.py.

To run V-DPO on the RLHF-V dataset:

bash scripts/v1_5/vdpo.sh
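
For orientation, below is a minimal sketch of the vanilla DPO objective that preference-training code such as dpo_train.py typically builds on. The function and variable names are illustrative, not the repository's actual API, and the sketch omits the vision-guided conditioning that V-DPO adds on top of standard DPO.

import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    # Implicit rewards: log-ratio of the policy vs. the frozen reference model,
    # computed from per-example sequence log-probabilities.
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Maximize the margin between chosen and rejected implicit rewards.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()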

Citation

@article{xie2024vdpo,
  title={V-DPO: Mitigating Hallucination in Large Vision Language Models via Vision-Guided Direct Preference Optimization},
  author={Xie, Yuxi and Li, Guanzhen and Xu, Xiao and Kan, Min-Yen},
  year={2024}
}

This repository is adapted from the codebase of LLaVA.
