Source code for the AAAI 2025 (Oral) paper *Attribution Analysis Meets Model Editing: Advancing Knowledge Correction in Vision Language Models with VisEdit*.
- Python version: 3.11.9
- Please download the E-EVQA and E-IC datasets from the URL provided in [1] and place the related folders in the `data` directory.
- Please modify `ROOT_PATH` in `utils/GLOBAL.py` to the absolute path of the current directory, and update `model_path_map` to the absolute paths of each backbone's weights.
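For reference, after editing, `utils/GLOBAL.py` should look roughly like the sketch below. The variable names `ROOT_PATH` and `model_path_map` come from the instructions above; all paths and the backbone key are placeholders you must replace with your own:

```python
# utils/GLOBAL.py -- a minimal sketch of the expected layout; the actual file
# may define additional variables. All paths below are placeholders.

# Absolute path of this repository's root directory.
ROOT_PATH = "/abs/path/to/VisEdit"

# Map each backbone name (as passed via the -mn flag, e.g. "llava") to the
# absolute path of its downloaded weights.
model_path_map = {
    "llava": "/abs/path/to/llava-weights",
}
```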
- Please run `contribution_module.py`; a Jupyter Notebook is recommended for better display.
- Please run `contribution_visual_reps.py`; a Jupyter Notebook is recommended for better display.
Please use the following script to train a VEAD:
`python vead_train.py -mn llava -dna EVQA -bs 4 -dvc "cuda:0" -edvc 1`
Please use the following script to test a VEAD:
`python vead_test.py -mn llava -dn EVQA -dvc "cuda:0" -ckpt [vead_checkpoint_path]`
Please cite our paper if this work has inspired or assisted you :)
@inproceedings{DBLP:conf/aaai/Chen00HWL25,
author = {Qizhou Chen and
Taolin Zhang and
Chengyu Wang and
Xiaofeng He and
Dakan Wang and
Tingting Liu},
editor = {Toby Walsh and
Julie Shah and
Zico Kolter},
title = {Attribution Analysis Meets Model Editing: Advancing Knowledge Correction
in Vision Language Models with VisEdit},
booktitle = {AAAI-25, Sponsored by the Association for the Advancement of Artificial
Intelligence, February 25 - March 4, 2025, Philadelphia, PA, {USA}},
pages = {2168--2176},
publisher = {{AAAI} Press},
year = {2025},
url = {https://doi.org/10.1609/aaai.v39i2.32215},
doi = {10.1609/AAAI.V39I2.32215},
timestamp = {Thu, 17 Apr 2025 17:08:57 +0200},
biburl = {https://dblp.org/rec/conf/aaai/Chen00HWL25.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}

[1] Can We Edit Multimodal Large Language Models? (EMNLP 2023)