Official repo of the paper "Reasoning Paths with Reference Objects Elicit Quantitative Spatial Reasoning in Large Vision-Language Models"

andrewliao11/Q-Spatial-Bench-code

Helper code to evaluate Q-Spatial Bench

Q-Spatial Bench is a benchmark designed to measure quantitative spatial reasoning 📏 in large vision-language models.

🔥 The paper associated with Q-Spatial Bench has been accepted to the EMNLP 2024 main track!

  • Our paper: Reasoning Paths with Reference Objects Elicit Quantitative Spatial Reasoning in Large Vision-Language Models [arXiv link]
  • Project website: [link]

Usage

Dataset Download

Download the dataset from the Hugging Face Hub:

from datasets import load_dataset
dataset = load_dataset("andrewliao11/Q-Spatial-Bench")

The dataset object has the following structure:

DatasetDict({
    QSpatial_plus: Dataset({
        features: ['question', 'answer_value', 'answer_unit', 'question_type', 'image_path', 'image'],
        num_rows: 101
    })
    QSpatial_scannet: Dataset({
        features: ['question', 'answer_value', 'answer_unit', 'question_type', 'image_path', 'image'],
        num_rows: 170
    })
})
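As a minimal sketch of how the features listed above fit together, the helper below formats a single record as a readable line. The function name `describe_example` is hypothetical and not part of this repository; it only assumes that a record behaves like a mapping with the `question`, `answer_value`, and `answer_unit` keys shown in the structure.

```python
from typing import Any, Mapping


def describe_example(example: Mapping[str, Any]) -> str:
    """Format one record as 'question -> value unit'.

    Hypothetical helper: it relies only on the feature names listed in the
    dataset structure ('question', 'answer_value', 'answer_unit').
    """
    return (
        f"{example['question']} -> "
        f"{example['answer_value']} {example['answer_unit']}"
    )
```

With the dataset loaded as shown earlier, `describe_example(dataset["QSpatial_plus"][0])` should return the first question together with its ground-truth value and unit.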

For QSpatial_scannet

The images for this split must be downloaded manually from ScanNet. To access ScanNet images, you need to request permission here. Once permission is granted, you will receive instructions by email, including access to a Python file named download-scannet.py.

Once you have download-scannet.py, run the following commands to download the images used in QSpatial-ScanNet:

mv download-scannet.py <REPO_ROOT>/QSpatial_scannet
cd <REPO_ROOT>/QSpatial_scannet
python download_and_render_scannet_images.py

Iterate over the Dataset

We provide an example Jupyter notebook at examples/iterate_over_dataset.ipynb.
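The notebook is the reference for iterating over the benchmark. For a quick look without opening it, a rough sketch along these lines may help: it tallies how many records fall under each `question_type`. The helper name `count_question_types` is hypothetical, and it assumes only that a split can be iterated as a sequence of mappings carrying the `question_type` feature.

```python
from collections import Counter
from typing import Any, Iterable, Mapping


def count_question_types(split: Iterable[Mapping[str, Any]]) -> Counter:
    """Tally records per 'question_type' value.

    Hypothetical helper: works on any iterable of dict-like records,
    so it can also be tried on a plain list of dicts.
    """
    return Counter(example["question_type"] for example in split)
```

Assuming `dataset` is the `DatasetDict` loaded earlier, `for name, split in dataset.items(): print(name, count_question_types(split))` would print one tally per split.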

Evaluation

We provide an example Jupyter notebook at examples/evaluate_success_rate.ipynb.
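The notebook remains the authoritative definition of the success rate. As an illustration only, the sketch below assumes a ratio-threshold test: a prediction counts as a success when the larger of `pred / target` and `target / pred` stays below a threshold. The function name, the default `delta=2.0`, the strict inequality, and the treatment of non-positive values as failures are all assumptions and are not taken from this repository.

```python
from typing import Sequence


def success_rate(
    predictions: Sequence[float],
    targets: Sequence[float],
    delta: float = 2.0,  # assumed threshold; check the notebook for the real one
) -> float:
    """Fraction of predictions with max(pred/target, target/pred) < delta.

    Both sequences must already be expressed in the same unit.
    """
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not targets:
        raise ValueError("at least one prediction/target pair is required")

    hits = 0
    for pred, target in zip(predictions, targets):
        # Non-positive values cannot pass a ratio test; count them as failures.
        if pred <= 0 or target <= 0:
            continue
        if max(pred / target, target / pred) < delta:
            hits += 1
    return hits / len(targets)
```

Because predictions and ground truths may come in different units, both `answer_value` and the model output would need converting to a common unit before calling this function.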

Citation

@misc{liao2024reasoningpathsreferenceobjects,
      title={Reasoning Paths with Reference Objects Elicit Quantitative Spatial Reasoning in Large Vision-Language Models}, 
      author={Yuan-Hong Liao and Rafid Mahmood and Sanja Fidler and David Acuna},
      year={2024},
      eprint={2409.09788},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2409.09788}, 
}

For any questions, feel free to reach out to Yuan-Hong Liao (andrew@cs.toronto.edu).
