
How to prepare eval dataset. #57

Open
linxid opened this issue Jun 4, 2024 · 4 comments



linxid commented Jun 4, 2024

[screenshot of the downloaded data files]

This is amazing work! I'm trying to evaluate model performance, but in the Video-ChatGPT dataset I can only find the data shown above. How can I get the test_q.json and test_a.json files?

Caixin89 commented Jun 7, 2024

I downloaded both the QA pairs and the videos from https://mbzuai-oryx.github.io/Video-ChatGPT/
Then I created a folder structure within DATAS as follows:

DATAS/
└── VCGBench/
    ├── Videos/
    │   └── Benchmarking
    └── Zero_Shot_QA

Then place the QA JSON files in Zero_Shot_QA/ and the videos in Benchmarking/.

I deduced the structure from code at https://github.com/magic-research/PLLaVA/blob/main/tasks/eval/vcgbench/__init__.py#L272-L301
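
If it helps, here is a minimal Python sketch of that setup. The `downloads/qa_jsons` and `downloads/videos` paths are placeholders for wherever you saved the Video-ChatGPT files; adjust them to your machine.

```python
from pathlib import Path
import shutil

# Placeholder paths: wherever you downloaded the Video-ChatGPT
# QA json files and benchmark videos.
qa_jsons = Path("downloads/qa_jsons")
videos = Path("downloads/videos")

# Build the layout the eval code expects.
root = Path("DATAS/VCGBench")
(root / "Videos" / "Benchmarking").mkdir(parents=True, exist_ok=True)
(root / "Zero_Shot_QA").mkdir(parents=True, exist_ok=True)

# Copy the QA pairs and videos into place.
for f in qa_jsons.glob("*.json"):
    shutil.copy2(f, root / "Zero_Shot_QA" / f.name)
for f in videos.iterdir():
    if f.is_file():
        shutil.copy2(f, root / "Videos" / "Benchmarking" / f.name)
```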


ermu2001 commented Jun 7, 2024

I've uploaded the evaluation data here: https://huggingface.co/datasets/ermu2001/PLLaVATesting/tree/main/DATAS

You can follow the instructions here on the dev branch to prepare this data directly. Also, I recommend switching to dev, as we fixed some bugs there.
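
For anyone who prefers a scripted download, a minimal sketch using huggingface_hub (my suggestion, not an official script; cloning the repo with Git LFS works as well):

```python
# Requires: pip install huggingface_hub
from huggingface_hub import snapshot_download

# Fetch only the DATAS/ folder of the dataset repo so it lands
# directly under the working directory as DATAS/...
snapshot_download(
    repo_id="ermu2001/PLLaVATesting",
    repo_type="dataset",
    local_dir=".",
    allow_patterns=["DATAS/*"],
)
```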


linxid commented Jun 28, 2024

Thanks!


hb-jw commented Aug 1, 2024

Hello! I would also like to test the video_chatgpt benchmark, but it needs GPT assistance. Could you tell me approximately how much it costs to run the benchmark once?
