Step 1: Prepare the dataset. Download the data from here. We use the setup from Mao et al. (2019); see the original instructions here: GitHub Repo.
To replicate the experiments, prepare your dataset as follows.
clevr
├── train
│   ├── images
│   ├── questions.json
│   ├── scenes-raw.json
│   ├── scenes.json
│   └── vocab.json
└── val
    ├── images
    ├── questions.json
    ├── scenes-raw.json
    ├── scenes.json
    └── vocab.json
You can download all images from the official website of the CLEVR dataset and put them under the images/ folders. The questions.json and scenes-raw.json files can also be found on the website.
Next, you need to add object detection results for the scenes. Here, we use the tools provided by ns-vqa: in short, a pre-trained Mask R-CNN is used to detect all objects. We provide the scenes.json files with detected object bounding boxes at this download link. The vocab.json file can be downloaded at this link as well.
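As a quick sanity check, you can verify the directory layout with a short script. The following is a minimal sketch of our own (not part of the repo), assuming the data lives under data/clevr:

# check_clevr_layout.py -- hypothetical helper, not part of the repo.
# Verifies that the dataset directory matches the layout shown above.
import os
import sys

EXPECTED = ['images', 'questions.json', 'scenes-raw.json', 'scenes.json', 'vocab.json']

root = sys.argv[1] if len(sys.argv) > 1 else 'data/clevr'
ok = True
for split in ('train', 'val'):
    for name in EXPECTED:
        path = os.path.join(root, split, name)
        if not os.path.exists(path):
            print('missing:', path)
            ok = False
sys.exit(0 if ok else 1)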
Step 2: Generate groundtruth programs for CLEVR/train and CLEVR/val.
jac-run scripts/gen-clevr-gt-program.py --input data/clevr/train/questions.json --output data/clevr/train/questions-ncprogram-gt.pkl
jac-run scripts/gen-clevr-gt-program.py --input data/clevr/val/questions.json --output data/clevr/val/questions-ncprogram-gt.pkl
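To sanity-check the generated parses, you can load the pickle and inspect a few records. This is a minimal sketch under the assumption that the file is a plain pickle of question-to-program entries; the exact structure is defined by scripts/gen-clevr-gt-program.py and may differ:

# inspect_gt_programs.py -- illustrative sketch; adjust to the actual
# structure produced by scripts/gen-clevr-gt-program.py.
import itertools
import pickle

with open('data/clevr/train/questions-ncprogram-gt.pkl', 'rb') as f:
    parses = pickle.load(f)

print(type(parses), len(parses))

# Print the first few records to see the program format.
records = parses.items() if isinstance(parses, dict) else parses
for record in itertools.islice(iter(records), 3):
    print(record)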
Step 3: Training (10% Data Efficiency). Here --data-retain 0.1 keeps only 10% of the training questions, and --data-tvsplit 0.95 uses a 95/5 train/validation split.
You will need Cython to compile some libraries. Please install Cython before you run the training commands.
pip install Cython
jac-crun 0 scripts/trainval-clevr.py --desc experiments/desc_neuro_codex_clevr_learned_belongings.py \
--data-dir data/clevr/train \
--data-parses data/clevr/train/questions-ncprogram-gt.pkl data/clevr/val/questions-ncprogram-gt.pkl \
--curriculum all --expr original --validation-interval 5 --config model.learned_belong_fusion=plus --data-tvsplit 0.95 --data-retain 0.1
Step 4: Training (100% Data Efficiency). The command is identical to Step 3 except that --data-retain is omitted, so all training questions are used.
jac-crun 0 scripts/trainval-clevr.py --desc experiments/desc_neuro_codex_clevr_learned_belongings.py \
--data-dir data/clevr/train \
--data-parses data/clevr/train/questions-ncprogram-gt.pkl data/clevr/val/questions-ncprogram-gt.pkl \
--curriculum all --expr original --validation-interval 5 --config model.learned_belong_fusion=plus --data-tvsplit 0.95
Step 5: Evaluation.
jac-crun 0 scripts/trainval-clevr.py --desc experiments/desc_neuro_codex_clevr_learned_belongings.py \
--data-dir data/clevr/train \
--data-parses data/clevr/train/questions-ncprogram-gt.pkl data/clevr/val/questions-ncprogram-gt.pkl \
--curriculum all --expr original --validation-interval 5 --config model.learned_belong_fusion=plus --data-tvsplit 0.95 \
--load <TRAINED_CHECKPOINT_FILE> \
--validation-data-dir data/clevr/val --evaluate
To train and evaluate on CLEVR-Humans, start from the checkpoint trained on the original CLEVR:
jac-crun 0 scripts/trainval-clevr.py --desc experiments/desc_neuro_codex_clevr_learned_belongings.py \
--data-dir data/clevr-humans/train --data-parses clevr_humans_gpt35.pkl \
--curriculum none --expr human --data-tvsplit 0.95 --validation-interval 5 --config model.learned_belong_fusion=plus \
--load <TRAINED_CHECKPOINT_FILE>
Here, <TRAINED_CHECKPOINT_FILE> should be replaced by the checkpoint trained on the original CLEVR. We have provided the --data-parses file inside data/clevr-parsings/.
To generate those data-parses files yourself, run the commands inside the prompts/ directory. You need to install the openai package (pip install openai) before running the command.
jac-run run-gpt35-prompt.py --dataset clevr --questions <PATH_TO_QUESTIONS> --output prompt-clevr-humans.pkl --prompt prompts-clevr.txt
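At its core, the prompting script sends each question to the chat completion endpoint with a fixed prompt prefix and saves the returned programs. The sketch below illustrates that idea with the pre-1.0 openai API; the file names, JSON fields, and output format are illustrative assumptions, not the actual run-gpt35-prompt.py:

# Rough sketch of GPT-3.5 question-to-program prompting (pre-1.0 openai API).
# All file names and record fields here are illustrative assumptions.
import json
import pickle

import openai  # pip install "openai<1.0"

with open('prompts-clevr.txt') as f:
    prompt_prefix = f.read()

# CLEVR-style question files store a list of records under the 'questions' key.
with open('questions.json') as f:
    questions = [q['question'] for q in json.load(f)['questions']]

parses = []
for question in questions:
    response = openai.ChatCompletion.create(
        model='gpt-3.5-turbo',
        messages=[{'role': 'user', 'content': prompt_prefix + '\nQ: ' + question}],
    )
    parses.append({
        'question': question,
        'program': response['choices'][0]['message']['content'],
    })

with open('prompt-clevr-humans.pkl', 'wb') as f:
    pickle.dump(parses, f)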
To evaluate transfer to referring expressions:
scripts/trainval-clevr.py --desc experiments/desc_neuro_codex_clevr_learned_belongings.py \
--data-dir data/clevr-mini --data-parses questions-ncprogram-gt.pkl transfer-questions-ncprogram-gt.json \
--expr transfer --config model.learned_belong_fusion=plus \
--load <TRAINED_CHECKPOINT_FILE> \
--evaluate-custom ref --data-questions-json refexps-20230513.json
Note that here we use the CLEVR-Mini dataset from NS-VQA, as we need the groundtruth set of objects.
You can also generate your own dataset using the script scripts/gen-clevr-ref.py. This script uses the groundtruth logic programs. To use the programs generated by GPT-4, use the files inside data/clevr-parsings/.
To evaluate transfer to puzzle questions:
scripts/trainval-clevr.py --desc experiments/desc_neuro_codex_clevr_learned_belongings.py \
--data-dir data/clevr-mini --data-parses questions-ncprogram-gt.pkl transfer-questions-ncprogram-gt.json \
--expr transfer --config model.learned_belong_fusion=plus \
--load <TRAINED_CHECKPOINT_FILE> \
--evaluate-custom puzzle --data-questions-json puzzle-20230513.json
You can also generate your own dataset using the script scripts/gen-clevr-puzzle.py. This script uses the groundtruth logic programs. To use the programs generated by GPT-4, use the files inside data/clevr-parsings/.
To evaluate transfer to RPM questions:
scripts/trainval-clevr.py --desc experiments/desc_neuro_codex_clevr_learned_belongings.py \
--data-dir data/clevr-mini --data-parses questions-ncprogram-gt.pkl transfer-questions-ncprogram-gt.json \
--expr transfer --config model.learned_belong_fusion=plus \
--load <TRAINED_CHECKPOINT_FILE> \
--evaluate-custom rpm --data-questions-json rpm-20230513.json
You can also generate your own dataset using the script scripts/gen-clevr-rpm.py. This script uses the groundtruth logic programs. To use the programs generated by GPT-4, use the files inside data/clevr-parsings/.
To generate those data-parses files, run the commands inside the prompts/ directory. You need to install the openai package (pip install openai) before running the commands.
jac-run run-gpt4-prompt.py --dataset clevr-puzzles --questions <PATH_TO>/puzzle-20230513.json --output clevr_transfer_puzzle_gpt4.pkl --prompt prompts-clevr-transfer.txt
jac-run run-gpt4-prompt.py --dataset clevr-refexps --questions <PATH_TO>/refexps-20230513.json --output clevr_transfer_ref_gpt4.pkl --prompt prompts-clevr-transfer.txt
jac-run run-gpt4-prompt.py --dataset clevr-rpms --questions <PATH_TO>/rpm-20230513.json --output clevr_transfer_rpm_gpt4.pkl --prompt prompts-clevr-transfer.txt