This repo houses the official PyTorch implementation for the following paper:

- **Listener Model for the PhotoBook Referential Game with CLIPScores as Implicit Reference Chain**
  Shih-Lun Wu, Yi-Hui Chou, and Liangze Li
  Annual Meeting of the Association for Computational Linguistics (ACL) 2023
  [ArXiv]
## Environment Setup

```bash
conda create -n photobook python=3.8.10
conda activate photobook
pip install -r requirements.txt
```

Then, in a Python shell (`python`), download the NLTK `punkt` tokenizer:

```python
>>> import nltk
>>> nltk.download('punkt')
```
## Preprocessing

- Dialogue segmentation
  - Reads `../data/data_splits.json` and saves the processed log data to `../data/{split}_sections.pickle`

  ```bash
  cd preprocess
  python dialogue_segmentation.py
  ```
- Generate CLIP scores
  - Reads `../data/{split}_sections.pickle` and saves the data to `../data/{split}_clean_sections.pickle` (an illustrative CLIPScore computation is sketched after this list)

  ```bash
  python process_section.py
  ```
- Extract image features with Segformer
  - Saves features at `../data/image_feats.pickle`; the saved data is a dictionary (key: image path, value: hidden features). A feature-extraction sketch also follows this list.

  ```bash
  python process_image.py
  ```
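For reference, CLIPScore (Hessel et al., 2021) rates image-text compatibility as a rescaled CLIP cosine similarity. The sketch below illustrates the metric with Hugging Face CLIP; `process_section.py` above is the repo's actual implementation, and the checkpoint, example file/caption, and `w = 2.5` rescaling here follow the CLIPScore paper, not necessarily this codebase.

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

ckpt = "openai/clip-vit-base-patch32"  # assumed checkpoint
model = CLIPModel.from_pretrained(ckpt).eval()
processor = CLIPProcessor.from_pretrained(ckpt)

image = Image.open("example.jpg")        # hypothetical candidate image
text = "a dog chasing a yellow frisbee"  # hypothetical dialogue utterance

inputs = processor(text=[text], images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    out = model(**inputs)

# image_embeds / text_embeds come out L2-normalized, so their
# dot product is the cosine similarity
cos_sim = (out.image_embeds * out.text_embeds).sum(dim=-1)
clipscore = 2.5 * torch.clamp(cos_sim, min=0.0)  # w = 2.5 per Hessel et al. (2021)
print(clipscore.item())
```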
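Similarly, here is a minimal sketch of the Segformer feature-extraction step with Hugging Face Transformers; `process_image.py` is the authoritative script, and the checkpoint name, image paths, and the use of `last_hidden_state` are illustrative assumptions.

```python
import pickle
import torch
from PIL import Image
from transformers import SegformerImageProcessor, SegformerModel

ckpt = "nvidia/segformer-b0-finetuned-ade-512-512"  # assumed checkpoint
processor = SegformerImageProcessor.from_pretrained(ckpt)
model = SegformerModel.from_pretrained(ckpt).eval()

image_paths = ["images/example_1.jpg", "images/example_2.jpg"]  # hypothetical paths

feats = {}
for path in image_paths:
    image = Image.open(path).convert("RGB")
    pixel_values = processor(image, return_tensors="pt").pixel_values
    with torch.no_grad():
        hidden = model(pixel_values).last_hidden_state  # (1, C, H', W') feature map
    feats[path] = hidden.squeeze(0)

# matches the documented container format of ../data/image_feats.pickle:
# a dictionary mapping image path -> hidden features
with open("../data/image_feats.pickle", "wb") as f:
    pickle.dump(feats, f)
```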
## Training & Inference

- Edit hyperparameters in `model/variables.py`
- Training (with the best-performing configuration):

  ```bash
  python3 train.py config_paper/EXPERIMENT_JSON exp/EXPERIMENT_NAME
  ```
- To reproduce the logged ablations, use a different `EXPERIMENT_JSON` for each run (see the usage example at the end of this section):

  | Experiment | `EXPERIMENT_JSON` |
  | --- | --- |
  | Ours | `vlscore_all.json` |
  | - VisAttn | `vlscore_visattn.json` |
  | - CLIPScore | `base_deberta.json` |
  | - CLIPScore + VisAttn | `visattn.json` |
  | - Dense learning signals | `vlscore_all.json` (change `DLS` to `False` in `model/variables.py`) |
- Tweak random seeds for optimal performance.
- Inference:

  ```bash
  python3 inference.py exp/EXPERIMENT_NAME
  ```
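For example, training the full model (the "Ours" row above) and running inference on the resulting checkpoints might look as follows; `exp/ours_run1` is an arbitrary run name, not a directory shipped with the repo:

```bash
python3 train.py config_paper/vlscore_all.json exp/ours_run1
python3 inference.py exp/ours_run1
```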
## Baseline Model Adapted from Takmaz et al., 2020
- Model implementation is based on the official PhotoBook repo
- To run the Takmaz baseline:

  ```bash
  cd takmaz_baseline/
  python3 train.py
  ```
- To use reference chains extracted using CLIPScore (see the sketch after this list):
  - set `REF_CHAIN_PATH = ref_chain_img_clipscore.pickle` in `takmaz_baseline/variables.py`
- Note that different random seeds might be needed in `takmaz_baseline/variables.py` for optimal results in different experiments.
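Concretely, the switch amounts to one line in `takmaz_baseline/variables.py`; whether the value is a bare filename or a longer path is an assumption, so match the formatting of the existing entry in that file:

```python
# in takmaz_baseline/variables.py: point the baseline at the
# CLIPScore-extracted reference chains (exact path format assumed)
REF_CHAIN_PATH = "ref_chain_img_clipscore.pickle"
```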
## Reference Chain Extraction

- This part is largely inherited from the official PhotoBook repo, except that we add the option to use CLIPScore as the scoring metric.
- To reproduce the whole extraction and evaluation procedure described in (Takmaz et al., 2020), run these commands in `chain-extraction`:

  ```bash
  python src/extract_segments.py out/all_segments.dict --stopwords --meteor --from_first_common --utterances_as_captions
  python src/make_chains.py out/all_segments.dict out/all_chains.json --score f1
  python src/make_gold_chains.py out/gold_chains.json --from_first_common --first_reference_only
  python src/make_dataset.py out/all_chains.json out/gold_chains.json out/dataset
  python src/extract_segments.py out/eval_segments.dict --path_game_logs data/logs/test_logs.dict --stopwords --meteor --from_first_common --utterances_as_captions
  python src/eval_chains.py out/eval_segments.dict
  ```
- To use CLIPScore as part of the scoring during extraction, simply add the `--clipscore` option when running `extract_segments.py` above.
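For example, the first extraction command above becomes:

```bash
python src/extract_segments.py out/all_segments.dict --stopwords --meteor --from_first_common --utterances_as_captions --clipscore
```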