Advances in Visual Dialog
Last updated on 2022/10/16.
-
Visual Dialog, CVPR 2017, [code]
-
Best of Both Worlds: Transferring Knowledge from Discriminative Learning to a Generative Visual Dialog Model, NIPS 2017, [code]
-
Are You Talking to Me? Reasoned Visual Dialog Generation through Adversarial Learning, CVPR 2018
-
Image-Question-Answer Synergistic Network for Visual Dialog, CVPR 2019
-
Reasoning Visual Dialogs with Structural and Partial Observations, CVPR 2019, [code]
-
Recursive Visual Attention in Visual Dialog, CVPR 2019, [code]
-
Dual Visual Attention Network for Visual Dialog, IJCAI 2019
-
Making History Matter: History-Advantage Sequence Training for Visual Dialog, ICCV 2019
-
Granular Multimodal Attention Networks for Visual Dialog, ICCV Workshop 2019
-
Multi-step Reasoning via Recurrent Dual Attention for Visual Dialog, ACL 2019
-
Dual Attention Networks for Visual Reference Resolution in Visual Dialog, EMNLP 2019, [code]
-
DMRM: A Dual-channel Multi-hop Reasoning Model for Visual Dialog, AAAI 2020, [code]
-
Modality-Balanced Models for Visual Dialogue, AAAI 2020
-
DualVD: An Adaptive Dual Encoding Model for Deep Visual Understanding in Visual Dialogue, AAAI 2020, [code]
-
Two Causal Principles for Improving Visual Dialog, CVPR 2020, [code]
-
DAM: Deliberation, Abandon and Memory Networks for Generating Detailed and Non-repetitive Responses in Visual Dialogue, IJCAI 2020, [code]
-
KBGN: Knowledge-Bridge Graph Network for Adaptive Vision-Text Reasoning in Visual Dialogue, ACM MM 2020
-
Large-scale Pretraining for Visual Dialog: A Simple State-of-the-Art Baseline, ECCV 2020, [code]
-
Visual Dialog: Light-weight Transformer for Many Inputs, ECCV 2020, [code]
-
Multi-View Attention Network for Visual Dialog, ACL 2020, [code]
-
History for Visual Dialog: Do we really need it?, ACL 2020, [code]
-
VD-BERT: A Unified Vision and Dialog Transformer with BERT, EMNLP 2020, [code]
-
GoG: Graph-over-Graph Network for Visual Dialog, ACL Findings 2021
-
Multimodal Incremental Transformer for Visual Dialogue Generation, ACL Findings 2021
-
Learning to Ground Visual Objects for Visual Dialog, EMNLP Findings 2021
-
VU-BERT: A Unified Framework for Visual Dialog, ICASSP 2022
-
Improving Cross-Modal Understanding in Visual Dialog via Contrastive Learning, ICASSP 2022
-
UTC: A Unified Transformer with Inter-Task Contrastive Learning for Visual Dialog, CVPR 2022
-
VD-PCR: Improving Visual Dialog with Pronoun Coreference Resolution, Pattern Recognition 2022, [code]
-
Unsupervised and Pseudo-Supervised Vision-Language Alignment in Visual Dialog, ACM MM 2022
-
Unified Multimodal Model with Unlikelihood Training for Visual Dialog, ACM MM 2022, [code]
Engaging Image Chat: Modeling Personality in Grounded Dialogue, ACL 2020, [code]
PhotoChat: A Human-Human Dialogue Dataset with Photo Sharing Behavior for Joint Image-Text Modeling, ACL 2021, [code]
MMChat: Multi-Modal Chat Dataset on Social Media, LREC 2022, [code]
Bridging Text and Video: A Universal Multimodal Transformer for Video-Audio Scene-Aware Dialog, AAAI 2020, [code]