From Recognition to Cognition: Visual Commonsense Reasoning (R2C) https://visualcommonsense.com https://github.com/rowanz/r2c

Regularizing RNNs for Caption Generation by Reconstructing the Past with the Present, CVPR 2018 https://github.com/chenxinpeng/ARNet

Beyond Narrative Description: Generating Poetry from Images by Multi-Adversarial Training https://github.com/bei21/img2poem

Dataset and starter code for the SNLI-VE visual entailment task https://arxiv.org/abs/1811.10582 https://github.com/necla-ml/SNLI-VE

Mapping Images to Scene Graphs with Permutation-Invariant Structured Prediction https://github.com/shikorab/SceneGraph

TensorFlow implementation of "A Structured Self-Attentive Sentence Embedding" https://github.com/flrngel/Self-Attentive-tensorflow
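
The structured self-attention in that paper pools the n LSTM hidden states H into an r-hop annotation matrix A = softmax(W_s2 tanh(W_s1 H^T)) and takes M = AH as the sentence embedding. A minimal PyTorch sketch of that formulation, independent of the linked TensorFlow code (class and dimension names here are illustrative):

```python
import torch
import torch.nn as nn

class StructuredSelfAttention(nn.Module):
    """Structured self-attentive pooling (sketch only).

    h: LSTM hidden states of shape (batch, n, d_h).
    Returns an (r, d_h) matrix sentence embedding per example.
    """
    def __init__(self, d_h: int, d_a: int, r: int):
        super().__init__()
        self.w_s1 = nn.Linear(d_h, d_a, bias=False)  # W_s1: d_a x d_h
        self.w_s2 = nn.Linear(d_a, r, bias=False)    # W_s2: r x d_a

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # scores: (batch, n, r); softmax over the n tokens for each hop
        scores = self.w_s2(torch.tanh(self.w_s1(h)))
        a = torch.softmax(scores, dim=1).transpose(1, 2)  # A: (batch, r, n)
        return a @ h  # M = A H, shape (batch, r, d_h)
```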

Reference code for the paper "Show, Control and Tell: A Framework for Generating Controllable and Grounded Captions" (CVPR 2019) https://github.com/aimagelab/show-control-and-tell

Code for the ICME 2019 Grand Challenge on Short Video Understanding (single model ranked 6th) https://github.com/guoday/ICME2019-CTR

Transformer-based image captioning, implemented as a PyTorch/Fairseq extension https://github.com/krasserm/fairseq-image-captioning

Code for Neural Inverse Knitting: From Images to Manufacturing Instructions https://github.com/xionluhnis/neural_inverse_knitting

Code for the paper "Attention on Attention for Image Captioning", ICCV 2019 https://arxiv.org/abs/1908.06954 https://github.com/husthuaan/AoANet
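
The AoA module itself is compact: given a query q and the attended vector v̂ from any attention head, it forms an information vector and a sigmoid gate from their concatenation and multiplies the two, filtering out irrelevant attention results. A hedged PyTorch sketch of that formulation (layer names are illustrative, not taken from the linked repo):

```python
import torch
import torch.nn as nn

class AttentionOnAttention(nn.Module):
    """AoA(q, v_hat) = sigmoid(W_g [q; v_hat] + b_g) * (W_i [q; v_hat] + b_i)."""
    def __init__(self, d: int):
        super().__init__()
        self.info = nn.Linear(2 * d, d)  # information vector i
        self.gate = nn.Linear(2 * d, d)  # attention gate g

    def forward(self, q: torch.Tensor, v_hat: torch.Tensor) -> torch.Tensor:
        x = torch.cat([q, v_hat], dim=-1)
        return torch.sigmoid(self.gate(x)) * self.info(x)
```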

Learning to Evaluate Image Captioning. CVPR 2018 https://github.com/richardaecn/cvpr18-caption-eval

Unified Vision-Language Pre-Training for Image Captioning and VQA (VLP) https://github.com/LuoweiZhou/VLP

Official TensorFlow implementation of "Bidirectional Attentive Fusion with Context Gating for Dense Video Captioning" (CVPR 2018), with code, model, and prediction results https://github.com/JaywongWang/DenseVideoCaptioning

A PyTorch implementation of the Transformer from "Attention Is All You Need" https://arxiv.org/abs/1706.03762 https://github.com/dreamgonfly/Transformer-pytorch
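
For orientation, the core operation of that paper, scaled dot-product attention softmax(QK^T / sqrt(d_k)) V, reduces to a few lines of PyTorch (a minimal sketch, not code from the linked repo):

```python
import math
import torch

def scaled_dot_product_attention(q, k, v, mask=None):
    """softmax(Q K^T / sqrt(d_k)) V with optional masking.

    q, k, v: (..., seq_len, d_k) tensors; positions where mask == 0
    are excluded from attention.
    """
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    return torch.softmax(scores, dim=-1) @ v
```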

ImageBERT: Cross-modal Pre-training with Large-scale Weak-supervised Image-Text Data https://www.arxiv-vanity.com/papers/2001.07966/

Image caption generation with BERT: 'Image Captioning System - BERT + Image Captioning' https://github.com/ajamjoom/Image-Captions

M^2: Meshed-Memory Transformer for Image Captioning https://github.com/aimagelab/meshed-memory-transformer
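
The encoder's key idea is memory-augmented attention: learned memory slots are concatenated to the projected keys and values so the model can attend to a priori knowledge not present in the input regions. A single-head sketch under that formulation (the paper uses multi-head attention and connects layers in a mesh; the slot count and names here are assumptions):

```python
import torch
import torch.nn as nn

class MemoryAugmentedAttention(nn.Module):
    """Self-attention over inputs plus learned memory keys/values (sketch)."""
    def __init__(self, d: int, n_mem: int):
        super().__init__()
        self.q_proj = nn.Linear(d, d)
        self.k_proj = nn.Linear(d, d)
        self.v_proj = nn.Linear(d, d)
        # learned memory slots appended to keys and values
        self.mem_k = nn.Parameter(torch.randn(n_mem, d) / d ** 0.5)
        self.mem_v = nn.Parameter(torch.randn(n_mem, d) / d ** 0.5)

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (batch, n, d)
        b = x.size(0)
        q = self.q_proj(x)
        k = torch.cat([self.k_proj(x), self.mem_k.expand(b, -1, -1)], dim=1)
        v = torch.cat([self.v_proj(x), self.mem_v.expand(b, -1, -1)], dim=1)
        attn = torch.softmax(q @ k.transpose(-2, -1) / q.size(-1) ** 0.5, dim=-1)
        return attn @ v  # (batch, n, d)
```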

Grounded Video Description: video grounding and captioning https://github.com/facebookresearch/grounded-video-description

Reformer: The Efficient Transformer https://github.com/google/trax/tree/master/trax/models/reformer
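
Reformer's LSH attention restricts each query to keys in the same hash bucket; the bucketing step uses the angular LSH scheme h(x) = argmax([xR; -xR]) for a random projection R. A minimal, unbatched illustration of that hash (not the trax API):

```python
import torch

def lsh_bucket(x: torch.Tensor, n_buckets: int, seed: int = 0) -> torch.Tensor:
    """Assign each vector in x (..., d) a bucket id in [0, n_buckets).

    Nearby vectors land in the same bucket with high probability, so
    attention can be computed within buckets instead of over all pairs.
    """
    assert n_buckets % 2 == 0
    g = torch.Generator().manual_seed(seed)
    r = torch.randn(x.size(-1), n_buckets // 2, generator=g)  # projection R
    xr = x @ r
    return torch.argmax(torch.cat([xr, -xr], dim=-1), dim=-1)
```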

VATEX: a Chinese-English video captioning challenge at an ICCV workshop http://vatex.org/main/index.html

Cooperative Vision-and-Dialog Navigation https://github.com/mmurray/cvdn

Auto-Encoding Scene Graphs for Image Captioning, CVPR 2019 https://github.com/yangxuntu/SGAE

A PyTorch implementation of the Transformer model from "Attention Is All You Need". https://github.com/phohenecker/pytorch-transformer