A modular framework for vision & language multimodal research from Facebook AI Research (FAIR).
Updated Nov 15, 2024 - Python
[PRL 2024] Code repository for our label-free pruning and retraining technique for autoregressive Text-VQA Transformers (TAP, TAP†).