Takeru Miyato · Bernhard Jaeger · Max Welling · Andreas Geiger
Official code for reproducing our ICLR 2024 work: "GTA: A Geometry-Aware Attention Mechanism for Multi-view Transformers", a simple way to make your multi-view transformer more expressive!
(3/15/2024): The GTA mechanism is also effective for image generation, a purely 2D task. You can find the experimental details in our camera-ready paper and the implementation in this branch.
This repository contains several codebases, each of which can be accessed by switching to the corresponding branch:
- NVS experiments on CLEVR-TR and MSN-Hard (this branch)
- NVS experiments on ACID and RealEstate10k (link)
- ImageNet generation with Diffusion Transformers (DiT) (link)
You can find the GTA code for multi-view ViTs here and for image ViTs here.
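To give a flavor of the mechanism, below is a minimal sketch of geometric-transform attention in PyTorch. This is an illustration, not the repository's implementation: the function `gta_attention`, the tensor shapes, and the assumption that each token's representation matrix `rho` is orthogonal (so its inverse is its transpose) are simplifications; the actual code uses multi-head attention and structured block-diagonal representations.

```python
import torch
import torch.nn.functional as F

def gta_attention(q, k, v, rho):
    # q, k, v: (B, N, D) token features; rho: (B, N, D, D) orthogonal
    # representation matrices of each token's geometric transformation.
    # The attention logit between tokens i and j is q_i^T rho_i rho_j^{-1} k_j,
    # computed cheaply by moving each factor onto its own token:
    #   q_i <- rho_i^T q_i,  k_j <- rho_j^T k_j,  v_j <- rho_j^T v_j.
    rho_t = rho.transpose(-1, -2)
    qt = torch.einsum('bnij,bnj->bni', rho_t, q)
    kt = torch.einsum('bnij,bnj->bni', rho_t, k)
    vt = torch.einsum('bnij,bnj->bni', rho_t, v)
    attn = F.softmax(qt @ kt.transpose(-1, -2) / q.shape[-1] ** 0.5, dim=-1)
    out = attn @ vt                       # aggregate values in the shared frame
    return torch.einsum('bnij,bnj->bni', rho, out)  # back to token i's frame
```

If `rho` is the identity everywhere, this reduces exactly to standard scaled dot-product attention; geometry enters only through the representation matrices.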
Please feel free to reach out to us if you have any questions!
Create a conda environment and install the dependencies:
conda create -n gta python=3.9
conda activate gta
pip3 install -r requirements.txt
Set a directory for the datasets:
export DATADIR=<path_to_datadir>
mkdir -p $DATADIR
For CLEVR-TR, download the dataset from this link and place it under $DATADIR. For MSN-Hard, download the dataset with gsutil:
gsutil -m cp -r gs://kubric-public/tfds/kubric_frames/multi_shapenet_conditional/2.8.0/ ${DATADIR}/multi_shapenet_frames/
Train a model on a single node with 4 GPUs:
torchrun --standalone --nnodes 1 --nproc_per_node 4 train.py runs/clevrtr/GTA/gta/config.yaml ${DATADIR}/clevrtr --seed=0 # CLEVR-TR
torchrun --standalone --nnodes 1 --nproc_per_node 4 train.py runs/msn/GTA/gta_so3/config.yaml ${DATADIR} --seed=0 # MSN-Hard
Evaluate a trained checkpoint:
python evaluate.py runs/clevrtr/GTA/gta/config.yaml ${DATADIR}/clevrtr $PATH_TO_CHECKPOINT # CLEVR-TR
python evaluate.py runs/msn/GTA/gta_so3/config.yaml ${DATADIR} $PATH_TO_CHECKPOINT # MSN-Hard
This repository is built on top of SRT and OSRT, created by @stelzner. We would like to thank him for open-sourcing the SRT models. We also thank @lucidrains for providing the precomputed values of the J matrices, which are needed to compute the irreducible representations (irreps) of SO(3) efficiently.
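As background on those J matrices: they let the degree-l irreps of SO(3) (Wigner-D matrices) be built from inexpensive z-axis rotations, since conjugation by J turns a z-rotation into a y-rotation. Below is a hedged sketch of this standard construction (following e3nn-style code, not necessarily this repository's exact API); `Jd` is assumed to be the list of precomputed constant (2l+1, 2l+1) matrices mentioned above, and angles are assumed to be 0-dim tensors.

```python
import torch

def z_rot_mat(angle, l):
    # Rotation about the z-axis in the degree-l irrep basis: a sparse
    # matrix of sin/cos entries at frequencies l, l-1, ..., -l.
    M = angle.new_zeros((2 * l + 1, 2 * l + 1))
    inds = torch.arange(0, 2 * l + 1)
    freqs = torch.arange(l, -l - 1, -1, dtype=angle.dtype)
    M[inds, torch.flip(inds, dims=[0])] = torch.sin(freqs * angle)
    M[inds, inds] = torch.cos(freqs * angle)
    return M

def wigner_D(l, alpha, beta, gamma, Jd):
    # Degree-l Wigner-D matrix for ZYZ Euler angles (alpha, beta, gamma).
    # J @ z_rot_mat(beta, l) @ J realizes the middle y-rotation, so only
    # z-rotations and the fixed matrix Jd[l] are ever needed.
    J = Jd[l].to(dtype=alpha.dtype)
    return z_rot_mat(alpha, l) @ J @ z_rot_mat(beta, l) @ J @ z_rot_mat(gamma, l)
```

Stacking several degrees block-diagonally yields a rotation representation of the kind the gta_so3 config above presumably applies on MSN-Hard.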
@inproceedings{Miyato2024GTA,
title={GTA: A Geometry-Aware Attention Mechanism for Multi-View Transformers},
author={Miyato, Takeru and Jaeger, Bernhard and Welling, Max and Geiger, Andreas},
booktitle={International Conference on Learning Representations (ICLR)},
year={2024}
}