🔥Official implementation of "Point Long-Term Locality-Aware Transformer for Point Cloud Video Understanding" (submitted to IEEE Transactions on Circuits and Systems for Video Technology (TCSVT))
🔥Point cloud videos are widely used in real-world applications to understand 3D dynamic objects and scenes. However, effectively embedding inter-frame motion remains a significant challenge. Another crucial challenge lies in capturing the long-term dependencies within local regions, an important factor for the efficacy of the neural model that remains largely under-explored. In this paper, we propose an effective Point Long-term Locality-aware Transformer network, termed PL2-Transformer, to meet these challenges. First, the Point 4D Convolution (4DConv) is harnessed as the 4D backbone to aggregate short-term spatio-temporal local information. Second, to enhance the understanding of motion dynamics, we introduce an inter-frame motion embedding, which captures the motion between frames and provides reliable motion cues for the subsequent Transformer network. Finally, we propose an effective Long-Term Locality-Aware Transformer (LLT), which utilizes a novel Long-Term Locality-Aware Attention (LLA) mechanism to capture long-term dependencies within local regions across the entire point cloud video. Extensive experiments on multiple benchmarks demonstrate the effectiveness of our approach: it surpasses current state-of-the-art (SOTA) methods, or matches them while using fewer parameters. Source code will be made publicly available.
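For orientation, below is a minimal sketch of how the three components described above could compose into one forward pass. Everything in it is an illustrative assumption rather than the released implementation: the linear layers standing in for the 4DConv backbone and the motion embedding, and the vanilla Transformer encoder standing in for the LLA attention, which in the paper restricts attention to local regions tracked across the whole video.

```python
import torch
import torch.nn as nn

class PL2TransformerSketch(nn.Module):
    """Illustrative sketch only: each module is a simplified stand-in
    for the corresponding component described in the abstract."""
    def __init__(self, dim=128, num_heads=8, depth=4, num_classes=20):
        super().__init__()
        # Stand-in for the Point 4D Convolution (4DConv) backbone, which
        # aggregates short-term spatio-temporal local features.
        self.backbone = nn.Linear(3, dim)
        # Stand-in for the inter-frame motion embedding: embeds per-point
        # displacement between consecutive frames as a motion cue.
        self.motion_embed = nn.Linear(3, dim)
        # Stand-in for the LLT: a plain Transformer encoder over all
        # (frame, point) tokens instead of the locality-aware LLA attention.
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=num_heads)
        self.llt = nn.TransformerEncoder(layer, num_layers=depth)
        self.head = nn.Linear(dim, num_classes)

    def forward(self, xyz):
        # xyz: (B, T, N, 3) point cloud video
        B, T, N, _ = xyz.shape
        feat = self.backbone(xyz)                        # (B, T, N, dim)
        disp = xyz[:, 1:] - xyz[:, :-1]                  # (B, T-1, N, 3)
        disp = torch.cat([torch.zeros_like(disp[:, :1]), disp], dim=1)
        tokens = feat + self.motion_embed(disp)          # inject motion cues
        tokens = tokens.reshape(B, T * N, -1).transpose(0, 1)  # (S, B, dim)
        tokens = self.llt(tokens)                        # long-term dependencies
        return self.head(tokens.mean(dim=0))             # video-level logits

model = PL2TransformerSketch()
print(model(torch.randn(2, 8, 64, 3)).shape)  # torch.Size([2, 20])
```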
The code has been tested with Red Hat Enterprise Linux Workstation release 7.7 (Maipo), g++ (GCC) 8.3.1, PyTorch v1.8.1, CUDA 10.2, and cuDNN v7.6.
Device: 2 × RTX 2080Ti (22 GB in total)
Compile the CUDA layers for PointNet++, which we use for farthest point sampling (FPS) and radius-based neighbor search:
```
cd modules
python setup.py install
```
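Once the extension builds, a quick smoke test such as the following can verify the compiled ops. The import path and signatures here follow the referenced PointNet++ implementation (`furthest_point_sample(xyz, npoint)` and `ball_query(radius, nsample, xyz, new_xyz)`); adjust them if this repo installs the ops under a different module name.

```python
import torch
# Function names/signatures assumed from the referenced PointNet++ repo;
# adjust the import if this repo exposes the ops under another module name.
from pointnet2_utils import furthest_point_sample, ball_query

xyz = torch.rand(2, 1024, 3).cuda()            # (B, N, 3) point coordinates
fps_idx = furthest_point_sample(xyz, 128)      # (B, 128) sampled anchor indices
anchors = torch.gather(xyz, 1, fps_idx.long().unsqueeze(-1).expand(-1, -1, 3))
group_idx = ball_query(0.1, 32, xyz, anchors)  # (B, 128, 32) neighbors within r=0.1
print(fps_idx.shape, group_idx.shape)
```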
Some core code (for NTU RGB+D, Synthia 4D, and NVGesture) will be released after the paper is accepted.
🌱The MSR-Action3D dataset contains 20 action classes and about 23K frames in total. Thanks to the authors of MeteorNet for providing the data preprocessing code. A hedged loading sketch follows this dataset list.
(about 800 MB)
🌱The NTU RGB+D 60 dataset contains 60 action classes and about 4M frames in total. Thanks to the authors of PSTNet for providing the data preprocessing code.
(about 800 GB)
🌱The Synthia 4D dataset is a synthetic dataset for outdoor autonomous driving scenes. Thanks to the authors of P4Transformer for providing the data preprocessing code.
(about 5 GB)
🌱The NVGesture dataset. Thanks to the authors of MaST-Pre for providing the data preprocessing code.
(about 10 GB)
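For reference, here is a hedged sketch of turning one preprocessed video into a fixed-size clip tensor. The on-disk format assumed below (one `.npy` object array per video, each entry a `(num_points, 3)` frame) is an assumption based on common practice in the cited preprocessing pipelines and may differ from this repo's actual layout.

```python
import numpy as np

def load_clip(video_path, clip_len=24, num_points=2048):
    # Assumed format: a .npy object array whose entries are (P_i, 3) frames.
    frames = np.load(video_path, allow_pickle=True)
    start = np.random.randint(0, max(1, len(frames) - clip_len + 1))
    clip = []
    for frame in frames[start:start + clip_len]:
        # Re-sample every frame to a fixed number of points
        idx = np.random.choice(frame.shape[0], num_points,
                               replace=frame.shape[0] < num_points)
        clip.append(frame[idx])
    while len(clip) < clip_len:   # pad short videos by repeating the last frame
        clip.append(clip[-1])
    return np.stack(clip)         # (clip_len, num_points, 3)
```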
🤗Let's train the model!
```
python train-msr-meduim.py
python train-msr-full.py
```
📢The training logs have been uploaded!
- PointNet++ PyTorch implementation: https://github.com/facebookresearch/votenet/tree/master/pointnet2
- Transformer: https://github.com/lucidrains/vit-pytorch
- P4Transformer: https://github.com/hehefan/P4Transformer
- PST-Transformer: https://github.com/hehefan/PST-Transformer
💡We thank the authors of P4Transformer and PST-Transformer for their interesting work.