We implement PointPillars and provide results and checkpoints on the KITTI, nuScenes, Lyft and Waymo datasets. If you use this implementation, please consider citing:
```
@inproceedings{lang2019pointpillars,
  title={Pointpillars: Fast encoders for object detection from point clouds},
  author={Lang, Alex H and Vora, Sourabh and Caesar, Holger and Zhou, Lubing and Yang, Jiong and Beijbom, Oscar},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  pages={12697--12705},
  year={2019}
}
```
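The released checkpoints can be loaded for inference through the usual high-level API. The snippet below is only a minimal sketch: it assumes an MMDetection3D-style interface (`init_model` / `inference_detector`), the config and checkpoint paths are placeholders rather than actual release files, and the exact return format of `inference_detector` depends on the codebase version.

```python
# Minimal inference sketch, assuming an MMDetection3D-style API.
# The config/checkpoint paths below are placeholders, not actual release files.
from mmdet3d.apis import inference_detector, init_model

config_file = 'configs/pointpillars/pointpillars_kitti_car.py'  # placeholder name
checkpoint_file = 'checkpoints/pointpillars_kitti_car.pth'      # placeholder name

# Build the model from the config and load the downloaded checkpoint.
model = init_model(config_file, checkpoint_file, device='cuda:0')

# Run detection on a single point cloud file (KITTI-style .bin).
results = inference_detector(model, 'path/to/points.bin')
```
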
Results on KITTI:

Backbone | Class | Lr schd | Mem (GB) | Inf time (fps) | AP | Download |
---|---|---|---|---|---|---|
SECFPN | Car | cyclic 160e | 5.4 | | 77.1 | model \| log |
SECFPN | 3 Class | cyclic 160e | 5.5 | | 59.5 | model \| log |

Results on nuScenes:

Backbone | Lr schd | Mem (GB) | Inf time (fps) | mAP | NDS | Download |
---|---|---|---|---|---|---|
SECFPN | 2x | 16.4 | | 35.17 | 49.7 | model \| log |
FPN | 2x | 16.4 | | 40.0 | 53.3 | model \| log |

Results on Lyft:

Backbone | Lr schd | Mem (GB) | Inf time (fps) | Private Score | Public Score | Download |
---|---|---|---|---|---|---|
SECFPN | 2x | 12.2 | | 13.9 | 14.1 | model \| log |
FPN | 2x | 9.2 | | 14.9 | 15.1 | model \| log |

Results on Waymo:

Backbone | Load Interval | Class | Lr schd | Mem (GB) | Inf time (fps) | mAP@L1 | mAPH@L1 | mAP@L2 | mAPH@L2 | Download |
---|---|---|---|---|---|---|---|---|---|---|
SECFPN | 5 | Car | 2x | 7.76 | | 70.2 | 69.6 | 62.6 | 62.1 | model \| log |
SECFPN | 5 | 3 Class | 2x | 8.12 | | 64.7 | 57.6 | 58.4 | 52.1 | model \| log |
above @ Car | | | 2x | 8.12 | | 68.5 | 67.9 | 60.1 | 59.6 | |
above @ Pedestrian | | | 2x | 8.12 | | 67.8 | 50.6 | 59.6 | 44.3 | |
above @ Cyclist | | | 2x | 8.12 | | 57.7 | 54.4 | 55.5 | 52.4 | |
SECFPN | 1 | Car | 2x | 7.76 | | 72.1 | 71.5 | 63.6 | 63.1 | log |
SECFPN | 1 | 3 Class | 2x | 8.12 | | 68.8 | 63.3 | 62.6 | 57.6 | log |
above @ Car | | | 2x | 8.12 | | 71.6 | 71.0 | 63.1 | 62.5 | |
above @ Pedestrian | | | 2x | 8.12 | | 70.6 | 56.7 | 62.9 | 50.2 | |
above @ Cyclist | | | 2x | 8.12 | | 64.4 | 62.3 | 61.9 | 59.9 | |

- Metric: For models trained with 3 classes, the average APH@L2 (mAPH@L2) over all categories is reported and used to rank the model. For models trained with only one class, the APH@L2 of that class is reported and used to rank the model. A worked example follows this list.
- Data Split: We provide several baselines on the Waymo dataset, where D5 means that we divide the dataset into 5 folds and use only one fold for efficient experiments. Training on the complete dataset boosts performance considerably, especially for cyclist and pedestrian detection, where an improvement of more than 5 mAP or mAPH can be expected.
- Implementation Details: We basically follow the network architecture in the paper (a stride of 1 for the first convolutional block). Different settings of voxelization, data augmentation and hyperparameters make these baselines outperform those in the paper by about 7 mAP for car and 4 mAP for pedestrian, while using only a subset of the whole dataset. All of these results are achieved without bells and whistles such as model ensembling, multi-scale training or test-time augmentation. See the config sketch after this list.
- License Agreement: To comply with the license agreement of the Waymo dataset, the models pre-trained on Waymo are not released. We still release the training logs as a reference to ease future research.

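To make the metric note concrete, the ranking number for a 3-class model is simply the arithmetic mean of the per-class APH@L2 values. A quick check against the D5 3-class rows in the Waymo table above:

```python
# Worked example for the metric note: mAPH@L2 of a 3-class model is the
# mean of the per-class APH@L2 values. Numbers are the D5 3-class results
# taken from the Waymo table above.
aph_l2 = {'Car': 59.6, 'Pedestrian': 44.3, 'Cyclist': 52.4}

maph_l2 = sum(aph_l2.values()) / len(aph_l2)
print(round(maph_l2, 1))  # 52.1, matching the reported mAPH@L2 of the D5 3-class model
```
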
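The data-split and implementation notes above typically map to two config knobs in an MMDetection3D-style setup: the D5 split is realized through a dataset `load_interval`, and the stride-1 first block appears in the backbone's `layer_strides`. The sketch below is only illustrative; the field names follow the common SECOND/PointPillars config layout, and all values other than the two highlighted ones are assumptions rather than the exact settings of these baselines.

```python
# Illustrative config fragments (MMDetection3D-style). Only `load_interval=5`
# (the D5 split) and the leading 1 in `layer_strides` (stride-1 first block)
# correspond to the notes above; the remaining values are assumptions.
data = dict(
    train=dict(
        type='WaymoDataset',
        load_interval=5,          # "D5": keep every 5th frame of the training split
        # ... other dataset settings ...
    ))

model = dict(
    backbone=dict(
        type='SECOND',
        in_channels=64,
        layer_nums=[3, 5, 5],
        layer_strides=[1, 2, 2],  # stride 1 for the first convolutional block
        out_channels=[64, 128, 256],
    ))
```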