PointPillars: Fast Encoders for Object Detection from Point Clouds

Introduction

We implement PointPillars and provide the results and checkpoints on KITTI, nuScenes, Lyft and Waymo datasets.

@inproceedings{lang2019pointpillars,
  title={Pointpillars: Fast encoders for object detection from point clouds},
  author={Lang, Alex H and Vora, Sourabh and Caesar, Holger and Zhou, Lubing and Yang, Jiong and Beijbom, Oscar},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  pages={12697--12705},
  year={2019}
}

Results

KITTI

| Backbone | Class | Lr schd | Mem (GB) | Inf time (fps) | AP | Download |
| --- | --- | --- | --- | --- | --- | --- |
| SECFPN | Car | cyclic 160e | 5.4 | | 77.1 | model \| log |
| SECFPN | 3 Class | cyclic 160e | 5.5 | | 59.5 | model \| log |

nuScenes

| Backbone | Lr schd | Mem (GB) | Inf time (fps) | mAP | NDS | Download |
| --- | --- | --- | --- | --- | --- | --- |
| SECFPN | 2x | 16.4 | | 35.17 | 49.7 | model \| log |
| FPN | 2x | 16.4 | | 40.0 | 53.3 | model \| log |

Lyft

| Backbone | Lr schd | Mem (GB) | Inf time (fps) | Private Score | Public Score | Download |
| --- | --- | --- | --- | --- | --- | --- |
| SECFPN | 2x | 12.2 | | 13.9 | 14.1 | model \| log |
| FPN | 2x | 9.2 | | 14.9 | 15.1 | model \| log |

Waymo

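| Backbone | Load Interval | Class | Lr schd | Mem (GB) | Inf time (fps) | mAP@L1 | mAPH@L1 | mAP@L2 | mAPH@L2 | Download |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| SECFPN | 5 | Car | 2x | 7.76 | | 70.2 | 69.6 | 62.6 | 62.1 | model \| log |
| SECFPN | 5 | 3 Class | 2x | 8.12 | | 64.7 | 57.6 | 58.4 | 52.1 | model \| log |
| above @ Car | | | 2x | 8.12 | | 68.5 | 67.9 | 60.1 | 59.6 | |
| above @ Pedestrian | | | 2x | 8.12 | | 67.8 | 50.6 | 59.6 | 44.3 | |
| above @ Cyclist | | | 2x | 8.12 | | 57.7 | 54.4 | 55.5 | 52.4 | |
| SECFPN | 1 | Car | 2x | 7.76 | | 72.1 | 71.5 | 63.6 | 63.1 | log |
| SECFPN | 1 | 3 Class | 2x | 8.12 | | 68.8 | 63.3 | 62.6 | 57.6 | log |
| above @ Car | | | 2x | 8.12 | | 71.6 | 71.0 | 63.1 | 62.5 | |
| above @ Pedestrian | | | 2x | 8.12 | | 70.6 | 56.7 | 62.9 | 50.2 | |
| above @ Cyclist | | | 2x | 8.12 | | 64.4 | 62.3 | 61.9 | 59.9 | |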
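
As a quick sanity check on how the 3 Class rows relate to the per-class rows, the sketch below recomputes mAPH@L2 for the load interval 5 model as the plain mean of the three per-class APH@L2 values. The function name `mean_over_classes` is hypothetical and the inputs are the already-rounded table values, so this only approximates whatever the official evaluation computes internally.

```python
# Per-class APH@L2 values copied from the "above @ ..." rows of the
# 3 Class, load interval 5 model in the Waymo table.
per_class_aph_l2 = {"Car": 59.6, "Pedestrian": 44.3, "Cyclist": 52.4}


def mean_over_classes(per_class):
    """Average a per-class metric and round to one decimal, as in the table.

    Hypothetical helper for illustration; not part of any library.
    """
    return round(sum(per_class.values()) / len(per_class), 1)


maph_l2 = mean_over_classes(per_class_aph_l2)
print(maph_l2)  # 52.1, matching the mAPH@L2 column of the 3 Class row
```

Because the inputs are rounded, the recomputed mean can differ from the reported one in the last digit: the same calculation on the load interval 1 rows gives 57.5 against the reported 57.6.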

Note:

  • Metric: For models trained with 3 classes, the average APH@L2 over all categories (mAPH@L2) is reported and used to rank the model. For models trained with only 1 class, the APH@L2 of that class is reported and used to rank the model.
  • Data Split: We provide several baselines for the Waymo dataset. D5 (load interval 5 in the table) means that we divide the dataset into 5 folds and use only one fold for efficient experiments. Using the complete dataset can boost performance considerably, especially for cyclist and pedestrian detection, where an improvement of more than 5 mAP or mAPH can be expected.
  • Implementation Details: We largely follow the network architecture described in the paper (with a stride of 1 for the first convolutional block). Different settings for voxelization, data augmentation and hyperparameters let these baselines outperform those in the paper by about 7 mAP for car and 4 mAP for pedestrian, using only a subset of the whole dataset. All of these results are achieved without bells and whistles such as ensembling, multi-scale training or test-time augmentation.
  • License Agreement: To comply with the license agreement of the Waymo dataset, the models pre-trained on Waymo are not released. We still release the training logs as a reference for future research.
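
The data split note can be pictured with a short sketch. It assumes that a load interval of 5 means keeping every fifth frame of the training list, which is one way to select a single fold out of five interleaved folds; the function name `subsample` and the toy frame list are hypothetical and are not taken from the actual data loader.

```python
def subsample(frames, load_interval):
    """Keep every `load_interval`-th frame, starting from the first.

    Assumption for illustration: the split is a strided selection over
    the frame list. The real loader may choose its fold differently.
    """
    return frames[::load_interval]


# Toy stand-in for a list of training frames.
frames = list(range(20))
subset = subsample(frames, 5)
print(subset)       # [0, 5, 10, 15]
print(len(subset))  # 4, one fifth of the 20 frames
```

With a load interval of 1 the slice returns the full list, which corresponds to the rows trained on the complete dataset.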