BEVFusion in CARLA


About

This is a fork of BEVFusion that can be trained and evaluated on data from the CARLA Simulator generated by SimBEV.

Installation

Without Docker

BEVFusion requires the following libraries:

After installing these dependencies, run

python setup.py develop

to install the codebase.

With Docker

  1. Install Docker on your system.
  2. Install the Nvidia Container Toolkit. It exposes your Nvidia graphics card to Docker containers.
  3. Install the Nvidia Container Runtime and set it as the default runtime.
  4. In the docker folder, run
docker build --no-cache --rm -t bevfusion:develop .

You may need to replace libnvidia-gl-550 and libnvidia-common-550 packages in the Dockerfile with ones that are compatible with your Nvidia driver version.
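One way to do this is to query the driver with nvidia-smi and rewrite the pinned package names in place. This is only a sketch: it assumes the package names appear verbatim in docker/Dockerfile and that your distribution follows Ubuntu's libnvidia-* naming.

```shell
# Patch docker/Dockerfile so the pinned userspace packages match the host driver.
if command -v nvidia-smi >/dev/null 2>&1; then
    # e.g. driver version "535.161.08" -> major version "535"
    driver_major=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n1 | cut -d. -f1)
    sed -i "s/libnvidia-gl-550/libnvidia-gl-${driver_major}/g; s/libnvidia-common-550/libnvidia-common-${driver_major}/g" docker/Dockerfile
fi
```

The guard makes the snippet a no-op on machines without an Nvidia driver installed.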

The following build arguments (ARG) are available:

  • USER: username inside each container, set to bf by default.

Launch a container by running

docker run --privileged --gpus all --network=host -e DISPLAY=$DISPLAY \
    -v [path/to/BEVFusion]:/home/bevfusion \
    -v [path/to/dataset]:/dataset \
    --shm-size 32g -it bevfusion:develop /bin/bash

Then, in /home/bevfusion, run

python setup.py develop

to install the codebase.

Usage

Training

For camera-only 3D object detection, run

torchpack dist-run -np 8 python tools/train.py configs/simbev/det/centerhead/lssfpn/camera/256x704/swint/default.yaml --model.encoders.camera.backbone.init_cfg.checkpoint pretrained/swint-nuimages-pretrained.pth

For camera-only BEV segmentation, run

torchpack dist-run -np 8 python tools/train.py configs/simbev/seg/camera-bev256d2.yaml --model.encoders.camera.backbone.init_cfg.checkpoint pretrained/swint-nuimages-pretrained.pth

For lidar-only 3D object detection, run

torchpack dist-run -np 8 python tools/train.py configs/simbev/det/transfusion/secfpn/lidar/voxelnet_0p075.yaml

For lidar-only BEV segmentation, run

torchpack dist-run -np 8 python tools/train.py configs/simbev/seg/lidar-centerpoint-bev128.yaml

For BEVFusion 3D object detection, run

torchpack dist-run -np 8 python tools/train.py configs/simbev/det/transfusion/secfpn/camera+lidar/swint_v0p075/convfuser.yaml --model.encoders.camera.backbone.init_cfg.checkpoint pretrained/swint-nuimages-pretrained.pth --load_from pretrained/simbev-lidar-only-det.pth 

For BEVFusion BEV segmentation, run

torchpack dist-run -np 8 python tools/train.py configs/simbev/seg/fusion-bev256d2-lss.yaml --model.encoders.camera.backbone.init_cfg.checkpoint pretrained/swint-nuimages-pretrained.pth

Replace 8 in -np 8 with the number of GPUs available on your machine. You can adjust the samples_per_gpu and workers_per_gpu values in configs/simbev/default.yaml to match your available GPU memory and number of CPU cores.

If you want to evaluate on/visualize the test set instead of the val set, change data.test.ann_file in configs/simbev/default.yaml to ${dataset_root + "infos/simbev_infos_test.json"}.
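For reference, the relevant fragment of configs/simbev/default.yaml would look something like this (a sketch: the surrounding keys and the default numeric values shown are assumptions, only the ann_file expression is taken from the instructions above):

```yaml
data:
  samples_per_gpu: 4   # lower this if you run out of GPU memory (assumed default)
  workers_per_gpu: 4   # raise or lower to match available CPU cores (assumed default)
  test:
    # Point at the test split instead of the val split:
    ann_file: ${dataset_root + "infos/simbev_infos_test.json"}
```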

Evaluation

For evaluation, run

torchpack dist-run -np [number of gpus] python tools/test.py [config file path] [checkpoint name] --eval [evaluation type]

For example, to evaluate the camera-only 3D object detection model, run

torchpack dist-run -np 8 python tools/test.py configs/simbev/det/centerhead/lssfpn/camera/256x704/swint/default.yaml pretrained/simbev-camera-only-det.pth --eval bbox

Or, to evaluate the BEVFusion BEV segmentation model, run

torchpack dist-run -np 8 python tools/test.py configs/simbev/seg/fusion-bev256d2-lss.yaml pretrained/simbev-bevfusion-seg.pth --eval map

Visualization

For visualization, run

torchpack dist-run -np 8 python tools/visualize.py [config file path] --mode [mode] --checkpoint [checkpoint name] --split [data split] --out-dir [output directory path]

mode can be gt-simbev (to visualize the ground truth) or pred-simbev (to visualize model predictions). For example, to visualize the lidar-only 3D object detection model predictions, run

torchpack dist-run -np 8 python tools/visualize.py configs/simbev/det/transfusion/secfpn/lidar/voxelnet_0p075.yaml --mode pred-simbev --checkpoint pretrained/simbev-lidar-only-det.pth --split test --bbox-score 0.1 --out-dir 'viz/lidar-only-det'

Or, to visualize the BEVFusion BEV segmentation model predictions, run

torchpack dist-run -np 8 python tools/visualize.py configs/simbev/seg/fusion-bev256d2-lss.yaml --mode pred-simbev --checkpoint pretrained/simbev-bevfusion-seg.pth --split test --map-score 0.5 --out-dir 'viz/bevfusion-seg'

Results

3D Object Detection

Camera-only

| Class | AP (%) | ATE (m) | AOE (rad) | ASE | AVE (m/s) |
|---|---|---|---|---|---|
| Car | 23.3 | 0.824 | 0.896 | 0.217 | 4.95 |
| Truck | 20.4 | 0.751 | 0.695 | 0.148 | 5.55 |
| Bus | 18.7 | 0.829 | 1.185 | 0.022 | 5.54 |
| Motorcycle | 26.5 | 0.604 | 0.841 | 0.140 | 6.64 |
| Bicycle | 25.1 | 0.574 | 1.117 | 0.219 | 4.12 |
| Pedestrian | 18.9 | 0.883 | 1.529 | 0.073 | 1.10 |
| Mean | 22.1 | 0.744 | 1.044 | 0.137 | 4.65 |

SDS: 25.1% / Checkpoint

Lidar-only

| Class | AP (%) | ATE (m) | AOE (rad) | ASE | AVE (m/s) |
|---|---|---|---|---|---|
| Car | 46.1 | 0.165 | 0.109 | 0.127 | 1.44 |
| Truck | 46.3 | 0.162 | 0.045 | 0.110 | 1.75 |
| Bus | 34.1 | 0.169 | 0.049 | 0.072 | 2.40 |
| Motorcycle | 51.9 | 0.114 | 0.118 | 0.159 | 1.79 |
| Bicycle | 55.5 | 0.115 | 0.087 | 0.213 | 1.53 |
| Pedestrian | 54.5 | 0.141 | 0.392 | 0.120 | 0.47 |
| Mean | 48.1 | 0.144 | 0.133 | 0.134 | 1.56 |

SDS: 56.4% / Checkpoint

Camera-Lidar

| Class | AP (%) | ATE (m) | AOE (rad) | ASE | AVE (m/s) |
|---|---|---|---|---|---|
| Car | 46.5 | 0.162 | 0.106 | 0.125 | 1.40 |
| Truck | 46.2 | 0.168 | 0.049 | 0.106 | 1.74 |
| Bus | 34.3 | 0.176 | 0.040 | 0.063 | 2.44 |
| Motorcycle | 51.6 | 0.113 | 0.105 | 0.153 | 1.65 |
| Bicycle | 55.3 | 0.115 | 0.073 | 0.207 | 1.52 |
| Pedestrian | 54.8 | 0.141 | 0.362 | 0.109 | 0.47 |
| Mean | 48.1 | 0.146 | 0.122 | 0.127 | 1.54 |

SDS: 56.6% / Checkpoint

BEV Segmentation

Results are provided for different IoU thresholds.
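Per-class IoU here is the intersection-over-union between a thresholded predicted BEV mask and the ground-truth mask. A minimal pure-Python sketch, assuming each column is a binarization threshold applied to the predicted score map (consistent with the --map-score flag used for visualization); the actual evaluation code operates on tensors and may differ in details such as how empty unions are handled:

```python
def bev_iou(scores, gt, threshold):
    """IoU between a thresholded BEV score map and a binary ground-truth mask.

    scores: 2D list of per-cell class scores in [0, 1].
    gt: 2D list of 0/1 ground-truth labels, same shape as scores.
    threshold: binarization threshold applied to scores.
    """
    intersection = union = 0
    for score_row, gt_row in zip(scores, gt):
        for score, label in zip(score_row, gt_row):
            pred = score >= threshold  # binarize the prediction
            intersection += int(pred and label)
            union += int(pred or label)
    return intersection / union if union else 0.0
```

Sweeping the threshold from 0.1 to 0.9 for one class yields one row of the tables below.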

Camera-only

| Class | 0.1 | 0.2 | 0.3 | 0.4 | 0.5 | 0.6 | 0.7 | 0.8 | 0.9 |
|---|---|---|---|---|---|---|---|---|---|
| road | 59.5 | 67.1 | 71.5 | 74.5 | 76.0 | 75.2 | 72.6 | 68.9 | 62.3 |
| car | 3.5 | 8.0 | 18.8 | 22.4 | 17.2 | 11.3 | 9.7 | 8.7 | 6.3 |
| truck | 2.1 | 6.7 | 11.7 | 9.8 | 5.1 | 2.1 | 0.4 | 0.0 | 0.0 |
| bus | 2.1 | 9.0 | 19.9 | 24.6 | 22.9 | 16.8 | 10.3 | 6.0 | 1.1 |
| motorcycle | 0.3 | 0.6 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| bicycle | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| rider | 0.3 | 0.1 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| pedestrian | 0.2 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| mIoU | 8.5 | 11.4 | 15.2 | 16.4 | 15.2 | 13.2 | 11.6 | 10.5 | 8.7 |

Checkpoint

Lidar-only

| Class | 0.1 | 0.2 | 0.3 | 0.4 | 0.5 | 0.6 | 0.7 | 0.8 | 0.9 |
|---|---|---|---|---|---|---|---|---|---|
| road | 48.6 | 55.9 | 66.1 | 85.1 | 87.7 | 87.2 | 84.9 | 81.1 | 74.6 |
| car | 5.3 | 37.8 | 56.5 | 67.1 | 70.6 | 63.6 | 51.8 | 37.6 | 18.8 |
| truck | 11.4 | 44.7 | 61.2 | 70.6 | 73.5 | 67.4 | 55.2 | 39.5 | 16.3 |
| bus | 19.1 | 56.9 | 72.0 | 79.7 | 81.5 | 78.1 | 69.7 | 59.4 | 44.1 |
| motorcycle | 4.8 | 13.7 | 23.6 | 32.7 | 32.5 | 15.8 | 0.8 | 0.0 | 0.0 |
| bicycle | 1.8 | 5.1 | 10.0 | 13.3 | 3.6 | 0.0 | 0.0 | 0.0 | 0.0 |
| rider | 4.6 | 11.7 | 20.8 | 30.4 | 18.4 | 0.5 | 0.0 | 0.0 | 0.0 |
| pedestrian | 3.1 | 9.6 | 17.6 | 28.4 | 18.9 | 0.1 | 0.0 | 0.0 | 0.0 |
| mIoU | 12.4 | 29.4 | 41.0 | 50.9 | 48.3 | 39.1 | 32.8 | 27.2 | 19.2 |

Checkpoint

Camera-Lidar

| Class | 0.1 | 0.2 | 0.3 | 0.4 | 0.5 | 0.6 | 0.7 | 0.8 | 0.9 |
|---|---|---|---|---|---|---|---|---|---|
| road | 59.7 | 72.0 | 80.0 | 85.5 | 88.4 | 88.1 | 85.9 | 82.4 | 76.3 |
| car | 11.7 | 39.4 | 58.6 | 69.4 | 72.7 | 65.5 | 54.0 | 40.1 | 20.5 |
| truck | 12.3 | 47.0 | 61.4 | 70.9 | 74.5 | 69.2 | 57.6 | 43.2 | 20.6 |
| bus | 19.7 | 56.8 | 70.3 | 78.2 | 80.0 | 77.2 | 68.9 | 59.0 | 44.1 |
| motorcycle | 5.1 | 13.5 | 23.6 | 34.6 | 36.3 | 18.3 | 1.5 | 0.0 | 0.0 |
| bicycle | 1.9 | 5.6 | 11.1 | 12.7 | 3.6 | 0.0 | 0.0 | 0.0 | 0.0 |
| rider | 4.8 | 11.9 | 21.0 | 31.0 | 23.3 | 1.2 | 0.0 | 0.0 | 0.0 |
| pedestrian | 3.1 | 9.9 | 18.3 | 28.7 | 20.2 | 0.2 | 0.0 | 0.0 | 0.0 |
| mIoU | 14.8 | 32.0 | 43.0 | 51.3 | 50.0 | 40.0 | 33.5 | 28.1 | 20.2 |

Checkpoint

About

Implementation of BEVFusion [ICRA 2023] using the SimBEV dataset.
