Performance benchmarking for NVIDIA-accelerated Isaac ROS packages.
Isaac ROS Benchmark builds on ros2_benchmark to provide configurations for benchmarking Isaac ROS graphs. Performance results that measure the throughput, latency, and utilization of Isaac ROS enable robotics developers to make informed decisions when designing real-time robotics applications. The Isaac ROS performance results can be independently verified, as the method, configuration, and data input used for benchmarking are provided.
A ros2_benchmark playback node plug-in for type adaptation and negotiation is provided for NITROS, which reduces the cost of message transport through RCL in GPU-accelerated graphs of nodes.
The datasets for benchmarking are explicitly not downloaded by default. To pull down the standardized benchmark datasets, refer to the ros2_benchmark Dataset section.
Please visit the Isaac ROS Documentation to learn how to use this repository.
Update 2024-12-10: Added new benchmarks

| Node | Input Size | AGX Orin | Orin NX | Orin Nano 8GB | x86_64 w/ RTX 4090 |
| --- | --- | --- | --- | --- | --- |
| AprilTag Node | 720p | 249 fps 4.5 ms @ 30Hz | 116 fps 9.3 ms @ 30Hz | 80.7 fps 14 ms @ 30Hz | 596 fps 0.97 ms @ 30Hz |
| Freespace Segmentation Node | 576p | 2120 fps 1.7 ms @ 30Hz | 2490 fps 1.6 ms @ 30Hz | 1560 fps 2.3 ms @ 30Hz | 3500 fps 0.52 ms @ 30Hz |
| Depth Segmentation Node | 576p | 45.8 fps 79 ms @ 30Hz | 28.2 fps 99 ms @ 30Hz | – | 105 fps 25 ms @ 30Hz |
| FoundationPose Pose Estimation Node | 720p | 1.72 fps 690 ms @ 30Hz | – | – | 9.61 fps 110 ms @ 30Hz |
| DNN Stereo Disparity Node Full | 576p | 103 fps 12 ms @ 30Hz | 42.1 fps 26 ms @ 30Hz | – | 350 fps 2.3 ms @ 30Hz |
| DNN Stereo Disparity Node Light | 288p | 306 fps 5.6 ms @ 30Hz | 143 fps 9.4 ms @ 30Hz | – | 350 fps 1.6 ms @ 30Hz |
| Stereo Disparity Node | 1080p | 117 fps 11 ms @ 30Hz | 78.4 fps 14 ms @ 30Hz | 53.5 fps 21 ms @ 30Hz | 978 fps 1.6 ms @ 30Hz |
| Rectify Node | 1080p | 1160 fps 2.2 ms @ 30Hz | 595 fps 3.6 ms @ 30Hz | 411 fps 4.4 ms @ 30Hz | 2500 fps 0.70 ms @ 30Hz |
| TensorRT Node DOPE | VGA | 47.9 fps 24 ms @ 30Hz | 18.1 fps 56 ms @ 30Hz | 13.1 fps 81 ms @ 30Hz | 298 fps 4.6 ms @ 30Hz |
| Triton Node DOPE | VGA | 47.2 fps 24 ms @ 30Hz | 20.3 fps 530 ms @ 30Hz | 14.5 fps 780 ms @ 30Hz | 276 fps 4.6 ms @ 30Hz |
| TensorRT Node PeopleSemSegNet | 544p | 460 fps 4.1 ms @ 30Hz | 348 fps 6.1 ms @ 30Hz | 238 fps 7.0 ms @ 30Hz | – |
| Triton Node PeopleSemSegNet | 544p | 304 fps 4.8 ms @ 30Hz | 206 fps 6.5 ms @ 30Hz | – | – |
| DNN Image Encoder Node | VGA | 420 fps 12 ms @ 30Hz | 382 fps 11 ms @ 30Hz | – | 574 fps 5.2 ms @ 30Hz |
| Occupancy Grid Localizer Node | ~50 sq. m | 12.9 fps 86 ms @ 30Hz | 8.36 fps 130 ms @ 30Hz | 5.81 fps 190 ms @ 30Hz | 50.1 fps 8.8 ms @ 30Hz |
| H.264 Decoder Node | 1080p | 197 fps 8.2 ms @ 30Hz | – | – | 596 fps 4.2 ms @ 30Hz |
| H.264 Encoder Node I-frame Support | 1080p | 402 fps 13 ms @ 30Hz | – | – | 409 fps 3.4 ms @ 30Hz |
| H.264 Encoder Node P-frame Support | 1080p | 473 fps 11 ms @ 30Hz | – | – | 596 fps 2.1 ms @ 30Hz |
| Nvblox Node | – | 4.87 fps 35.9 ms | 4.95 fps -1.43 ms | 4.95 fps 23.1 ms | 4.95 fps 195 ms |

| Graph | Input Size | AGX Orin | Orin NX | Orin Nano 8GB | x86_64 w/ RTX 4090 |
| --- | --- | --- | --- | --- | --- |
| AprilTag Graph | 720p | 246 fps 6.3 ms @ 30Hz | 111 fps 12 ms @ 30Hz | 77.5 fps 20 ms @ 30Hz | 596 fps 1.6 ms @ 30Hz |
| Freespace Segmentation Graph | 576p | 36.5 fps 84 ms @ 30Hz | 27.8 fps 100 ms @ 30Hz | 22.0 fps 98 ms @ 30Hz | 100 fps 22 ms @ 30Hz |
| Centerpose Pose Estimation Graph | VGA | 47.1 fps 6.5 ms @ 30Hz | 30.0 fps 50 ms @ 30Hz | 19.9 fps 28 ms @ 30Hz | 50.2 fps 12 ms @ 30Hz |
| DOPE Pose Estimation Graph | VGA | 27.7 fps 16 ms @ 30Hz | 17.8 fps 14 ms @ 30Hz | – | 187 fps 7.8 ms @ 30Hz |
| DNN Stereo Disparity Graph Full | 576p | 33.5 fps 25 ms @ 30Hz | 35.2 fps 34 ms @ 30Hz | – | 350 fps 5.6 ms @ 30Hz |
| DNN Stereo Disparity Graph Light | 288p | 179 fps 14 ms @ 30Hz | 126 fps 15 ms @ 30Hz | – | 350 fps 4.4 ms @ 30Hz |
| Stereo Disparity Graph | 1080p | 111 fps 15 ms @ 30Hz | 72.2 fps 19 ms @ 30Hz | 49.7 fps 26 ms @ 30Hz | 695 fps 3.9 ms @ 30Hz |
| DetectNet Object Detection Graph | 544p | 70.5 fps 26 ms @ 30Hz | 30.1 fps 46 ms @ 30Hz | 22.9 fps 57 ms @ 30Hz | 254 fps 11 ms @ 30Hz |
| RT-DETR Object Detection Graph SyntheticaDETR | 720p | 56.5 fps 30 ms @ 30Hz | 33.8 fps 39 ms @ 30Hz | 24.1 fps 53 ms @ 30Hz | 490 fps 7.1 ms @ 30Hz |
| TensorRT Graph PeopleSemSegNet | 544p | 371 fps 19 ms @ 30Hz | 250 fps 20 ms @ 30Hz | 163 fps 23 ms @ 30Hz | – |
| SAM Image Segmentation Graph Full SAM | 720p | 2.22 fps 390 ms @ 30Hz | – | – | 16.4 fps 280 ms @ 30Hz |
| SAM Image Segmentation Graph Mobile SAM | 720p | 8.75 fps 570 ms @ 30Hz | 5.34 fps 1400 ms @ 30Hz | 2.22 fps 340 ms @ 30Hz | 68.6 fps 23 ms @ 30Hz |

| Live Graph | Input Size | Nova Carter |
| --- | --- | --- |
| Data Recorder Live Graph 4 Hawk Cameras | 1200p | 22.0 fps (per stream avg); 0 dropped frames (avg) |
| Multicam Visual SLAM Live Graph 4 Hawk Cameras | 1200p | 30.1 fps |
| DNN Stereo Disparity Live Graph 3 Hawk Cameras 1x Full ESS and 2x Throttled Light ESS | 1200p | Full: 30.2 fps; Light: 15.2 fps (avg) |
| Perceptor Graph 3 Hawk Cameras | 1200p | Nvblox ESDF: 9.46 fps; Nvblox Mesh: 1.01 fps; Visual Odometry: 30.0 fps |
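
Cells in the Node and Graph tables pair a throughput with a latency, for example `249 fps 4.5 ms @ 30Hz`. The sketch below shows one way such a cell could be parsed and checked against a frame budget when comparing platforms. It is illustrative only and is not part of ros2_benchmark: `BenchmarkCell`, `parse_cell`, and `fits_frame_budget` are hypothetical names, the reading of `@ 30Hz` as the rate at which latency was measured is an assumption, and the budget criterion (throughput at or above the target rate, latency within one frame period) is just one possible rule.

```python
import re
from typing import NamedTuple, Optional


class BenchmarkCell(NamedTuple):
    """Hypothetical container for one table cell."""
    throughput_fps: float
    latency_ms: float
    rate_hz: Optional[float]  # None when the cell has no "@ <rate>Hz" suffix


# Matches e.g. "249 fps 4.5 ms @ 30Hz" or "4.87 fps 35.9 ms".
_NUMBER = r"\d+(?:\.\d+)?"
_CELL_RE = re.compile(
    rf"(?P<fps>{_NUMBER})\s*fps\s+(?P<lat>-?{_NUMBER})\s*ms"
    rf"(?:\s*@\s*(?P<rate>{_NUMBER})\s*Hz)?"
)


def parse_cell(text: str) -> Optional[BenchmarkCell]:
    """Parse one cell; return None for '–' or any unrecognized format."""
    cleaned = text.strip().rstrip("|").strip()
    match = _CELL_RE.fullmatch(cleaned)
    if match is None:
        return None
    rate = match.group("rate")
    return BenchmarkCell(
        throughput_fps=float(match.group("fps")),
        latency_ms=float(match.group("lat")),
        rate_hz=float(rate) if rate is not None else None,
    )


def fits_frame_budget(cell: BenchmarkCell, target_hz: float = 30.0) -> bool:
    """Assumed criterion: keeps up with target_hz and finishes within one frame period."""
    period_ms = 1000.0 / target_hz
    return cell.throughput_fps >= target_hz and 0.0 <= cell.latency_ms <= period_ms


if __name__ == "__main__":
    for raw in ("249 fps 4.5 ms @ 30Hz", "45.8 fps 79 ms @ 30Hz", "–"):
        cell = parse_cell(raw)
        verdict = fits_frame_budget(cell) if cell is not None else "n/a"
        print(raw, "->", verdict)
```

Under this rule a cell with a latency above about 33.3 ms fails a 30 Hz budget even when its throughput exceeds 30 fps; pipelined applications that tolerate more latency would want a looser check.
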