Commit 3458a27

Authored by wendell-hom, whom3, jjomier, and tbirdso
Add stereo_vision app (#661)
* Add stereo_vision app

* Lint fix

Signed-off-by: Wendell Hom <whom@nvidia.com>

* Update stereo_vision app to work with build_and_run command

Signed-off-by: Wendell Hom <whom@nvidia.com>

* Add --source option to allow v4l2 or replayer source

Signed-off-by: Wendell Hom <whom@nvidia.com>

* Support VideoStreamReplayer

Signed-off-by: Wendell Hom <whom@nvidia.com>

* Remove yolov8 object detection

Author: Jonathan McLeod

* Fix lint error

Signed-off-by: Wendell Hom <whom@nvidia.com>

* Update gif

Signed-off-by: Wendell Hom <whom@nvidia.com>

* Apply suggestions from code review

Apply suggestions

Co-authored-by: Tom Birdsong <40648863+tbirdso@users.noreply.github.com>
Signed-off-by: wendell-hom <60016436+wendell-hom@users.noreply.github.com>

* Clean up

Author: Jonathan McLeod
Signed-off-by: Wendell Hom <whom@nvidia.com>

* Lint fixes

Signed-off-by: Wendell Hom <whom@nvidia.com>

---------

Signed-off-by: Wendell Hom <whom@nvidia.com>
Signed-off-by: wendell-hom <60016436+wendell-hom@users.noreply.github.com>
Co-authored-by: Wendell Hom <whom@nvidia.com>
Co-authored-by: Julien Jomier <219040+jjomier@users.noreply.github.com>
Co-authored-by: Tom Birdsong <40648863+tbirdso@users.noreply.github.com>
1 parent e637a90 commit 3458a27

23 files changed, +2085 -0 lines changed

applications/CMakeLists.txt

+2
@@ -82,6 +82,8 @@ add_holohub_application(qt_video_replayer DEPENDS OPERATORS qt_video npp_filter)
 
 add_holohub_application(realsense_visualizer DEPENDS OPERATORS realsense_camera)
 
+add_holohub_application(stereo_vision)
+
 add_holohub_application(tao_peoplenet)
 
 add_holohub_application(network_radar_pipeline DEPENDS
+18
@@ -0,0 +1,18 @@
+# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+cmake_minimum_required(VERSION 3.20)
+project(stereo_vision_app)
+add_subdirectory(cpp)

applications/stereo_vision/README.md

+58
@@ -0,0 +1,58 @@
+# Stereo Vision
+
+<p align="center">
+  <img src="./images/plants.gif" alt="Holoscan Stereo Vision">
+</p>
+
+## Overview
+
+A demo pipeline showcasing stereo disparity estimation.
+
+## Description
+
+This pipeline takes video from a stereo camera and estimates disparity using the ESS DNN. The disparity map is displayed through Holoviz.
+
+## Requirements
+
+This application requires a V4L2 stereo camera or recorded stereo video as input. A video acquired from a StereoLabs ZED
+camera is downloaded by the `get_data_and_models.sh` script when the application is built.
+A script for obtaining the calibration for StereoLabs cameras is also provided.
+Holoscan SDK >=2.0,<=2.5 is required for TensorRT 8.6 compatibility.
+### Camera Calibration
+
+The default calibration works for the sample video. If you are using a StereoLabs camera, the calibration
+can be retrieved using `get_zed_calibration.py` and the device's serial number.
+
+```sh
+python3 get_zed_calibration.py -s [Serial Number]
+```
+
+### Input video
+
+For the input video stream, use either a V4L2 stereo camera, such as those produced by StereoLabs, or the included recorded video.
+The `stereo-plants.mp4` video is provided [here](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/clara-holoscan/resources/holoscan_stereo_video) and will be downloaded and converted to the necessary format when building the application.
+
+The source device in `stereo_vision.yaml` should be modified to match the V4L2 device in use.
+The device can be found using `v4l2-ctl --list-devices`.
+
+
+## Models
+
+This demo requires the ESS DNN Stereo Disparity model, available from the NGC catalog, for disparity estimation. This model is downloaded when you build the application.
+
+### ESS DNN
+
+The ESS engine files generated in this demo application are specific to TensorRT 8.6; make sure
+you build the devcontainer with a compatible `base_img` as shown in the <b>Build and Run Instructions</b> section.
+
+## Build and Run Instructions
+
+Run the following command to build and run the application using the recorded video:
+```sh
+./dev_container build_and_run stereo_vision --base_img nvcr.io/nvidia/clara-holoscan/holoscan:v2.4.0-dgpu
+```
+
+To run the application using a V4L2-compatible stereo camera, run:
+```sh
+./dev_container build_and_run stereo_vision --base_img nvcr.io/nvidia/clara-holoscan/holoscan:v2.4.0-dgpu --run_args "--source v4l2"
+```
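Review note: the README describes the output as a disparity map. For readers who need metric depth, a rectified stereo pair follows the standard relation Z = f * B / d, where f is the focal length in pixels, B is the baseline, and d is the disparity in pixels. The sketch below is a host-side illustration only. The function name and the calibration values used with it are hypothetical and are not taken from this commit.

```cpp
#include <cassert>
#include <cstddef>
#include <cmath>
#include <limits>
#include <vector>

// Convert per-pixel disparity (pixels) to depth (same unit as `baseline`)
// with the rectified pinhole stereo relation Z = f * B / d.
// `focal_px` and `baseline` are hypothetical calibration inputs.
std::vector<float> disparity_to_depth(const std::vector<float>& disparity,
                                      float focal_px, float baseline) {
  assert(focal_px > 0.0f && baseline > 0.0f);
  std::vector<float> depth(disparity.size());
  for (std::size_t i = 0; i < disparity.size(); ++i) {
    const float d = disparity[i];
    // Zero or negative disparity means no valid match; report infinite depth.
    depth[i] = d > 0.0f ? focal_px * baseline / d
                        : std::numeric_limits<float>::infinity();
  }
  return depth;
}
```

Larger disparities map to smaller depths, so depth resolution degrades quickly for distant objects.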
@@ -0,0 +1,92 @@
+# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+
+cmake_minimum_required(VERSION 3.20)
+project(stereo_depth CXX CUDA)
+
+find_package(holoscan 2.4 REQUIRED CONFIG
+             PATHS "/opt/nvidia/holoscan" "/workspace/holoscan-sdk/install")
+
+
+include(FetchContent)
+FetchContent_Declare(
+  Eigen3
+  URL https://gitlab.com/libeigen/eigen/-/archive/3.4.0/eigen-3.4.0.tar.gz
+)
+FetchContent_MakeAvailable(Eigen3)
+
+add_executable(stereo_depth
+  main.cpp
+  undistort_rectify.cpp
+  split_video.cpp
+  heat_map.cpp
+  stereo_depth_kernels.cu
+  crop.cpp
+  ess_processor.cpp
+)
+target_link_libraries(stereo_depth
+  PRIVATE
+  holoscan::core
+  holoscan::ops::video_stream_replayer
+  holoscan::ops::holoviz
+  holoscan::ops::v4l2
+  holoscan::ops::format_converter
+  holoscan::ops::inference
+  holoscan::ops::inference_processor
+  CUDA::nppif
+  CUDA::nppidei
+  CUDA::nppicc
+  CUDA::nppial
+  Eigen3::Eigen
+)
+
+# Download the stereo vision sample video
+if(HOLOHUB_DOWNLOAD_DATASETS)
+  include(holoscan_download_data)
+  holoscan_download_data(stereo_vision
+    URL nvidia/clara-holoscan/holoscan_stereo_video:20241216
+    DOWNLOAD_NAME holoscan_stereo_vision_20241216.zip
+    DOWNLOAD_DIR ${HOLOHUB_DATA_DIR}
+    GENERATE_GXF_ENTITIES
+    GXF_ENTITIES_HEIGHT 1080
+    GXF_ENTITIES_WIDTH 3840
+    GXF_ENTITIES_CHANNELS 3
+    GXF_ENTITIES_FRAMERATE 30
+    ALL
+  )
+endif()
+
+# Copy config file
+add_custom_target(stereo_depth_yaml
+  COMMAND ${CMAKE_COMMAND} -E copy_if_different "${CMAKE_CURRENT_SOURCE_DIR}/stereo_vision.yaml" ${CMAKE_CURRENT_BINARY_DIR}
+  DEPENDS "stereo_vision.yaml"
+  BYPRODUCTS "stereo_vision.yaml"
+)
+
+# This command should run after stereo_vision_data which removes existing files
+add_custom_command(
+  OUTPUT "${HOLOHUB_DATA_DIR}/stereo_vision/ess.engine"
+  COMMAND bash "${CMAKE_CURRENT_SOURCE_DIR}/../scripts/get_data_and_models.sh" "${HOLOHUB_DATA_DIR}/stereo_vision"
+  BYPRODUCTS "${HOLOHUB_DATA_DIR}/stereo_vision/ess.engine"
+  DEPENDS stereo_vision_data
+)
+
+add_custom_target(get_data_and_models ALL
+  DEPENDS
+  "${HOLOHUB_DATA_DIR}/stereo_vision/ess.engine"
+)
+
+add_dependencies(stereo_depth stereo_depth_yaml get_data_and_models)
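Review note: the download step declares 3840x1080x3 frames. That width suggests a side-by-side stereo frame holding two 1920x1080 views, although this file does not say so. Under that assumption, splitting a frame into its left and right halves can be sketched on the host as follows. The function and type names are illustrative and do not come from `split_video.cpp`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using Image = std::vector<std::uint8_t>;

// Split an interleaved HWC frame that stores the left and right views side by
// side into two images of half the width. Assumes a side-by-side layout.
std::pair<Image, Image> split_side_by_side(const Image& frame, int width,
                                           int height, int channels) {
  assert(width > 0 && width % 2 == 0 && height > 0 && channels > 0);
  const std::size_t half_pitch = static_cast<std::size_t>(width / 2) * channels;
  assert(frame.size() == 2 * half_pitch * static_cast<std::size_t>(height));

  Image left;
  Image right;
  left.reserve(half_pitch * height);
  right.reserve(half_pitch * height);
  for (int row = 0; row < height; ++row) {
    // Each source row is [left half | right half].
    const std::uint8_t* p = frame.data() + row * 2 * half_pitch;
    left.insert(left.end(), p, p + half_pitch);
    right.insert(right.end(), p + half_pitch, p + 2 * half_pitch);
  }
  return {left, right};
}
```

For the declared frame size, each half would be 1920 * 1080 * 3 = 6,220,800 bytes.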
+89
@@ -0,0 +1,89 @@
+/*
+ * SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+ * SPDX-License-Identifier: Apache-2.0
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#include "crop.h"
+#include <gxf/std/tensor.hpp>
+
+namespace holoscan::ops {
+
+void CropOp::setup(OperatorSpec& spec) {
+  spec.input<holoscan::gxf::Entity>("input");
+  spec.output<holoscan::gxf::Entity>("output");
+  spec.param(x_, "x", "top left x", "top left x coordinate", 0);
+  spec.param(y_, "y", "top left y", "top left y coordinate", 0);
+  spec.param(width_, "width", "width", "width", 0);
+  spec.param(height_, "height", "height", "height", 0);
+}
+
+void CropOp::compute(InputContext& op_input, OutputContext& op_output, ExecutionContext& context) {
+  auto maybe_tensormap = op_input.receive<holoscan::TensorMap>("input");
+  const auto tensormap = maybe_tensormap.value();
+
+  if (tensormap.size() != 1) { throw std::runtime_error("Expecting single tensor input"); }
+
+  auto tensor = tensormap.begin()->second;
+  int orig_height = tensor->shape()[0];
+  int orig_width = tensor->shape()[1];
+  int nChannels = tensor->shape()[2];
+
+  nvidia::gxf::Tensor tensor_gxf(tensor->dl_ctx());
+  nvidia::gxf::PrimitiveType data_type = tensor_gxf.element_type();
+  int element_size = nvidia::gxf::PrimitiveTypeSize(data_type);
+
+  if (x_ < 0 || y_ < 0 || width_ <= 0 || height_ <= 0) {
+    throw std::runtime_error("Invalid crop dimensions");
+  }
+
+  if ((x_ + width_) > orig_width || (y_ + height_) > orig_height) {
+    throw std::runtime_error("Crop exceeds image boundaries");
+  }
+
+  auto pointer = std::shared_ptr<void*>(new void*(nullptr), [](void** pointer) {
+    if (pointer != nullptr) {
+      if (*pointer != nullptr) { cudaFree(*pointer); }
+      delete pointer;
+    }
+  });
+  cudaMalloc(pointer.get(), width_ * height_ * element_size * nChannels);
+
+  nvidia::gxf::Shape shape = nvidia::gxf::Shape{height_, width_, nChannels};
+  cudaMemcpy2D(*pointer,
+               width_ * element_size * nChannels,
+               (char*)tensor->data() + (y_ * orig_width + x_) * element_size * nChannels,
+               orig_width * element_size * nChannels,
+               width_ * element_size * nChannels,
+               height_,
+               cudaMemcpyDeviceToDevice);
+
+  auto out_message = nvidia::gxf::Entity::New(context.context());
+  auto gxf_tensor = out_message.value().add<nvidia::gxf::Tensor>("");
+
+  gxf_tensor.value()->wrapMemory(shape,
+                                 data_type,
+                                 element_size,
+                                 nvidia::gxf::ComputeTrivialStrides(shape, element_size),
+                                 nvidia::gxf::MemoryStorageType::kDevice,
+                                 *pointer,
+                                 [orig_pointer = pointer](void*) mutable {
+                                   orig_pointer.reset();  // decrement ref count
+                                   return nvidia::gxf::Success;
+                                 });
+
+  op_output.emit(out_message.value(), "output");
+}
+
+}  // namespace holoscan::ops
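Review note: the pitch and offset arguments passed to `cudaMemcpy2D` are easier to verify on the host. For a crop whose top-left corner is (x, y), the first source byte sits y full rows plus x pixels past the start of the image, and each copied row is `width * channels * element_size` bytes. The sketch below mirrors that arithmetic with `std::memcpy` for one-byte elements. `crop_hwc` is an illustrative name, not part of this commit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

// Host-side analogue of a 2D crop on an interleaved HWC image with 1-byte
// elements. Copies a width x height window starting at pixel (x, y).
std::vector<std::uint8_t> crop_hwc(const std::vector<std::uint8_t>& src,
                                   int src_width, int src_height, int channels,
                                   int x, int y, int width, int height) {
  if (x < 0 || y < 0 || width <= 0 || height <= 0 || channels <= 0) {
    throw std::runtime_error("Invalid crop dimensions");
  }
  if (x + width > src_width || y + height > src_height) {
    throw std::runtime_error("Crop exceeds image boundaries");
  }
  const std::size_t src_pitch = static_cast<std::size_t>(src_width) * channels;
  const std::size_t dst_pitch = static_cast<std::size_t>(width) * channels;
  if (src.size() != src_pitch * static_cast<std::size_t>(src_height)) {
    throw std::runtime_error("Source buffer size does not match its shape");
  }
  // Skip y full source rows, then x pixels within the row.
  const std::size_t offset = static_cast<std::size_t>(y) * src_pitch +
                             static_cast<std::size_t>(x) * channels;

  std::vector<std::uint8_t> dst(dst_pitch * static_cast<std::size_t>(height));
  for (int row = 0; row < height; ++row) {
    std::memcpy(dst.data() + row * dst_pitch,
                src.data() + offset + row * src_pitch, dst_pitch);
  }
  assert(dst.size() == dst_pitch * static_cast<std::size_t>(height));
  return dst;
}
```

Cropping a 4x3 single-channel image at (1, 1) with size 2x2 exercises both the row and column offsets, which a test with y = 0 would not.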

applications/stereo_vision/cpp/crop.h

+41
@@ -0,0 +1,41 @@
+/*
+ * SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+ * SPDX-License-Identifier: Apache-2.0
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#ifndef OPERATORS_CROP
+#define OPERATORS_CROP
+
+#include <holoscan/holoscan.hpp>
+#include <holoscan/utils/cuda_stream_handler.hpp>
+
+
+namespace holoscan::ops {
+
+class CropOp : public Operator {
+ public:
+  HOLOSCAN_OPERATOR_FORWARD_ARGS(CropOp);
+  CropOp() = default;
+  void setup(OperatorSpec& spec) override;
+  void compute(InputContext&, OutputContext& op_output, ExecutionContext&) override;
+ private:
+  Parameter<int> x_;
+  Parameter<int> y_;
+  Parameter<int> width_;
+  Parameter<int> height_;
+};
+
+}  // namespace holoscan::ops
+#endif
