Commit 37253f5

sohamm17, mdemircin, and whom3 committed
Precompiled PVA kernel example
Co-authored-by: Mehmet Demircin <mdemircin@nvidia.com>
Co-authored-by: Wendell Hom <whom@nvidia.com>
Signed-off-by: sohams <sohams@nvidia.com>
1 parent fe2cfbf commit 37253f5

10 files changed (+461, -0 lines changed)

applications/CMakeLists.txt

Lines changed: 2 additions & 0 deletions
@@ -70,6 +70,8 @@ add_holohub_application(object_detection_torch)
7070

7171
add_holohub_application(openigtlink_3dslicer DEPENDS OPERATORS openigtlink)
7272

73+
add_holohub_application(precompiled_pva)
74+
7375
add_holohub_application(prohawk_video_replayer DEPENDS OPERATORS prohawk_video_processing)
7476

7577
add_holohub_application(qt_video_replayer DEPENDS OPERATORS qt_video npp_filter)
Lines changed: 19 additions & 0 deletions
@@ -0,0 +1,19 @@
1+
# SPDX-FileCopyrightText: Copyright (c) 2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
2+
# SPDX-License-Identifier: Apache-2.0
3+
#
4+
# Licensed under the Apache License, Version 2.0 (the "License");
5+
# you may not use this file except in compliance with the License.
6+
# You may obtain a copy of the License at
7+
#
8+
# http://www.apache.org/licenses/LICENSE-2.0
9+
#
10+
# Unless required by applicable law or agreed to in writing, software
11+
# distributed under the License is distributed on an "AS IS" BASIS,
12+
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
13+
# See the License for the specific language governing permissions and
14+
# limitations under the License.
15+
16+
cmake_minimum_required(VERSION 3.20)
17+
project(precompiled_pva)
18+
19+
add_subdirectory(cpp)
Lines changed: 73 additions & 0 deletions
@@ -0,0 +1,73 @@
1+
# PVA-Accelerated Image Sharpening Application
2+
3+
This application demonstrates the use of the [Programmable Vision Accelerator (PVA)](#about-pva) within a Holoscan
4+
application. It reads a video stream, applies a 2D unsharp mask filter, and renders the result via the
5+
visualizer. The unsharp mask filtering operation is done in PVA. Since the PVA is used for this
6+
operation, the GPU workload is reduced. This example demonstrates how pre-processing, post-processing, and image processing tasks can be offloaded from the GPU, allowing it to concentrate on more compute-intensive machine learning and artificial intelligence tasks.
7+
8+
This example application processes a video stream, displaying two visualizer windows: one for the original stream and another for the stream enhanced with image sharpening via PVA.
9+
10+
## About PVA
11+
12+
PVA is a highly power-efficient VLIW processor integrated into NVIDIA Tegra platforms, specifically designed for advanced image processing and computer vision algorithms. The CUPVA SDK offers a comprehensive and unified programming model for PVA, enabling developers to create and optimize their own algorithms. For access to the SDK and further development opportunities, please contact NVIDIA.
13+
14+
## Content
15+
16+
- `main.cpp`: This file contains a C++ Holoscan application that demonstrates an operator which loads and executes a precompiled PVA library implementing the unsharp masking algorithm on images. The CUPVA SDK and a license are not required to run this HoloHub application.
17+
- `pva_unsharp_mask/`: This directory houses the `pva_unsharp_mask.hpp` header file, which declares the `PvaUnsharpMask` class. The `PvaUnsharpMask` class includes an `init` API, invoked once for the first tensor, and a `process` API, used for processing each input tensor. The precompiled algorithm library file, `libpva_unsharp_mask.a`, and the corresponding allowlist file, `cupva_allowlist_pva_unsharp_mask`, are automatically downloaded by the CMake scripts.
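
The `init`-once, `process`-per-frame call pattern described above can be sketched with a stand-in class. This is only an illustration: `FakeUnsharpMask`, `initCount`, and `run_frame` are hypothetical names, the return types are assumed, and the placeholder `process` merely copies bytes. Only the three method names `isInitialized`, `init`, and `process` mirror the calls made in `main.cpp`; the real `PvaUnsharpMask` is provided by the precompiled library and its header is not shown in this commit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for PvaUnsharpMask; the bodies are placeholders.
class FakeUnsharpMask {
 public:
  bool isInitialized() const { return initialized_; }

  // Records the image geometry once (signature assumed from the call in main.cpp).
  void init(int32_t width, int32_t height, int32_t inPitch, int32_t outPitch) {
    assert(width > 0 && height > 0 && inPitch >= width && outPitch >= width);
    height_ = height;
    pitch_ = inPitch;
    initialized_ = true;
    ++init_count_;
  }

  // Placeholder "processing": copies the input bytes to the output unchanged.
  void process(const uint8_t* src, uint8_t* dst) {
    assert(initialized_);
    std::memcpy(dst, src, static_cast<size_t>(height_) * pitch_);
  }

  int initCount() const { return init_count_; }

 private:
  bool initialized_{false};
  int32_t height_{0};
  int32_t pitch_{0};
  int init_count_{0};
};

// Lazily initializes the task on the first frame, then processes every frame.
std::vector<uint8_t> run_frame(FakeUnsharpMask& task, const std::vector<uint8_t>& frame,
                               int32_t width, int32_t height) {
  if (!task.isInitialized()) { task.init(width, height, width, width); }
  std::vector<uint8_t> out(frame.size());
  task.process(frame.data(), out.data());
  return out;
}
```

Calling `run_frame` repeatedly with the same task object initializes it only on the first call, which is the behavior the operator relies on when frames of a fixed size arrive one after another.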
18+
19+
20+
## Algorithm Overview
21+
22+
The PreCompiledPVAExecutor operator performs an image sharpening operation in three steps:
23+
24+
1. Convert the input RGB image to the NV24 color format.
25+
2. Apply a 5x5 unsharp mask filter on the luminance color plane.
26+
3. Convert the enhanced image back to the RGB format.
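
For readers unfamiliar with unsharp masking, step 2 can be illustrated with a small CPU reference. This is a sketch under stated assumptions, not the PVA kernel: the 5x5 box blur, the replicated borders, the `amount` parameter, and the BT.601 luma weights are all assumed here, since the coefficients used by the precompiled library are not part of this commit. The function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed BT.601 full-range luma from 8-bit RGB, in 8-bit fixed point
// (0.299, 0.587, 0.114 scaled by 256 -> 77, 150, 29).
inline uint8_t rgb_to_luma(uint8_t r, uint8_t g, uint8_t b) {
  return static_cast<uint8_t>((77 * r + 150 * g + 29 * b + 128) >> 8);
}

// Unsharp mask on a single luma plane:
//   out = clamp(in + amount * (in - blur5x5(in)))
// The blur is a 5x5 box filter with replicated borders (an assumption).
std::vector<uint8_t> unsharp_mask_5x5(const std::vector<uint8_t>& luma,
                                      int width, int height, float amount) {
  assert(width > 0 && height > 0);
  assert(luma.size() == static_cast<size_t>(width) * height);
  std::vector<uint8_t> out(luma.size());
  for (int y = 0; y < height; ++y) {
    for (int x = 0; x < width; ++x) {
      int sum = 0;
      for (int dy = -2; dy <= 2; ++dy) {
        for (int dx = -2; dx <= 2; ++dx) {
          // Clamp coordinates so border pixels are replicated.
          const int yy = std::clamp(y + dy, 0, height - 1);
          const int xx = std::clamp(x + dx, 0, width - 1);
          sum += luma[static_cast<size_t>(yy) * width + xx];
        }
      }
      const float blurred = sum / 25.0f;
      const float center = luma[static_cast<size_t>(y) * width + x];
      const float sharpened = center + amount * (center - blurred);
      // Round to nearest and saturate to the 8-bit range.
      out[static_cast<size_t>(y) * width + x] =
          static_cast<uint8_t>(std::clamp(sharpened + 0.5f, 0.0f, 255.0f));
    }
  }
  return out;
}
```

A flat region passes through unchanged because the blurred value equals the center value, while an isolated bright pixel is pushed toward saturation and its dark neighbors stay clamped at zero. Operating on the luminance plane only, as in the steps above, leaves the chroma untouched and avoids color shifts at edges.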
27+
28+
The [VPI library](https://developer.nvidia.com/embedded/vpi) offers numerous algorithm examples that leverage the PVA as the backend.
29+
30+
## Compiling the application
31+
32+
Build the Docker image and launch the container:
33+
34+
```
35+
$ ./dev_container build --img holohub:precompiled_pva --base_img nvcr.io/nvidia/clara-holoscan/holoscan:v2.1.0-dgpu --docker_file ./Dockerfile
36+
# Check which version of CUPVA is installed on your platform at /opt/nvidia
37+
$ ./dev_container launch --img holohub:precompiled_pva --docker_opts "-v /opt/nvidia/cupva-<version>:/opt/nvidia/cupva-<version> --device /dev/nvhost-ctrl-pva0:/dev/nvhost-ctrl-pva0 --device /dev/nvmap:/dev/nvmap --device /dev/dri/renderD129:/dev/dri/renderD129"
38+
```
39+
40+
Inside the container, append the following directories to the `LD_LIBRARY_PATH` environment variable (adjust `cupva-2.5` to the CUPVA version installed on your platform):
41+
```
42+
# inside docker
43+
$ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib/aarch64-linux-gnu/tegra/:/opt/nvidia/cupva-2.5/lib/aarch64-linux-gnu/
44+
```
45+
46+
Build the application inside docker:
47+
```
48+
$ ./run build precompiled_pva
49+
```
50+
## Running the application
51+
52+
The application takes an endoscopy video stream as input, applies the unsharp mask filter, and shows it in
53+
a HoloViz window.
54+
55+
Before running the application, deploy the VPU application signature allowlist on the target host (outside the container):
56+
```bash
57+
sudo cp <HOLOHUB_BUILD_DIR>/applications/precompiled_pva/cpp/pva_unsharp_mask/cupva_allowlist_pva_unsharp_mask /etc/pva/allow.d/cupva_allowlist_pva_unsharp_mask
58+
sudo pva_allow
59+
```
60+
61+
Launch the same Docker container you used to build the application:
62+
63+
```
64+
$ ./dev_container launch --img holohub:precompiled_pva --docker_opts "-v /opt/nvidia/cupva-<version>:/opt/nvidia/cupva-<version> --device /dev/nvhost-ctrl-pva0:/dev/nvhost-ctrl-pva0 --device /dev/nvmap:/dev/nvmap --device /dev/dri/renderD129:/dev/dri/renderD129"
65+
66+
# inside docker
67+
# don't forget the line below to export the environment variables
68+
$ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib/aarch64-linux-gnu/tegra/:/opt/nvidia/cupva-2.5/lib/aarch64-linux-gnu/
69+
$ ./run launch precompiled_pva
70+
```
71+
72+
73+
![PVA Example](pva_example.png)
Lines changed: 67 additions & 0 deletions
@@ -0,0 +1,67 @@
1+
#
2+
# Copyright (c) 2024, NVIDIA CORPORATION. All rights reserved.
3+
#
4+
# NVIDIA Corporation and its licensors retain all intellectual property
5+
# and proprietary rights in and to this software, related documentation
6+
# and any modifications thereto. Any use, reproduction, disclosure or
7+
# distribution of this software and related documentation without an express
8+
# license agreement from NVIDIA Corporation is strictly prohibited.
9+
#
10+
11+
find_package(holoscan 1.0.3 REQUIRED CONFIG
12+
PATHS "/opt/nvidia/holoscan" "/workspace/holoscan-sdk/install")
13+
14+
add_executable(precompiled_pva
15+
main.cpp
16+
)
17+
18+
target_link_libraries(precompiled_pva
19+
PRIVATE
20+
holoscan::core
21+
holoscan::ops::video_stream_replayer
22+
holoscan::ops::video_stream_recorder
23+
holoscan::ops::holoviz
24+
)
25+
26+
add_library(pva_unsharp_mask STATIC IMPORTED)
27+
28+
# Define the location in the build directory where libpva_unsharp_mask.a will be used
29+
set(PVA_UNSHARP_MASK_LIB_DEST "${CMAKE_CURRENT_BINARY_DIR}/pva_unsharp_mask/libpva_unsharp_mask.a")
30+
# Define the destination path in the build directory for cupva_allowlist_pva_unsharp_mask
31+
set(CUPVA_ALLOWLIST_DEST "${CMAKE_CURRENT_BINARY_DIR}/pva_unsharp_mask/cupva_allowlist_pva_unsharp_mask")
32+
33+
# Define the URL for downloading libpva_unsharp_mask.a if it's not found in the build directory
34+
set(PVA_UNSHARP_MASK_URL "https://edge.urm.nvidia.com/artifactory/sw-holoscan-thirdparty-generic-local/pva/libpva_unsharp_mask.a")
35+
# Define the URL for downloading cupva_allowlist_pva_unsharp_mask
36+
set(CUPVA_ALLOWLIST_URL "https://edge.urm.nvidia.com/artifactory/sw-holoscan-thirdparty-generic-local/pva/cupva_allowlist_pva_unsharp_mask")
37+
38+
# Define a custom target for preparing libpva_unsharp_mask.a and cupva_allowlist_pva_unsharp_mask
39+
add_custom_target(prepare_pva_dependencies
40+
COMMAND ${CMAKE_COMMAND} -E cmake_echo_color --cyan "Preparing PVA dependencies..."
41+
COMMAND ${CMAKE_COMMAND} -E make_directory "${CMAKE_CURRENT_BINARY_DIR}/pva_unsharp_mask"
42+
COMMAND ${CMAKE_COMMAND} -E cmake_echo_color --green "Directory ensured at ${CMAKE_CURRENT_BINARY_DIR}/pva_unsharp_mask"
43+
COMMAND ${CMAKE_COMMAND} -D PVA_UNSHARP_MASK_LIB_DEST="${PVA_UNSHARP_MASK_LIB_DEST}" -D PVA_UNSHARP_MASK_URL="${PVA_UNSHARP_MASK_URL}" -D CUPVA_ALLOWLIST_URL="${CUPVA_ALLOWLIST_URL}" -D CUPVA_ALLOWLIST_DEST="${CUPVA_ALLOWLIST_DEST}" -P "${CMAKE_CURRENT_LIST_DIR}/PreparePVADependencies.cmake"
44+
COMMENT "Preparing libpva_unsharp_mask.a and cupva_allowlist_pva_unsharp_mask"
45+
)
46+
47+
add_dependencies(pva_unsharp_mask prepare_pva_dependencies)
48+
49+
# Update the IMPORTED_LOCATION to the new path in the build directory
50+
set_target_properties(pva_unsharp_mask PROPERTIES IMPORTED_LOCATION ${PVA_UNSHARP_MASK_LIB_DEST})
51+
52+
# add according to your CUPVA version here
53+
find_library(CUPVAHOST_LIB libcupva_host.so.2.5 PATHS /opt/nvidia/cupva-2.5/lib/aarch64-linux-gnu/ REQUIRED)
54+
55+
target_link_libraries(precompiled_pva
56+
PUBLIC
57+
pva_unsharp_mask
58+
${CUPVAHOST_LIB}
59+
)
60+
61+
# Copy the config to the binary directory
62+
add_custom_target(precompiled_pva_deps
63+
COMMAND ${CMAKE_COMMAND} -E copy_if_different "${CMAKE_CURRENT_SOURCE_DIR}/main.yaml" ${CMAKE_CURRENT_BINARY_DIR}
64+
DEPENDS "main.yaml"
65+
BYPRODUCTS "main.yaml"
66+
)
67+
add_dependencies(precompiled_pva precompiled_pva_deps)
Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
1+
if(NOT EXISTS "${PVA_UNSHARP_MASK_LIB_DEST}")
2+
# Download libpva_unsharp_mask.a using curl
3+
message(STATUS "libpva_unsharp_mask.a not found at ${PVA_UNSHARP_MASK_LIB_DEST}. Downloading from ${PVA_UNSHARP_MASK_URL} using curl")
4+
execute_process(COMMAND curl -L ${PVA_UNSHARP_MASK_URL} -o ${PVA_UNSHARP_MASK_LIB_DEST}
5+
RESULT_VARIABLE result
6+
OUTPUT_QUIET)
7+
if(NOT result EQUAL "0")
8+
message(FATAL_ERROR "Error downloading libpva_unsharp_mask.a using curl")
9+
endif()
10+
# Check if the downloaded file contains a "File not found" error message
11+
file(READ ${PVA_UNSHARP_MASK_LIB_DEST} contents)
12+
if(contents MATCHES "\"status\" : 404")
13+
message(FATAL_ERROR "Downloaded file contains a 'File not found' error. Please check the URL and try again.")
14+
endif()
15+
# Download cupva_allowlist_pva_unsharp_mask using curl
16+
message(STATUS "Downloading cupva_allowlist_pva_unsharp_mask from ${CUPVA_ALLOWLIST_URL} using curl")
17+
execute_process(COMMAND curl -L ${CUPVA_ALLOWLIST_URL} -o ${CUPVA_ALLOWLIST_DEST}
18+
RESULT_VARIABLE result_allowlist
19+
OUTPUT_QUIET)
20+
if(NOT result_allowlist EQUAL "0")
21+
message(FATAL_ERROR "Error downloading cupva_allowlist_pva_unsharp_mask using curl")
22+
endif()
23+
# Check if the downloaded file contains a "File not found" error message
24+
file(READ ${CUPVA_ALLOWLIST_DEST} contents_allowlist)
25+
if(contents_allowlist MATCHES "\"status\" : 404")
26+
message(FATAL_ERROR "Downloaded cupva_allowlist_pva_unsharp_mask contains a 'File not found' error. Please check the URL and try again.")
27+
endif()
28+
endif()
Lines changed: 146 additions & 0 deletions
@@ -0,0 +1,146 @@
1+
/*
2+
* SPDX-FileCopyrightText: Copyright (c) 2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
3+
* SPDX-License-Identifier: Apache-2.0
4+
*
5+
* Licensed under the Apache License, Version 2.0 (the "License");
6+
* you may not use this file except in compliance with the License.
7+
* You may obtain a copy of the License at
8+
*
9+
* http://www.apache.org/licenses/LICENSE-2.0
10+
*
11+
* Unless required by applicable law or agreed to in writing, software
12+
* distributed under the License is distributed on an "AS IS" BASIS,
13+
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
14+
* See the License for the specific language governing permissions and
15+
* limitations under the License.
16+
*/
17+
18+
#include "gxf/std/tensor.hpp"
19+
#include "holoscan/holoscan.hpp"
20+
#include "pva_unsharp_mask/pva_unsharp_mask.hpp"
21+
22+
#include <holoscan/operators/holoviz/holoviz.hpp>
23+
#include <holoscan/operators/video_stream_recorder/video_stream_recorder.hpp>
24+
#include <holoscan/operators/video_stream_replayer/video_stream_replayer.hpp>
25+
#include <holoscan/core/system/gpu_resource_monitor.hpp>
26+
27+
#include <filesystem>
#include <iostream>
#include <stdexcept>
28+
#include <string>
29+
30+
namespace holoscan::ops {
31+
class PreCompiledPVAExecutor : public Operator {
32+
public:
33+
HOLOSCAN_OPERATOR_FORWARD_ARGS(PreCompiledPVAExecutor);
34+
PreCompiledPVAExecutor() = default;
35+
36+
void setup(OperatorSpec& spec) override {
37+
spec.param(allocator_, "allocator", "Allocator", "Allocator to allocate output tensor.");
38+
spec.input<gxf::Entity>("input");
39+
spec.output<gxf::Entity>("output");
40+
}
41+
void compute(InputContext& op_input, OutputContext& op_output,
42+
ExecutionContext& context) override {
43+
auto maybe_input_message = op_input.receive<gxf::Entity>("input");
44+
if (!maybe_input_message.has_value()) {
45+
HOLOSCAN_LOG_ERROR("Failed to receive input message gxf::Entity");
46+
return;
47+
}
48+
auto input_tensor = maybe_input_message.value().get<holoscan::Tensor>();
49+
if (!input_tensor) {
50+
HOLOSCAN_LOG_ERROR("Failed to receive holoscan::Tensor from input message gxf::Entity");
51+
return;
52+
}
53+
54+
// get handle to underlying nvidia::gxf::Allocator from std::shared_ptr<holoscan::Allocator>
55+
auto allocator = nvidia::gxf::Handle<nvidia::gxf::Allocator>::Create(
56+
fragment()->executor().context(), allocator_->gxf_cid());
57+
58+
// cast Holoscan::Tensor to nvidia::gxf::Tensor to use its APIs directly
59+
nvidia::gxf::Tensor input_tensor_gxf{input_tensor->dl_ctx()};
60+
61+
auto out_message = CreateTensorMap(
62+
context.context(),
63+
allocator.value(),
64+
{{"output",
65+
nvidia::gxf::MemoryStorageType::kDevice,
66+
input_tensor_gxf.shape(),
67+
nvidia::gxf::PrimitiveType::kUnsigned8,
68+
0,
69+
nvidia::gxf::ComputeTrivialStrides(
70+
input_tensor_gxf.shape(),
71+
nvidia::gxf::PrimitiveTypeSize(nvidia::gxf::PrimitiveType::kUnsigned8))}},
72+
false);
73+
74+
if (!out_message) { throw std::runtime_error("failed to create out_message"); }
75+
const auto output_tensor = out_message.value().get<nvidia::gxf::Tensor>();
76+
if (!output_tensor) { throw std::runtime_error("failed to create out_tensor"); }
77+
78+
uint8_t* input_tensor_data = static_cast<uint8_t*>(input_tensor->data());
79+
uint8_t* output_tensor_data = static_cast<uint8_t*>(output_tensor.value()->pointer());
80+
if (output_tensor_data == nullptr) {
81+
throw std::runtime_error("Failed to allocate memory for the output image");
82+
}
83+
84+
const int32_t imageWidth{static_cast<int32_t>(input_tensor->shape()[1])};
85+
const int32_t imageHeight{static_cast<int32_t>(input_tensor->shape()[0])};
86+
const int32_t inputLinePitch{static_cast<int32_t>(input_tensor->shape()[1])};
87+
const int32_t outputLinePitch{static_cast<int32_t>(input_tensor->shape()[1])};
88+
89+
if (!pvaOperatorTask_.isInitialized()) {
90+
pvaOperatorTask_.init(imageWidth, imageHeight, inputLinePitch, outputLinePitch);
91+
}
92+
pvaOperatorTask_.process(input_tensor_data, output_tensor_data);
93+
auto result = gxf::Entity(std::move(out_message.value()));
94+
95+
op_output.emit(result, "output");
96+
}
97+
98+
private:
99+
Parameter<std::shared_ptr<Allocator>> allocator_;
100+
PvaUnsharpMask pvaOperatorTask_;
101+
};
102+
} // namespace holoscan::ops
103+
104+
class App : public holoscan::Application {
105+
public:
106+
void compose() override {
107+
using namespace holoscan;
108+
109+
uint32_t max_width{1920};
110+
uint32_t max_height{1080};
111+
int64_t source_block_size = max_width * max_height * 3;
112+
113+
std::shared_ptr<BlockMemoryPool> pva_allocator =
114+
make_resource<BlockMemoryPool>("allocator", 1, source_block_size, 1);
115+
116+
auto precompiledpva = make_operator<ops::PreCompiledPVAExecutor>(
117+
"precompiledpva", Arg("allocator") = pva_allocator);
118+
119+
auto source = make_operator<ops::VideoStreamReplayerOp>("replayer", from_config("replayer"));
120+
121+
auto recorder = make_operator<ops::VideoStreamRecorderOp>("recorder", from_config("recorder"));
122+
auto visualizer1 = make_operator<ops::HolovizOp>(
123+
"holoviz1", from_config("holoviz"), Arg("window_title") = std::string("Original Stream"));
124+
auto visualizer2 =
125+
make_operator<ops::HolovizOp>("holoviz2",
126+
from_config("holoviz"),
127+
Arg("window_title") = std::string("Image Sharpened Stream"));
128+
129+
add_flow(source, precompiledpva);
130+
add_flow(source, visualizer1, {{"output", "receivers"}});
131+
// add_flow(precompiledpva, recorder);
132+
add_flow(precompiledpva, visualizer2, {{"output", "receivers"}});
133+
}
134+
};
135+
136+
int main(int argc, char** argv) {
137+
auto app = holoscan::make_application<App>();
138+
139+
auto config_path = std::filesystem::canonical(argv[0]).parent_path();
140+
config_path += "/main.yaml";
141+
app->config(config_path);
142+
143+
app->run();
144+
145+
return 0;
146+
}
Lines changed: 39 additions & 0 deletions
@@ -0,0 +1,39 @@
1+
%YAML 1.2
2+
# SPDX-FileCopyrightText: Copyright (c) 2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
3+
# SPDX-License-Identifier: Apache-2.0
4+
#
5+
# Licensed under the Apache License, Version 2.0 (the "License");
6+
# you may not use this file except in compliance with the License.
7+
# You may obtain a copy of the License at
8+
#
9+
# http://www.apache.org/licenses/LICENSE-2.0
10+
#
11+
# Unless required by applicable law or agreed to in writing, software
12+
# distributed under the License is distributed on an "AS IS" BASIS,
13+
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
14+
# See the License for the specific language governing permissions and
15+
# limitations under the License.
16+
---
17+
extensions:
18+
- libgxf_std.so
19+
- libgxf_cuda.so
20+
- libgxf_multimedia.so
21+
- libgxf_serialization.so
22+
23+
replayer:
24+
directory: /workspace/holohub/data/endoscopy
25+
basename: "surgical_video"
26+
frame_rate: 0 # as specified in timestamps
27+
repeat: true # default: false
28+
realtime: true # default: true
29+
count: 0 # default: 0 (no frame count restriction)
30+
31+
recorder:
32+
directory: "/tmp"
33+
basename: "surgical_video_sharpened"
34+
35+
holoviz:
36+
width: 854
37+
height: 480
38+
39+
