This is a serving library for modern C++ that provides a simple interface for working with neural network models. Current requirements:
- Nvidia TensorRT
- fmt (until C++20)
- C++17
Some inference backends (e.g., TensorRT) require a custom logger. To provide a robust logging system, this library asks the user to supply their own definition of the `flushLog` function of the `ILogger` interface, for example:
```cpp
void Logger::flushLog(Level level, std::string_view message) const {
  switch (level) {
    case Level::kINTERNAL_ERROR:
      ROS_FATAL_STREAM_NAMED(name_, message);
      break;
    // ... handle the remaining levels the same way ...
    case Level::kVERBOSE:
      ROS_DEBUG_STREAM_NAMED(name_, message);
      break;
  }
}
```
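For context, the enclosing class might look like the sketch below. The exact `ILogger` declaration is defined by this library, so the constructor and the `name_` member shown here are illustrative assumptions:

```cpp
#include <string>
#include <string_view>

// Hypothetical subclass of the library's ILogger interface; only
// flushLog is required, everything else here is illustrative.
class Logger : public ILogger {
 public:
  explicit Logger(std::string name = "serving") : name_{std::move(name)} {}

  void flushLog(Level level, std::string_view message) const override;

 private:
  std::string name_;  // name passed to the ROS *_NAMED logging macros
};
```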
Supported model formats, by backend and file extension:
- TensorRT: `.engine`, `.trt`
To generate a serialized engine:
- Populate `BuildOptions` and `SystemOptions` in option.hpp.
- Create a `Generator` instance with those options, then call `generator.getSerializedEngine` to build a serialized engine for the network. Right now it can only take an ONNX model path; building from TensorRT's layers is under development. This function returns a raw pointer, so remember to delete it after saving the model (or use a smart pointer, as in the example below).
- Use the `saveEngine` function in utils.hpp to save the serialized model to a file.
```cpp
// Suppose that we use the Logger above
auto logger = std::make_shared<Logger>();
// Use default options
Generator generator{BuildOptions(), SystemOptions(), logger};
std::unique_ptr<nvinfer1::IHostMemory> serialized_engine{generator.getSerializedEngine("path_to_onnx")};
saveEngine(*serialized_engine, "path_to_save_engine");
```
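As a mental model, `saveEngine` only needs to write the serialized bytes to disk. A minimal sketch of such a helper, not necessarily the library's exact implementation in utils.hpp, could be:

```cpp
#include <fstream>
#include <string>

#include <NvInfer.h>

// Sketch only: dump the serialized engine bytes to a binary file.
// nvinfer1::IHostMemory exposes data() and size() for exactly this purpose.
inline bool saveEngine(const nvinfer1::IHostMemory& engine, const std::string& path) {
  std::ofstream file(path, std::ios::binary);
  if (!file) {
    return false;  // could not open the output file
  }
  file.write(static_cast<const char*>(engine.data()),
             static_cast<std::streamsize>(engine.size()));
  return file.good();
}
```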
To run inference:
- Create a `Session` instance using `InferenceOptions`. This class automatically chooses the right backend based on the model's file extension.
- Call `session.doInference` to perform synchronous inference. This function receives a map whose keys are input layer names and whose values are host pointers plus sizes in bytes. The output is also a map of (layer name, buffer in host memory).
```cpp
auto logger = std::make_shared<Logger>();
InferenceOptions options;
options.model_path = "path_to_model";
Session session(options, logger);
// mat is a cv::Mat image; the model has one input layer named "input"
auto outputs = session.doInference({{"input", {mat.data, mat.total() * mat.elemSize()}}});
std::vector<uint8_t>& out_tensor = outputs["output"];
// out_tensor is a raw byte buffer; cast it to the expected output type
```
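For example, if the "output" layer is known to produce 32-bit floats (an assumption about your model), the byte buffer can be reinterpreted like this:

```cpp
// Assumes the "output" layer yields 32-bit floats; adjust to your model.
const auto* data = reinterpret_cast<const float*>(out_tensor.data());
const std::size_t count = out_tensor.size() / sizeof(float);
std::vector<float> scores(data, data + count);  // copy into a typed vector
```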
To add a new backend:
- Provide an implementation of the `IBackend` interface.
- Register the new backend and its file extensions in a .cpp file; see `bool registered` in backend.cpp for an example.
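Registration of this kind typically relies on static initialization. The sketch below is hypothetical: the actual `IBackend` virtuals and registration helper live in this library, so `registerBackend`, its signature, and the extension list are placeholders that only show the shape of the `bool registered` idiom:

```cpp
// Hypothetical sketch; see backend.cpp for the real registration code.
class MyBackend : public IBackend {
  // ... override the IBackend virtual functions here ...
};

namespace {
// Runs at static-initialization time, before main(), which is why a
// throwaway `bool registered` is used to trigger the call.
const bool registered = registerBackend(  // placeholder helper name
    {".mybackend"},                       // file extensions served
    [](const InferenceOptions& options, std::shared_ptr<ILogger> logger) {
      return std::make_unique<MyBackend>(options, logger);
    });
}  // namespace
```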