SSDF-serve

This is a serving library for modern C++ that provides a simple interface for working with neural network models.

Prerequisites

  • Nvidia TensorRT
  • fmt (until C++20)
  • C++17

Create your custom logger

Some inference backends (e.g. TensorRT) require a custom logger. To provide a robust logging system, this library requires the user to supply their own definition of the flushLog function from the ILogger interface.

void Logger::flushLog(Level level, std::string_view message) const {
  switch (level) {
    case Level::kINTERNAL_ERROR:
      ROS_FATAL_STREAM_NAMED(name_, message);
      break;
    ...
    case Level::kVERBOSE:
      ROS_DEBUG_STREAM_NAMED(name_, message);
      break;
  }
}
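
A minimal sketch of how such a logger might be declared, assuming ILogger exposes Level and a virtual flushLog(Level, std::string_view) const; the constructor and the name_ member are illustrative only and exist to feed the ROS *_NAMED macros used in the definition above.

// Sketch only: assumes ILogger declares Level and a virtual
// flushLog(Level, std::string_view) const that subclasses override.
class Logger : public ILogger {
 public:
  explicit Logger(std::string name = "ssdf_serve") : name_(std::move(name)) {}

  // Defined as in the snippet above, mapping levels to ROS logging macros.
  void flushLog(Level level, std::string_view message) const override;

 private:
  std::string name_;  // logger name used by the ROS *_NAMED macros
};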

Usage

Implemented backends

  • TensorRT: .engine, .trt

Convert ONNX to TensorRT engine

  1. Populate BuildOptions and SystemOptions in option.hpp.
  2. Create a Generator instance with those options, then use generator.getSerializedEngine to build and serialize the network. Right now it can only take an ONNX model path; building networks directly from TensorRT's layers is under development. This function returns a raw pointer, so remember to delete it after saving the model (or wrap it in a smart pointer).
  3. Use the saveEngine function in utils.hpp to save the serialized model to a file.
// Suppose that we use the Logger above
auto logger = std::make_shared<Logger>();

// Use default options
Generator generator{BuildOptions(), SystemOptions(), logger};
std::unique_ptr<nvinfer1::IHostMemory> serialized_engine{generator.getSerializedEngine("path_to_onnx")};
saveEngine(*serialized_engine, "path_to_save_engine");

Perform inference

  1. Create a Session instance using InferenceOptions. The session automatically chooses the right backend based on the model file's extension.
  2. Call session.doInference to perform synchronous inference. This function receives a map whose keys are input layer names and whose values are host pointers plus sizes in bytes. The output is also a map of (layer name, host buffer).
auto logger = std::make_shared<Logger>();
InferenceOptions options;
options.model_path = "path_to_model";
Session session(options, logger);

// mat is a cv::Mat image; the model has one input layer named "input"
auto outputs = session.doInference({{"input", {mat.data, mat.total() * mat.elemSize()}}});

std::vector<uint8_t> &out_tensor = outputs["output"];
// out_tensor is a raw byte buffer; cast it to the expected output type
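
For example, assuming the "output" layer produces FP32 values (an assumption; check your model's actual output data type), the byte buffer can be reinterpreted as floats:

// Assumption: the output layer is FP32; adjust the element type to your model.
const float *values = reinterpret_cast<const float *>(out_tensor.data());
std::size_t num_values = out_tensor.size() / sizeof(float);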

Add custom backend

  1. Provide an implementation of the IBackend interface.
  2. Register the new backend and its file extensions in a .cpp file; see the bool registered variable in backend.cpp for an example (a sketch of this pattern follows below).
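
A minimal sketch of the registration idea, under stated assumptions: MyBackend, its constructor, and the registerBackend helper below are hypothetical placeholders, not this library's actual API; the real IBackend methods and registration entry point are whatever backend.cpp already uses, so treat that file as the authoritative example.

// Hypothetical sketch only: registerBackend and MyBackend's constructor are
// placeholders for the library's actual registration mechanism (see backend.cpp).
class MyBackend : public IBackend {
  // ... override the virtual functions declared by IBackend ...
};

// File-scope registration mirroring the `bool registered` idiom in backend.cpp:
// the initializer runs during static initialization and maps file extensions
// to a factory that creates the backend.
static bool registered = registerBackend(
    {".mybackend"},
    [](const InferenceOptions &options, std::shared_ptr<ILogger> logger) {
      return std::make_unique<MyBackend>(options, logger);
    });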