Ákos Milánkovich edited this page Jan 13, 2020 · 9 revisions

CPSwarm Communication library

Introduction

The Communication Library provides a unified interface that tools and swarm members can use to interact with each other. The library is responsible for ensuring that all communication happens with the desired reliability, security level and latency. After evaluating the requirements established by our core use cases and the design goals of a swarm in general, we concluded that interactions reduce to a well-defined set of primitives and actions:

  • Swarm members need to be discoverable on the network
  • Events and commands need to be sent and received
  • Parameters need to be remotely adjustable
  • Telemetry needs to be sent back to operators and other subscribers

Our aim is to provide a stable API for all tools that abstracts away the physical layer and the authentication scheme used. To do this, a pluggable architecture was designed for the Communication Library, which separates the logical layer responsible for implementing these primitives from the endpoint implementation capable of sending individual messages over the network. This extensible infrastructure makes it possible to add support for new low-level protocols, physical layers and security schemes without affecting the rest of the system. As a first step, the Zyre protocol was integrated with the library; a secure endpoint will be added as the project progresses.

Key concepts

High level C++ API

  • Abstracts away the physical and transport layers
  • Responsible for reliable delivery and fault detection
  • Exposes functionality through services

Protobuf based serialization

  • The API works with Protobuf objects directly
  • Objects can be reserialized at any point
  • Complex data types can be defined (area, route, etc.)

Cross-platform

  • Uses only C++ standard library primitives and other cross-platform libraries
  • Compatible with ROS

Components

Library

The Library contains the source code required to integrate the Communication Library into your project.

Endpoints

  • Used to abstract away the transport and physical layer
  • Endpoint implementations based on BasicEndpoint only need to implement:
    • Starting and stopping the endpoint
    • Sending binary messages
    • Receiving binary messages
    • Tracking the presence of nodes
  • While we are targeting IP networks (including mesh networks), the library does not depend on any particular medium
  • A Zyre-based implementation is available now
  • A secure version based on libsodium is in progress
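
The four responsibilities listed above can be pictured as a small abstract interface. The following is only a minimal sketch: `EndpointSketch`, `LoopbackEndpoint` and all member names are hypothetical and are not the library's actual `BasicEndpoint` API. The loopback transport exists purely to show which parts an endpoint has to supply.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical names throughout: this is NOT the library's BasicEndpoint API.
using Bytes = std::vector<std::uint8_t>;

// The four endpoint responsibilities expressed as an abstract interface.
class EndpointSketch {
public:
    using MessageHandler =
        std::function<void(const std::string& from, const Bytes& data)>;

    virtual ~EndpointSketch() = default;

    virtual void Start() = 0;  // starting the endpoint
    virtual void Stop() = 0;   // stopping the endpoint
    virtual bool Send(const std::string& to, const Bytes& data) = 0;  // sending

    // Receiving binary messages: the logical layer registers a callback.
    void OnMessage(MessageHandler handler) { handler_ = std::move(handler); }

    // Tracking the presence of nodes.
    bool IsOnline(const std::string& node) const {
        return online_.count(node) != 0;
    }

protected:
    void Deliver(const std::string& from, const Bytes& data) {
        if (handler_) handler_(from, data);
    }
    void NodeJoined(const std::string& node) { online_.insert(node); }
    void NodeLeft(const std::string& node) { online_.erase(node); }

private:
    MessageHandler handler_;
    std::set<std::string> online_;
};

// Toy transport for illustration: messages addressed to our own ID loop back.
class LoopbackEndpoint : public EndpointSketch {
public:
    explicit LoopbackEndpoint(std::string id) : id_(std::move(id)) {
        assert(!id_.empty());  // a node needs a non-empty identifier
    }

    void Start() override { running_ = true; NodeJoined(id_); }
    void Stop() override { running_ = false; NodeLeft(id_); }

    bool Send(const std::string& to, const Bytes& data) override {
        if (!running_ || !IsOnline(to)) return false;  // stopped or unknown target
        Deliver(id_, data);
        return true;
    }

private:
    std::string id_;
    bool running_ = false;
};
```

In this sketch the logical layer never sees the transport: it only registers a callback, calls `Send`, and queries presence, which is the separation the pluggable architecture aims for.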

Services

  • A combination of well-defined functionality and related data types
  • Asynchronous, thread-safe interfaces
  • Currently available services:
    • Discovery: The Discovery Service is responsible for detecting the supported features of participating swarm members. To make the Monitoring and Configuration Tool a universal tool for managing compatible swarms, regardless of specific behavior or target hardware, the Communication Library provides a way to obtain a description of the events supported by each member and of the different telemetry and parameter values and their underlying data types. The Discovery Service works on two layers: the lower layer, provided by the specific endpoint implementation, is purely responsible for detecting the presence of swarm members and tracking their online and offline states, while the higher layer can request and answer, as well as cache and invalidate, information about the supported facilities.
    • Event (commands and global events): Swarm members send and receive events as their behavior executes, informing other members of important events and reacting to external and internal stimuli in order to change or modify the current state of execution. An event, on its own, has only a name and a list of parameters; it is only how the behavior reacts that makes the event meaningful. As such, events can represent commands issued by the operator, real events happening on a local or remote node, or other simple messages that aid coordination. The Monitoring and Configuration Tool can use the Event Service to send arbitrary events to swarm members (in order to issue commands) and can monitor events as they happen on swarm members.
    • Key-Value (parameters and other mostly static data): Parameters such as the operational area or the location of known obstacles are subject to change during deployment and therefore need a way to be set during the mission. The Key-Value Service provides a way to write (and read) complex named values on swarm members, values the behavior can use to perform calculations and make decisions. The Monitoring and Configuration Tool uses the Key-Value Service to retrieve and set the parameters that govern swarm member behavior.
    • Telemetry (streaming data to subscribers): For the operator to receive meaningful information about the state of each swarm member, the swarm members being monitored need to send a continuous stream of information to the Monitoring and Configuration Tool, and eventually to the operator. The Telemetry Service can be used to subscribe to such information on demand, specifying the required resolution and scope of the information. All data sent back is strongly typed and can have a complex schema. Each telemetry value (however complex) is treated as an atomic value relevant to a single point in time. The Monitoring and Configuration Tool uses the Telemetry Service to display and visualize the key elements describing the state of individual swarm members.
    • Ping (measuring latency)
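
To illustrate the point that an event is only a name plus parameters, and that meaning comes from how the behavior reacts, here is a minimal sketch of an event dispatcher. `EventSketch` and `EventDispatcherSketch` are hypothetical names. The real library works with Protobuf objects and asynchronous, thread-safe interfaces, neither of which this single-threaded sketch attempts to reproduce. Modelling the parameters as a string map is also an assumption made for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical types: the real library uses Protobuf messages instead.
struct EventSketch {
    std::string name;                               // e.g. a command name
    std::map<std::string, std::string> parameters;  // named parameters
};

class EventDispatcherSketch {
public:
    using Handler =
        std::function<void(const std::string& sender, const EventSketch& event)>;

    // The behavior decides which event names it reacts to.
    void On(const std::string& name, Handler handler) {
        assert(handler != nullptr);  // registering an empty handler is a bug
        handlers_[name].push_back(std::move(handler));
    }

    // Returns how many handlers reacted; unknown events are simply ignored.
    std::size_t Dispatch(const std::string& sender,
                         const EventSketch& event) const {
        const auto it = handlers_.find(event.name);
        if (it == handlers_.end()) return 0;
        for (const auto& handler : it->second) handler(sender, event);
        return it->second.size();
    }

private:
    std::map<std::string, std::vector<Handler>> handlers_;
};
```

An event with no registered handler has no effect here, which mirrors the idea that the same message can be a command, a notification, or noise depending on the receiving behavior.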

Usage: see Simulator

ROS bridge

Since our primary targets are ROS-based devices, support for ROS native facilities needed to go considerably deeper. Communication within ROS uses a ROS-specific messaging format and protocol, and support for it is not available on non-ROS systems. A bridge node was therefore developed that translates between a ROS-based system and the rest of the world:

  • Publishing any ROS topic as telemetry
  • Forwarding events to and from the behavior
  • Setting parameters on the ROS Parameter Server

Using the communication node, applications developed or behavior generated for ROS based devices can use native ROS facilities and need not care about the presence of the library. The bridge is just another application using the library – it receives no special treatment.

  • Using standard ROS facilities to communicate
    • Bridge the Key-Value Service to ROS Parameter Server
    • Transfer events and telemetry through ROS publish-subscribe
  • Bind to ROS resources as defined in a configuration file
    • Should be part of the deployment package
    • Reloadable without interruption
  • Should be one of the first things to install on a node during provisioning
    • Cryptographic proof of swarm membership will need to be established
    • Network interfaces need to be configured

An example configuration can be seen here.

Simulator

The Swarmio-Simulator is an example of using the Library with ZyreEndpoint capable of discovering and sending a predefined telemetry message simulating a linear-path movement.

Tool

The Swarmio-Tool is a management component with the following available commands:

  • members: lists the members of the swarm
  • rediscover: sends a discover message to trigger the rediscovery process
  • info
  • select [MID]: selects a member by ID for further commands
  • event NAME [KEY=VALUE]: sends an event with an optional key-value pair
  • get KEY: requests the value of a key
  • set KEY=VALUE: sets the value of a key to VALUE
  • subscriptions: lists the current subscriptions
  • subscribe [key=KEY] [interval=N]: subscribes the tool to a specific key to receive updates on that topic
  • unsubscribe [SID]: cancels a previous subscription by subscription ID
  • ping [SIZE]: sends a ping message of the given SIZE
  • help: displays this list
  • log: lists the log entries of communication
  • exit: stops the program
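
Several of the commands above take `KEY=VALUE` arguments. As a rough illustration of how such a line could be split into a verb, an optional positional argument and key-value pairs, consider the sketch below. `ParsedCommand` and `ParseCommandLine` are hypothetical and are not taken from the Swarmio-Tool sources; the real tool may parse its input differently, for example with respect to quoting or values that contain spaces.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Hypothetical structure, not part of the Swarmio-Tool.
struct ParsedCommand {
    std::string verb;                          // e.g. "event", "set", "subscribe"
    std::string target;                        // first positional argument, if any
    std::map<std::string, std::string> pairs;  // KEY=VALUE arguments
};

// Splits on whitespace; tokens containing '=' become key-value pairs.
// Assumption: values contain no spaces and no quoting is supported.
ParsedCommand ParseCommandLine(const std::string& line) {
    ParsedCommand cmd;
    std::istringstream in(line);
    in >> cmd.verb;

    std::string token;
    while (in >> token) {
        const auto eq = token.find('=');
        if (eq == std::string::npos) {
            if (cmd.target.empty()) cmd.target = token;  // keep first positional
        } else {
            cmd.pairs[token.substr(0, eq)] = token.substr(eq + 1);
        }
    }
    return cmd;
}
```

With this sketch, a line such as `event takeoff altitude=10` yields the verb `event`, the target `takeoff` and one pair, where `takeoff` and `altitude` are made-up names used only for illustration.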

Security

The security functionality is provided by libsodium (based on NaCl). Libsodium is a popular cryptographic library used by, for example, WordPress, Discord, Secrets and Remembear. All cryptographic functions are based on:

  • Signatures: Edwards-curve Digital Signature Algorithm (EdDSA)
  • Encryption: XSalsa20 stream cipher
  • Authentication: Poly1305 MAC

The following security dimensions are addressed:

  • The deployment tool is able to securely provision new node members (by generating their keys and signing their certificates)
  • Access control is provided for provisioned nodes by certificate checking, using a pre-shared signing key
  • Authentication is provided by signature checking
  • Non-repudiation is provided by signature and timestamp checking for each packet
  • Confidentiality is provided end-to-end by payload encryption
  • Integrity checking is provided by using a tag for packet integrity
  • Availability is maintained using each node's security table, which stores valid authentication credentials.

All the security features are activated by adding a config.json file next to the executable with the following contents (example):

{
	"privateKey": "edKFMPh+t2gmqaFzCuvzYXqZj+5PVSFDnNbdtBagAp8B2ThBor8DvfettHVafxnDZgEgaymv43qmfhl+UBPIpQ==",
	"publicKey": "Hh8yd4dIUbeNM90xEhtdmjP12hcQPGepfcIeyEq8zfE=",
	"signature": "kgJnRa/7/KrErXZ1lobmV/XVcacE94A1+KwWfxF4zjbroafwrU0PYkEZEC0C5afT6XJDNS/q3TjZun3F6gxXBA==",
	"ca": "utGq0AIAV+c3JmwOsS5/h2T6mp331GD8WQhMzcPyzGs="
}

When the security features are activated, nodes can only communicate securely; security cannot be applied selectively to particular message types.
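
Before deploying a config.json, it can help to check that each value has a plausible size. The Base64 strings in the example above decode to 64, 32, 64 and 32 bytes, which matches libsodium's Ed25519 sizes for a secret key (`crypto_sign_SECRETKEYBYTES`), a public key (`crypto_sign_PUBLICKEYBYTES`) and a signature (`crypto_sign_BYTES`). The sketch below only checks decoded lengths. It performs no cryptography, does not parse JSON, and assumes the library expects those sizes; `SecurityConfigSketch` and both functions are hypothetical.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Decoded size of a padded, standard-alphabet Base64 string; 0 if malformed.
std::size_t Base64DecodedSize(const std::string& text) {
    if (text.empty() || text.size() % 4 != 0) return 0;

    // At most two '=' padding characters are allowed at the end.
    std::size_t padding = 0;
    while (padding < 2 && text[text.size() - 1 - padding] == '=') ++padding;

    // Everything before the padding must come from the Base64 alphabet.
    for (std::size_t i = 0; i + padding < text.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(text[i]);
        if (!std::isalnum(c) && c != '+' && c != '/') return 0;
    }
    return text.size() / 4 * 3 - padding;
}

// Hypothetical mirror of the four config.json fields shown above.
struct SecurityConfigSketch {
    std::string privateKey;
    std::string publicKey;
    std::string signature;
    std::string ca;
};

// Assumed sizes: 64-byte secret key, 32-byte public keys, 64-byte signature.
bool HasPlausibleKeySizes(const SecurityConfigSketch& config) {
    return Base64DecodedSize(config.privateKey) == 64 &&
           Base64DecodedSize(config.publicKey) == 32 &&
           Base64DecodedSize(config.signature) == 64 &&
           Base64DecodedSize(config.ca) == 32;
}
```

A check like this catches truncated or mis-pasted values early, but it says nothing about whether the signature actually verifies against the `ca` key.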
