
Data flow descriptor

J-Loudet edited this page Jan 23, 2023 · 4 revisions

Table of contents

  1. (optional) Vars
  2. (optional) Configuration
  3. Sources
  4. Operators
  5. Sinks
  6. Links
  7. (optional) Mapping
    1. No mapping: single, randomly-selected, daemon
    2. Name-based mapping

The data flow descriptor tells Zenoh-Flow how to deploy the application. It provides the following information:

  • which nodes are involved and where to find their descriptors,
  • how the nodes are connected,
  • where the nodes should run.

Below is a simple data flow descriptor that we will use to explain the different sections that compose it. Optional sections are indicated as such.

id: my-first-flow

# (optional)
vars:
  BASE_PATH: /home/zenoh/my-first-flow/nodes
  
# (optional)
configuration:
  default_timeout: 5
  
sources:
- id: foo
  descriptor: "file://{{ BASE_PATH }}/foo.yml"
    
operators:
- id: bar
  descriptor: "file://{{ BASE_PATH }}/bar.yml"
    
sinks:
- id: baz
  descriptor: "file://{{ BASE_PATH }}/baz.yml"
    
links:
- from:
    id: foo
    output: out
  to:
    id: bar
    input: in
- from:
    id: bar
    output: out
  to:
    id: baz
    input: in
    
# (optional)
mapping:
  foo: Abondance
  bar: Brie
  baz: Camembert

Let us now explain what each section does.

(optional) Vars

This section is used to tell Zenoh-Flow how to do string replacements in this descriptor (and only this one). More details can be found here.
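As an illustration based on the descriptor above, and assuming the replacement is a plain textual substitution of `{{ BASE_PATH }}` by its value, the declaration of foo would resolve roughly as follows:

```yaml
vars:
  BASE_PATH: /home/zenoh/my-first-flow/nodes

sources:
- id: foo
  # As written in the data flow descriptor:
  #   descriptor: "file://{{ BASE_PATH }}/foo.yml"
  # After the replacement (assuming plain textual substitution):
  descriptor: "file:///home/zenoh/my-first-flow/nodes/foo.yml"
```

Changing BASE_PATH in one place is then enough to relocate all three node descriptors of the example.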

(optional) Configuration

This section allows passing a dictionary of key-value pairs to all the nodes involved. This can be useful, for instance, to run the same node several times with slightly different parameters, or to modify the behaviour of a node without having to recompile it.

An in-depth explanation can be found here.

Sources

This section groups the declaration of all the sources used. Each declaration must specify:

  • a unique id (that can be different from the one used in the descriptor),
  • a descriptor field holding a uri that indicates where to find the descriptor of the node,
  • (optional) a configuration section that will only apply to this source and potentially override the configuration present in its descriptor.

The ids used in this section will override those present in the descriptors. They are also the ids expected in the links section.
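A declaration carrying the optional per-source configuration could look like the sketch below. The key `sampling_period_ms` and its value are hypothetical and only serve as an illustration; the keys that are actually meaningful depend on what the node reads from its configuration.

```yaml
sources:
- id: foo
  descriptor: "file://{{ BASE_PATH }}/foo.yml"
  # (optional) Applies to this source only. If foo.yml defines the same key,
  # the value given here is the one expected to take precedence.
  configuration:
    sampling_period_ms: 100   # hypothetical key, for illustration only
```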

Operators

The same rules apply as for the Sources (and the Sinks).

Sinks

The same rules apply as for the Sources (and the Operators).

Links

The links section in a data flow descriptor describes how the different nodes are connected. A link goes from the output of a node to the input of another one.

Hence, each link is composed of two subsections:

  • a from subsection that contains:
    • the id of the node which is sending data,
    • the output,
  • a to subsection that contains:
    • the id of the node which is receiving data,
    • the input.

⚠️ The id of a node is found in the data flow descriptor while the input and output are found in the descriptor of the node.

The following description connects the output out of foo to the input in of bar:

- from:
    id: foo
    output: out
  to:
    id: bar
    input: in

Given that there are possibly multiple links in a data flow, Zenoh-Flow expects a list, and thus each link must be preceded by a dash:

links:

# Link from foo to bar
- from:
    id: foo
    output: out
  to:
    id: bar
    input: in

# Link from bar to baz
- from:
    id: bar
    output: out
  to:
    id: baz
    input: in

💡 Zenoh-Flow will check the validity of links before instantiating a flow. In particular, it will ensure that:

  1. all links go from an output to an input,
  2. the output and input of the same link have the same type (or at least one of these types has the special value _any_),
  3. all ports are connected, i.e. no node has an input or an output that is not connected.

There are no additional constraints on the links: loops are accepted, the same output can go to multiple inputs, several outputs can go to the same input, etc.
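For instance, sending the same output to multiple inputs amounts to repeating the from subsection in several links. The sketch below assumes a second sink with the hypothetical id qux, declared in the sinks section, whose descriptor exposes an input named in:

```yaml
links:

# The output `out` of bar goes to baz...
- from:
    id: bar
    output: out
  to:
    id: baz
    input: in

# ...and the same output also goes to qux (hypothetical second sink).
- from:
    id: bar
    output: out
  to:
    id: qux
    input: in
```

The checks listed above still apply to every link: each of these two inputs must have a type compatible with the output of bar.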

(optional) Mapping

Zenoh-Flow uses this section to control on which daemon each node of a data flow runs.

No mapping: single, randomly-selected, daemon

If the mapping section is absent, Zenoh-Flow will default to running all the nodes on one daemon. This daemon is randomly selected if several are available.

Name-based mapping

At this stage of the development of Zenoh-Flow, to deploy a data flow, each daemon must have access to the shared libraries or scripts implementing the nodes. In other words, the uri field in the descriptor file of a node must point to an accessible location on the file system where the daemon is running.

The same holds for the descriptor fields pointing to the descriptors.

Hence, if a data flow should be deployed on several daemons, the shared libraries or scripts must be uploaded to the devices where the daemons are running and the different paths updated accordingly: (i) the path of the descriptors and, in each descriptor, (ii) the path of the implementation of the node. Note that a daemon only needs access to the nodes it runs.

Assuming that the paths are correct and the implementations are present on the devices, one needs to write a mapping to tell Zenoh-Flow where each node should run:

mapping:
  my-source: foo
  my-operator: bar
  my-sink: baz

The above mapping indicates that:

  • the node whose id is equal to my-source should run on the daemon whose name is equal to foo,
  • the node whose id is equal to my-operator should run on the daemon whose name is equal to bar,
  • the node whose id is equal to my-sink should run on the daemon whose name is equal to baz.

The ids are given to the nodes in the data flow descriptor. The names are given to the daemons in their respective configuration files.

⚠️ All nodes that are absent from the mapping will run, by default, on the randomly selected daemon!
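A partial mapping, reusing the node ids and daemon names of the example above, could thus be sketched as:

```yaml
mapping:
  # Only my-source is pinned, to the daemon named foo.
  my-source: foo
  # my-operator and my-sink are not listed: they fall back to the default
  # behaviour and run on the randomly selected daemon.
```

Because the fallback daemon is picked at random, listing every node explicitly is the safer choice whenever the implementations are only available on specific devices.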