Config Service

Configuration and Persistence

Michael Haberler, 3/2014

Problem statement

Currently LinuxCNC uses an INI file format for configuration information. This works as long as all components that need configuration information share a file system. This cannot be assumed in a setup where several, possibly heterogeneous, hosts are involved in running an application. For instance, it is unclear what happens when an INI item refers to a filename (e.g. 'HALFILE = foo.hal'). In the distributed scenario a client may not be able to open(2) this file, which is currently assumed; in that case, a config server needs to either provide the file contents instead of the filename, or make the file accessible in some other way.

Related to this is value persistence (for instance, interpreter state), which is saved on exit, and reloaded on startup.

A third aspect is propagation of value changes. So far, after editing an INI file, a restart of the application was required to bring the new settings into effect. This can be improved by notifying config clients of a value change they might be interested in. For instance, this would enable 'live changes' of tunable parameters, including persistence across sessions.

Assuming INI-like files continue to be the primary file format, these goals are partially in conflict:

comment retention: updating a hand-edited INI file through a library without breaking comments is hard, and hardly any library does this reliably, under all conditions and without mutilating the file
file contents: INI-files are simple key/value stores - there is no provision for tagging items with attributes (like 'this is a file')
repeating groups: very few INI parsers have the ability to store, retrieve and update a list of values associated with a key.
flat structure: the classic INI file format is a single-level list of sections, each section with a list of items (key/value pairs). This is a very restrictive layout, making for an unwieldy organisation of the namespace.
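
The comment-retention and repeating-group limitations can be reproduced with Python's standard configparser module, used here only as one representative INI library. This is a minimal sketch; the section names, keys and values are illustrative and not taken from a real configuration.

```python
import configparser
import io

# Illustrative INI text; names and values are assumptions for this sketch.
SAMPLE = """\
[EMC]
# machine name shown in the UI
MACHINE = demo

[HAL]
HALFILE = core.hal
"""

parser = configparser.ConfigParser()
parser.optionxform = str  # keep key case instead of lower-casing keys
parser.read_string(SAMPLE)

# Round-trip the file through the library.
out = io.StringIO()
parser.write(out)
rewritten = out.getvalue()

# The hand-written comment is not part of the parsed data, so it is lost.
comment_survived = "# machine name" in rewritten

# A repeated key (a "repeating group") is rejected by the default strict mode.
DUPLICATE = SAMPLE + "HALFILE = axis.hal\n"
try:
    configparser.ConfigParser().read_string(DUPLICATE)
    duplicate_accepted = True
except configparser.DuplicateOptionError:
    duplicate_accepted = False
```

After running this, comment_survived and duplicate_accepted are both False: the rewritten file keeps the values but not the comment, and the second HALFILE line cannot be stored alongside the first.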

However, the LinuxCNC community is very used to INI-like formats, and deviating radically here will meet resistance, require a learning curve, and support effort.

Requirements for a configuration & persistence vehicle

It MUST be possible to autodiscover the config service remotely, based on a unique ID.
It MUST be possible to retrieve all information - k/v
It SHOULD be possible to use an INI-like file format for external storage.
The namespace SHOULD be organized in a tree structure of arbitrary depth (section names with an arbitrary number of subsections)
Any section MUST have a unique string as a key (for example '/section1/subsection2')
It SHOULD be possible to create and update items on the fly, at least in some designated sections
Changes to items MUST persist across invocations, at least in some designated sections
It SHOULD be possible to 'watch' an item for change, and receive notification on change
Client-side APIs:
- C, Python MUST be available
- JSON/Websockets or JSON/HTTP SHOULD be available
It SHOULD be possible to view and edit config information over a Web UI
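
To make the namespace and 'watch' requirements concrete, here is a minimal in-memory sketch. The class name ConfigTree, its methods and the example paths are hypothetical; it has no networking, discovery or persistence and only illustrates path-style keys, on-the-fly updates and change notification.

```python
from typing import Callable, Dict, List, Optional

# A watcher is called with (path, new_value) when a watched item changes.
Watcher = Callable[[str, str], None]


class ConfigTree:
    """Hypothetical store keyed by paths such as '/section1/subsection2/item'."""

    def __init__(self) -> None:
        self._items: Dict[str, str] = {}
        self._watchers: Dict[str, List[Watcher]] = {}

    def get(self, path: str, default: Optional[str] = None) -> Optional[str]:
        return self._items.get(path, default)

    def set(self, path: str, value: str) -> None:
        # Create or update an item; notify watchers only on a real change.
        changed = self._items.get(path) != value
        self._items[path] = value
        if changed:
            for callback in self._watchers.get(path, []):
                callback(path, value)

    def section(self, prefix: str) -> Dict[str, str]:
        # Return every item at any depth below the given section path.
        prefix = prefix.rstrip("/") + "/"
        return {p: v for p, v in self._items.items() if p.startswith(prefix)}

    def watch(self, path: str, callback: Watcher) -> None:
        self._watchers.setdefault(path, []).append(callback)


tree = ConfigTree()
seen = []
tree.watch("/axis/0/max_velocity", lambda path, value: seen.append((path, value)))
tree.set("/axis/0/max_velocity", "10")
tree.set("/axis/0/max_velocity", "10")  # unchanged value, no notification
tree.set("/axis/0/max_velocity", "12")
tree.set("/axis/1/max_velocity", "8")   # not watched
```

Using full paths as keys keeps lookup trivial and makes a section query a simple prefix match; a real service would need to decide whether a nested-dictionary representation serves it better.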

Flows

program config values: programs at startup typically query configuration values, possibly merging overrides from the command line. Simple key/value access, section known through program name or command line option.
tunable parameters: some programs might query a value as above, but also declare interest in updates to this value, for instance through a callback. This requires the config store to notify any subscribers of a value change, and library support to enable the client to act accordingly.
persistence: some programs require values to be retained across invocation; this means storing all key/value pairs and reloading on program start.
discovery and selection: some programs might need to discover 'what is there' by querying the config store repeatedly. For instance, a generic UI program like gladevcp might provide an option to show a list of configurations, and select one to execute. This would suggest querying a designated section at startup, and subsequent sections thereafter (post user selection).
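
The 'program config values' and 'persistence' flows can be sketched in a few lines. This is an assumption-laden illustration: the helper names, the JSON file format and the keys are hypothetical, and a local file stands in for whatever store the config service would actually use.

```python
import json
import os
import tempfile
from collections import ChainMap


def effective_config(stored: dict, overrides: dict) -> ChainMap:
    # Command-line overrides take precedence over stored values.
    return ChainMap(overrides, stored)


def save_state(path: str, state: dict) -> None:
    # Write to a temporary file first, then rename, so that a crash
    # during the write does not leave a truncated state file behind.
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)


def load_state(path: str) -> dict:
    # A missing file simply means there is no saved state yet.
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return {}


stored = {"max_velocity": "10", "units": "mm"}
config = effective_config(stored, {"max_velocity": "5"})

with tempfile.TemporaryDirectory() as workdir:
    state_file = os.path.join(workdir, "interp_state.json")
    first_start = load_state(state_file)         # nothing saved yet
    save_state(state_file, {"tool_number": 3})   # saved on exit
    second_start = load_state(state_file)        # reloaded on startup
```

Here config["max_velocity"] resolves to the override while config["units"] falls through to the stored value, and second_start contains what was saved on the previous exit.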

Existing software: candidates

There are a number of candidate packages and libraries which match the requirements more or less:

Apache ZooKeeper’s data model matches the above requirements best, and many client bindings exist. Unfortunately it is a Java application, and goes way beyond what is required here (replication, atomic updates etc). Stable. Has change notification support.
etcd comes close in data model and features as well. Format is JSON only, the API is HTTP, so an HTTP client library is needed. etcd is written in Go, which is a substantial external dependency. Rather young software. Has change notification support.
Redis is a networked key/value store (aka 'noSQL database'). A tree-structured data model needs to be layered on top of Redis, suggesting direct use of the Redis API may not be a good idea. Stable, established user base and community. Has low-level change notification support which can be built upon. Data model could be implemented by using the embedded Lua support in Redis.
webdis is an HTTP/JSON layer over Redis. Written in C. While webdis takes care of the JSON/HTTP access path, a C API would again need an HTTP layer, or a zeroMQ/protobuf layer would need to be added to webdis. Only about 2500 lines of C, so likely possible. Does expose the raw Redis API over URIs. Data model could be implemented by using the embedded Lua support in Redis.
The Pub-Sub-Clone pattern in the zeroMQ Guide example. This is very close in data model, interfaces and functionality. It has no provisions for HTTP/JSON but zeromq/JSON translated to websockets should be easy to handle given the existing infrastructure. It is not a full-blown server, just a manual example right now.
TOML is basically an INI file parser on steroids, with lots of bindings. Not a networked service, so no change notification support. Good fit on data model. Seems quite in flux, sketchy documentation.

API consequences of using existing packages

LinuxCNC classic uses an inifile access library.

In the distributed case, this could be mapped onto a zeroMQ/protobuf vehicle, which is fine for C and Python clients; for JSON/websockets config access the existing JSON/ws <-> zeromq/protobuf proxy could be used.

When using ZooKeeper, etcd, Redis or webdis, a C and Python client library must be either chosen or written, and the protocol would be whatever the package provides, e.g. HTTP, JSON.

Architecture notes

The way I am thinking about this can be depicted like so:

The idea is to provide a selective replication of state (you might think of this as sections with key-value pairs in an ini file). A using party would subscribe to 'its' section, and receive the pertinent set of key-value pairs - on whatever host that party happens to reside.

Parties may not only be read-only consumers, but also updaters of state: if a party chooses to modify a value, then that change is made available immediately through a zeroMQ update.

This state replication engine is also the natural place to provide persistent storage, for instance for interpreter parameters or machine positions which are expected to survive a shutdown/restart sequence. One option is to hook Redis in here; note that it does not matter on which host this replication service and Redis run.

This scheme would cover all of configuration, live tuning of parameters, and persistent storage through a single API, and get rid of the need for replicated config files, as well as current ad-hoc mechanisms for persistent state storage.
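
The selective replication idea can be sketched without any networking. The following is a simplified in-process model loosely following the snapshot-plus-updates approach of the Clone pattern; it does not use zeroMQ, and the class names, method names and keys are hypothetical. Direct method calls stand in for the snapshot request and the update stream.

```python
from typing import Dict, List, Tuple

Update = Tuple[int, str, str]  # (sequence number, key, value)


class StateServer:
    """Holds the authoritative state and numbers every change."""

    def __init__(self) -> None:
        self.sequence = 0
        self.state: Dict[str, str] = {}
        self.subscribers: List["StateClient"] = []

    def snapshot(self, section: str) -> Tuple[int, Dict[str, str]]:
        subset = {k: v for k, v in self.state.items() if k.startswith(section)}
        return self.sequence, subset

    def update(self, key: str, value: str) -> None:
        # Any party may update; the change is pushed to all subscribers.
        self.sequence += 1
        self.state[key] = value
        for client in self.subscribers:
            client.on_update((self.sequence, key, value))


class StateClient:
    """Replicates only the keys below 'its' section."""

    def __init__(self, server: StateServer, section: str) -> None:
        self.server = server
        self.section = section
        self.sequence = 0
        self.state: Dict[str, str] = {}
        self.synced = False
        self.pending: List[Update] = []
        server.subscribers.append(self)  # subscribe before taking the snapshot

    def sync(self) -> None:
        self.sequence, self.state = self.server.snapshot(self.section)
        self.synced = True
        for update in self.pending:  # replay updates queued before the snapshot
            self._apply(update)
        self.pending.clear()

    def on_update(self, update: Update) -> None:
        if self.synced:
            self._apply(update)
        else:
            self.pending.append(update)

    def _apply(self, update: Update) -> None:
        sequence, key, value = update
        if sequence <= self.sequence:
            return  # already contained in the snapshot
        self.sequence = sequence
        if key.startswith(self.section):
            self.state[key] = value


server = StateServer()
server.update("/axis/0/gain", "1.0")
gui = StateClient(server, "/axis/")
server.update("/axis/0/gain", "1.5")   # arrives before the snapshot, queued
gui.sync()                             # the snapshot already contains 1.5
server.update("/spindle/speed", "100") # outside the client's section
server.update("/axis/0/gain", "2.0")
```

Subscribing before requesting the snapshot, and discarding queued updates that the snapshot already covers, is what keeps a late-joining client from missing or double-applying a change.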
