This crate is experimental and is being developed as a personal learning exercise for getting acquainted with Rust and parsing in general. There are likely more performant and stable libraries out there for parsing CDDL. This crate should not be used in production in any form or fashion.
A Rust implementation of the Concise Data Definition Language (CDDL). CDDL is an IETF standard that "proposes a notational convention to express CBOR and JSON data structures." As of 2019-06-12, it is published as RFC 8610 (Proposed Standard) at https://tools.ietf.org/html/rfc8610.
This crate includes a handwritten parser and lexer for CDDL, and its development has been heavily inspired by the techniques outlined in Thorsten Ball's book "Writing An Interpreter In Go". The AST has been built to closely match the rules defined by the ABNF grammar in Appendix B of the spec. All CDDL must use UTF-8 for its encoding per the spec.
This crate supports partial validation of both CBOR and JSON data structures. An extremely basic REPL is included as well. This crate's minimum supported Rust version (MSRV) is 1.37.0.
Also bundled into this repository is a basic language server implementation and extension for Visual Studio Code for editing CDDL. The implementation is backed by the compiled WebAssembly target included in this crate.
- Parse CDDL documents into an AST
- Verify conformance of CDDL documents against RFC 8610
- Validate CBOR data structures
- Validate JSON documents
- Basic REPL
- Generate dummy JSON from conformant CDDL
- As close to zero-copy as possible
- Compile WebAssembly target for browser and Node.js
- no_std support (lexing and parsing only)
- Language server implementation and Visual Studio Code Extension
- Performance (if this crate gains enough traction, it may be prudent to conduct more formal profiling and/or explore using a parser-combinator framework like nom)
- Support CBOR diagnostic notation
- I-JSON compatibility
Rust is a systems programming language designed around safety and is ideally suited for resource-constrained systems. CDDL and CBOR are designed around small code and message sizes and constrained nodes, scenarios for which Rust has also been designed.
A CLI has been made available for various platforms and as a Docker image. It can be downloaded from the Releases tab. The tool supports parsing of .cddl files for verifying conformance against RFC 8610. It also supports validation of .json files against .cddl documents. Detailed information about the JSON validation functions can be found in the validating JSON section below. Instructions for using the tool can be viewed by executing the help subcommand:
$ cddl help
If using Docker:
Replace <version> with an appropriate release tag. The --volume argument is required for mounting .cddl and .json documents into the container when executing the command. The command below assumes these documents are in your current working directory.
$ docker run -it --rm -v $PWD:/cddl -w /cddl docker.pkg.github.com/anweiss/cddl/cddl:<version> help
You can also find a simple RFC 8610 conformance tool at https://cddl.anweiss.tech. This same codebase has been compiled for use in the browser via WebAssembly.
An extension for editing CDDL documents with Visual Studio Code has been published to the Marketplace. More information is available in the extension's README.
- maps
- structs
- tables
- cuts
- groups
- arrays
- values
- choices
- ranges
- enumeration (building a choice from a group)
- root type
- occurrence
- predefined types
- tags
- unwrapping
- controls
- socket/plug
- generics
- operator precedence
- comments
- numerical int/uint values
- numerical hexfloat values
- numerical values with exponents
- unprefixed byte strings
- prefixed byte strings
Incomplete. Under development
This crate uses the Serde framework, and more specifically, the serde_json crate, for parsing and validating JSON. Serde was chosen due to its maturity in the ecosystem and its support for serializing and deserializing CBOR via the serde_cbor crate.
As outlined in Appendix E of the standard, only the JSON data model subset of CBOR can be used for validation. The limited prelude from the spec is included below for reference:
any = #
uint = #0
nint = #1
int = uint / nint
tstr = #3
text = tstr
number = int / float
float16 = #7.25
float32 = #7.26
float64 = #7.27
float16-32 = float16 / float32
float32-64 = float32 / float64
float = float16-32 / float64
false = #7.20
true = #7.21
bool = false / true
nil = #7.22
null = nil
Furthermore, the following data types from the standard prelude can be used to validate JSON strings:
tdate = #6.0(tstr)
uri = #6.32(tstr)
The first non-group rule defined by a CDDL data structure definition determines the root type, which is subsequently used for validating the top-level JSON data type.
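The root type selection described above can be sketched as follows. This is a minimal illustration only: the Rule enum and the root_type_name function are hypothetical stand-ins, not the crate's actual AST or API.

```rust
/// A simplified view of a parsed CDDL rule. These variants are hypothetical
/// stand-ins for illustration; the crate's real AST is richer.
#[derive(Debug, PartialEq)]
pub enum Rule<'a> {
    /// A type rule, e.g. `person = { name: tstr }`
    Type { name: &'a str },
    /// A group rule, e.g. `address = ( street: tstr, city: tstr )`
    Group { name: &'a str },
}

/// Returns the name of the first rule that is not a group rule, if any.
/// Under the assumption stated above, this is the rule that the top-level
/// JSON value would be validated against.
pub fn root_type_name<'a>(rules: &[Rule<'a>]) -> Option<&'a str> {
    rules.iter().find_map(|rule| match rule {
        Rule::Type { name } => Some(*name),
        Rule::Group { .. } => None,
    })
}
```

For a definition whose rules are, in order, a group named address, a type named person and a type named pet, this sketch selects person. A definition containing only group rules yields no root type.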
The following types and features of CDDL are supported by this crate for validating JSON:
CDDL | JSON |
---|---|
structs | objects |
arrays | arrays<sup>1</sup> |
text / tstr | string |
number / int / float | number<sup>2</sup> |
bool / true / false | boolean |
null / nil | null |
any | any valid JSON |
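As a rough sketch of how the primitive rows of this table could be checked, the example below matches a prelude type name against a loosely typed value. The Json enum and matches_prelude function are assumptions made for illustration (the crate itself relies on serde_json), and structs and arrays are left out because they depend on the surrounding rule definitions.

```rust
/// A stand-in for a loosely typed JSON value. Hypothetical: the crate itself
/// uses serde_json's value type rather than this enum.
#[derive(Debug, Clone, PartialEq)]
pub enum Json {
    Null,
    Bool(bool),
    Number(f64),
    String(String),
    Array(Vec<Json>),
    Object(Vec<(String, Json)>),
}

/// Returns true if the number has no fractional part.
fn is_integral(n: f64) -> bool {
    n.is_finite() && n.fract() == 0.0
}

/// Returns true if `value` satisfies the named prelude type.
/// Unknown type names never match.
pub fn matches_prelude(type_name: &str, value: &Json) -> bool {
    match (type_name, value) {
        ("any", _) => true,
        ("nil", Json::Null) | ("null", Json::Null) => true,
        ("bool", Json::Bool(_)) => true,
        ("true", Json::Bool(b)) => *b,
        ("false", Json::Bool(b)) => !*b,
        ("tstr", Json::String(_)) | ("text", Json::String(_)) => true,
        // Assumption: an f64 cannot tell `1` from `1.0`, so this sketch
        // accepts every JSON number for `number` and `float`.
        ("number", Json::Number(_)) | ("float", Json::Number(_)) => true,
        // The integer types narrow a JSON number to a more specific range.
        ("uint", Json::Number(n)) => is_integral(*n) && *n >= 0.0,
        ("nint", Json::Number(n)) => is_integral(*n) && *n < 0.0,
        ("int", Json::Number(n)) => is_integral(*n),
        _ => false,
    }
}
```

With this sketch, the JSON number 3 satisfies uint, int and number, while 1.5 satisfies only number and float.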
Since JSON objects only support keys whose types are JSON strings, member keys defined in CDDL structs must use either the colon syntax (mykey: tstr) or the double arrow syntax with double quotes ("mykey" => tstr) when validating JSON. Unquoted member keys used with the double arrow syntax that resolve to types must resolve to one of the supported data types that can be used to validate JSON strings (text or tstr). Occurrence indicators can be used to validate key/value pairs in a JSON object and the number of elements in a JSON array, depending on how the indicators are defined in a CDDL data definition. CDDL groups, generics, sockets/plugs and group-to-choice enumerations are all parsed and monomorphized into their full representations before being evaluated for JSON validation.
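To make the role of occurrence indicators concrete, here is a minimal sketch of checking a repetition count against an indicator as defined in RFC 8610. The Occurrence enum and its permits method are hypothetical names chosen for this example and are not taken from the crate.

```rust
/// Occurrence indicators from RFC 8610. `Bounded` models the `n*m` form,
/// where a missing lower bound means 0 and a missing upper bound means
/// "no limit".
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Occurrence {
    /// No indicator: exactly one
    Once,
    /// `?`: zero or one
    Optional,
    /// `*`: zero or more
    ZeroOrMore,
    /// `+`: one or more
    OneOrMore,
    /// `n*m`: between n and m, inclusive
    Bounded { min: Option<usize>, max: Option<usize> },
}

impl Occurrence {
    /// Returns true if `count` repetitions satisfy this indicator.
    pub fn permits(self, count: usize) -> bool {
        match self {
            Occurrence::Once => count == 1,
            Occurrence::Optional => count <= 1,
            Occurrence::ZeroOrMore => true,
            Occurrence::OneOrMore => count >= 1,
            Occurrence::Bounded { min, max } => {
                count >= min.unwrap_or(0) && max.map_or(true, |m| count <= m)
            }
        }
    }
}
```

For a JSON array, count would be the number of elements matched by the entry. For a JSON object, it would be 0 or 1 depending on whether the key is present.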
Below is the table of supported control operators and whether or not they've been implemented as of the current release:
Control operator | Implementation status |
---|---|
.pcre | ✔️<sup>3</sup> |
.regex | ✔️<sup>3</sup> (alias for .pcre) |
.size | Incomplete |
.bits | Unsupported for JSON validation |
.cbor | Unsupported for JSON validation |
.cborseq | Unsupported for JSON validation |
.within | Incomplete |
.and | Incomplete |
.lt | ✔️ |
.le | ✔️ |
.gt | ✔️ |
.ge | ✔️ |
.eq | Partial (text and numeric values) |
.ne | Incomplete |
.default | Incomplete |
1: When groups are used to validate arrays, group entries with occurrence indicators are ignored due to the complexities involved in processing these ambiguities. For proper JSON validation, avoid writing CDDL that looks like the following: [ * a: int, b: tstr, ? c: int ].
2: While JSON itself does not distinguish between integers and floating-point numbers, this crate does provide the ability to validate numbers against a more specific numerical CBOR type, provided that its equivalent representation is allowed by JSON.
3: Because Perl-Compatible Regular Expressions (PCREs) are more widely used than XSD regular expressions, this crate also provides support for the proposed .pcre control extension in place of the .regexp operator (see Discussion and CDDL-Freezer proposal). Ensure that your regex string is properly JSON escaped when using this control.
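The numeric comparison controls marked as implemented in the table above (.lt, .le, .gt and .ge) have simple semantics that can be sketched as below. The Comparison enum and its methods are hypothetical and exist only to illustrate the checks; they do not mirror the crate's internals.

```rust
/// The numeric comparison controls from RFC 8610 covered by this sketch.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Comparison {
    Lt,
    Le,
    Gt,
    Ge,
}

impl Comparison {
    /// Parses the operator text as it appears in CDDL, e.g. ".lt".
    /// Any other control operator is outside the scope of this sketch.
    pub fn parse(op: &str) -> Option<Comparison> {
        match op {
            ".lt" => Some(Comparison::Lt),
            ".le" => Some(Comparison::Le),
            ".gt" => Some(Comparison::Gt),
            ".ge" => Some(Comparison::Ge),
            _ => None,
        }
    }

    /// Returns true if `value` satisfies the control with the given bound,
    /// e.g. for `uint .lt 150` the bound is 150.
    pub fn holds(self, value: f64, bound: f64) -> bool {
        match self {
            Comparison::Lt => value < bound,
            Comparison::Le => value <= bound,
            Comparison::Gt => value > bound,
            Comparison::Ge => value >= bound,
        }
    }
}
```

Given a rule such as age = uint .lt 150, a validator built along these lines would first check that the JSON number is a uint and then evaluate the comparison with 150 as the bound.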
CDDL, JSON schema and JSON schema language can all be used to define JSON data structures. However, the approaches taken to develop each of these are vastly different. A good place to find past discussions on the differences between these formats is the IETF mail archive, specifically in the JSON and CBOR lists. The purpose of this crate is not to argue for the use of CDDL over any one of these formats, but simply to provide an example implementation in Rust.
Incomplete. Under development. Less complete than JSON validation functions.
This crate also uses Serde and serde_cbor for validating CBOR data structures. Similarly to the JSON validation implementation, CBOR validation is done via the loosely typed serde_cbor::Value enum.
Only the lexer and parser can be used in a no_std context, provided that a heap allocator is available. This can be enabled by opting out of the default features in your Cargo.toml file as follows:
[dependencies]
cddl = { version = "<version>", default-features = false }
Zero-copy parsing is implemented to the extent that is possible. Allocation is required for error handling and diagnostics.
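The general idea behind zero-copy lexing can be sketched as follows: tokens hold slices that borrow from the input rather than owned copies, and only the error path allocates a string. The Token enum and lex function are hypothetical, cover just a handful of CDDL tokens, and use a simplified identifier rule, so they should not be read as this crate's lexer.

```rust
/// A token that borrows its text from the input instead of owning a copy.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Token<'a> {
    Ident(&'a str),
    Assign,
    Colon,
    Comma,
    LBrace,
    RBrace,
}

/// Splits `input` into tokens. Identifier text is never copied; each
/// `Ident` is a slice of `input`. The error message is an owned `String`,
/// which is where this sketch allocates for diagnostics.
pub fn lex<'a>(input: &'a str) -> Result<Vec<Token<'a>>, String> {
    let mut tokens = Vec::new();
    let mut chars = input.char_indices().peekable();

    while let Some(&(start, c)) = chars.peek() {
        match c {
            c if c.is_whitespace() => {
                chars.next();
            }
            '=' | ':' | ',' | '{' | '}' => {
                chars.next();
                tokens.push(match c {
                    '=' => Token::Assign,
                    ':' => Token::Colon,
                    ',' => Token::Comma,
                    '{' => Token::LBrace,
                    _ => Token::RBrace,
                });
            }
            // Simplified identifier rule: the RFC 8610 grammar is stricter
            // about where '-' and '.' may appear.
            c if c.is_ascii_alphabetic() || c == '_' || c == '@' || c == '$' => {
                let mut end = start;
                while let Some(&(i, ch)) = chars.peek() {
                    if ch.is_ascii_alphanumeric()
                        || ch == '_'
                        || ch == '-'
                        || ch == '.'
                        || ch == '@'
                        || ch == '$'
                    {
                        end = i + ch.len_utf8();
                        chars.next();
                    } else {
                        break;
                    }
                }
                // Borrow the identifier directly from the input.
                tokens.push(Token::Ident(&input[start..end]));
            }
            other => {
                return Err(format!("unexpected character {:?} at byte {}", other, start));
            }
        }
    }

    Ok(tokens)
}
```

Because Token carries the lifetime of the input, the compiler ensures that the source text outlives every token produced from it.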
Both JSON and CBOR validation depend on their respective heap-allocated Value types. Since these types aren't supported in a no_std context, validation isn't supported by this crate in no_std.
Below are some known projects that leverage this crate: