Project: DSL Syntax Simplifications and Improvements

SimonCockx edited this page Mar 18, 2022 · 14 revisions

Rosetta has grown over the years: new features have appeared and support for new kinds of expressions has been added. Looking back, some features could have been implemented better or are unnecessarily complex. To simplify the upcoming work described in the Roadmap, we intend to clean up the syntax of Rosetta in a couple of non-intrusive ways. Furthermore, we introduce a new, explicit way to instantiate data types with a JSON-like syntax.

Improve the implementation of the current syntax of Rosetta

The changes below should not have a noticeable effect, but they will simplify upcoming improvements to the type system and code generators of Rosetta.

  1. The argument of an only exists operation should always end in a feature call ->.

Currently, it is possible to write an expression such as myVariable only exists, which was never intended. Arguments of only exists should always end in a feature call, as in myVariable -> myProperty only exists. As Simon Cockx discusses in his thesis, this may look like a small change, but it would reduce the complexity of code generators in a non-trivial way.

  2. The only-element operation can be used in any context.

Currently, the only-element operator is only valid in two locations: after a feature call (a -> b only-element) and after a function call (MyFunc() only-element). This restriction serves no purpose and only adds complexity.

  3. Allow empty list literals [].

Empty lists [] are currently not allowed syntactically. We intend to lift this restriction.

  4. Remove parentheses from the internal abstract syntax tree (AST) of Rosetta.

Parentheses currently are parsed as a separate node in the AST of a Rosetta model. This is redundant: they are only relevant in the concrete syntax of a Rosetta model, and representing them in the AST complicates downstream processing such as type checking and code generation.
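To illustrate the idea of keeping parentheses out of the AST, here is a minimal sketch in Python. It is not the Rosetta parser: the toy grammar (identifiers, + and *), the tuple-based AST and the names tokenize and parse are all assumptions made for this example. The point is only that a parenthesised expression can return the inner node directly, so grouping shows up in the shape of the tree rather than as a node of its own.

```python
import re


def tokenize(source):
    """Split a toy expression into identifiers, parentheses and operators."""
    return re.findall(r"[A-Za-z_]\w*|[()+*]", source)


def parse(source):
    """Parse a toy expression into nested tuples without parenthesis nodes."""
    tokens = tokenize(source)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        token = tokens[pos]
        pos += 1
        return token

    def parse_sum():
        node = parse_product()
        while peek() == "+":
            advance()
            node = ("+", node, parse_product())
        return node

    def parse_product():
        node = parse_atom()
        while peek() == "*":
            advance()
            node = ("*", node, parse_atom())
        return node

    def parse_atom():
        token = advance()
        if token == "(":
            node = parse_sum()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            # Return the inner node itself: no wrapper node for the parentheses.
            return node
        if token in ("+", "*", ")"):
            raise SyntaxError(f"unexpected token {token!r}")
        return ("var", token)

    tree = parse_sum()
    if pos != len(tokens):
        raise SyntaxError("unexpected trailing input")
    return tree
```

With this approach, parse("(a)") and parse("a") yield the same tree, so a type checker or code generator working on the result never needs a case for parentheses.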

Remove redundant AST elements using syntactic sugar

This is only an internal change, and does not change the concrete syntax of Rosetta.

  1. Parse the empty literal as an empty list literal [] (i.e., make the two equivalent to downstream processes such as the type checker and code generator).
  2. Parse a conditional expression with an absent else branch as else empty (i.e., make if condition then value equivalent to if condition then value else empty for downstream processes).

This reduces the number of cases that the type checker and code generators must handle.
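The two rewrites above can be sketched as follows. This is a Python illustration only, not the Rosetta implementation: the node classes ListLiteral, Name and Conditional and the helper functions empty_literal and conditional are hypothetical names chosen for this example. The parser-side helpers normalise both forms, so downstream code only ever sees list literals and conditionals with an else branch.

```python
from dataclasses import dataclass, field


@dataclass
class ListLiteral:
    """A list literal such as [a, b]; no elements means []."""
    elements: list = field(default_factory=list)


@dataclass
class Name:
    """A reference to a variable (hypothetical node for this sketch)."""
    id: str


@dataclass
class Conditional:
    """An if/then/else expression; the else branch is always present."""
    condition: object
    then_branch: object
    else_branch: object


def empty_literal():
    """Build the node for the keyword `empty`: simply an empty list literal."""
    return ListLiteral([])


def conditional(condition, then_branch, else_branch=None):
    """Build a conditional, filling in `else empty` when no else branch is given."""
    if else_branch is None:
        else_branch = empty_literal()
    return Conditional(condition, then_branch, else_branch)
```

For example, conditional(Name("condition"), Name("value")) produces exactly the same node as conditional(Name("condition"), Name("value"), empty_literal()), and neither contains a dedicated "empty" node type.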

Add JSON-like syntax for instantiating data types

Problem: There is currently no way to explicitly instantiate a data type.

Example: Suppose we want to describe an employee of a company.

type Employee:
  name string (1..1)
  salary number (1..1)
  isSeniorMember boolean (1..1)
  mentor Employee (0..1)

Currently, the only way to make an instance of this type is to define a function that returns an Employee. This is quite verbose.

func CreateEmployee:
  inputs:
    name string (1..1)
    salary number (1..1)
    isSeniorMember boolean (1..1)
    mentor Employee (0..1)
  output:
    result Employee (1..1)
  set result->name: name
  set result->salary: salary
  set result->isSeniorMember: isSeniorMember
  set result->mentor: mentor

Now we can instantiate an Employee by calling CreateEmployee(...).

Additionally, as an edge case, it is currently impossible to instantiate a data type without any properties, which can be useful in some cases.

type Zero:

func CreateZero:
  output:
    result Zero (1..1)
  set result: ??? // impossible to implement this function

Solution: We intend to introduce an intuitive JSON-like syntax to instantiate a data type. For example, we could instantiate an Employee called Dwight Schrute who has a mentor called Michael Scott as follows.

Employee {
  name: "Dwight Schrute",
  salary: 4800.00,
  isSeniorMember: False,
  mentor: Employee {
    name: "Michael Scott",
    isSeniorMember: True,
    salary: 6300.00,
    mentor: empty
  }
}

Note that properties may be written in any order (see the salary and isSeniorMember properties of Michael Scott).
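For readers more familiar with general-purpose languages, the proposed syntax behaves much like construction with named arguments. The following Python sketch is an analogy only, not Rosetta code: the dataclass Employee and its snake_case field names are assumptions for this example, and None stands in for empty.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Employee:
    name: str
    salary: float
    is_senior_member: bool
    mentor: Optional["Employee"]  # None plays the role of `empty`


# Named arguments may be given in any order, as in the proposed syntax.
michael = Employee(
    name="Michael Scott",
    is_senior_member=True,
    salary=6300.00,
    mentor=None,
)

dwight = Employee(
    name="Dwight Schrute",
    salary=4800.00,
    is_senior_member=False,
    mentor=michael,
)
```

Because every value is tied to a property name rather than a position, reordering the assignments does not change the resulting instance.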

In many Rosetta models such as the CDM, it is common practice to have data types with many optional properties, e.g., with a cardinality constraint (0..1). In such cases, it is inconvenient to set most of them to empty explicitly. We'll therefore add a shorthand, ..., which stands for "assign empty to every other property". An example taken from the CDM:

type ProductIdentification:
  productQualifier productType (0..1)
  primaryAssetData AssetClassEnum (0..1)
  secondaryAssetData AssetClassEnum (0..*)
  externalProductType ExternalProductType (0..*)
  productIdentifier ProductIdentifier (0..*)

We could then write

ProductIdentification {
  primaryAssetData: AssetClassEnum->Credit,
  ...
}

which would be equivalent to

ProductIdentification {
  primaryAssetData: AssetClassEnum->Credit,
  productQualifier: empty,
  secondaryAssetData: empty,
  externalProductType: empty,
  productIdentifier: empty
}
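The expansion of the ... shorthand can be sketched in a few lines. The Python below is an illustration under stated assumptions, not the planned implementation: the function expand_constructor, its implicit_empty flag (standing for the presence of ...), and the EMPTY marker are hypothetical names. It also assumes that, without ..., every property must be assigned explicitly, which the text above suggests but does not state outright.

```python
EMPTY = "empty"  # stand-in for Rosetta's `empty` in this sketch

# Property names of ProductIdentification, in declaration order.
PRODUCT_IDENTIFICATION = [
    "productQualifier",
    "primaryAssetData",
    "secondaryAssetData",
    "externalProductType",
    "productIdentifier",
]


def expand_constructor(type_properties, assignments, implicit_empty=False):
    """Return a full property -> value mapping for a constructor expression.

    `implicit_empty` models a trailing `...`: every property that was not
    assigned explicitly is set to EMPTY.
    """
    unknown = [name for name in assignments if name not in type_properties]
    if unknown:
        raise ValueError(f"unknown properties: {unknown}")

    missing = [name for name in type_properties if name not in assignments]
    if missing and not implicit_empty:
        # Assumption: without `...`, all properties must be assigned.
        raise ValueError(f"missing properties: {missing}")

    expanded = dict(assignments)
    for name in missing:
        expanded[name] = EMPTY
    return expanded


short_form = expand_constructor(
    PRODUCT_IDENTIFICATION,
    {"primaryAssetData": "AssetClassEnum->Credit"},
    implicit_empty=True,
)
```

Here short_form contains all five properties, with primaryAssetData set to the given value and the other four set to EMPTY, mirroring the expanded form shown above.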