-
Notifications
You must be signed in to change notification settings - Fork 44
Core Language Overview
Simplicity is a first-order, typed, purely-functional, non-recursive language based on the sequent calculus instead of the lambda calculus. It has been designed for use as a machine language for blockchains, but also has applications to mathematical Circuits.
Simplicity is a typed language.
We write a : A
to mean that a value a
has type A
.
⚠ Be aware that Simplicity expressions only denote functions, and never values.
Types in Simplicity come in three forms.
- The unit type
-
𝟙
(or alternativelyONE
) - A sum (or disjoint union) of two types
A + B
- A product of two types
A × B
The unit type has only one value, the empty tuple
⟨⟩ : 𝟙
A value of a sum type is either a value from the left type, or a value from the right type.
We label which side a value is from using a σ
tag.
When a : A
and b : B
then
σᴸ(a) : A + B
and
σᴿ(b) : A + B
⚠ For the type A + A
, the values σᴸ(a)
and σᴿ(a)
are distinct.
A value of a product type is a pair of values.
When a : A
and b : B
then
⟨a,b⟩ : A × B
All other types in Simplicity are combinations of these three forms of types.
The type for bits, aka the Boolean type, is defined as
𝟚 := 𝟙 + 𝟙
(Alternatively written as TWO
.)
It has exactly two values, which we label as 0
and 1
.
0 := σᴸ⟨⟩ : 𝟚 1 := σᴿ⟨⟩ : 𝟚
By taking repeated products of the Bit Type we can define types for integers of various bit widths.
𝟚¹ := 𝟚 𝟚² := 𝟚¹ × 𝟚¹ 𝟚⁴ := 𝟚² × 𝟚² 𝟚⁸ := 𝟚⁴ × 𝟚⁴ …
and so on for other powers of 2.
(Alternatively written as TWO^2
, TWO^4
, …)
We use hexadecimal notation to denote values of these word types. For example,
6 := ⟨⟨0,1⟩,⟨1,0⟩⟩ : 𝟚⁴ b := ⟨⟨1,0⟩,⟨1,1⟩⟩ : 𝟚⁴ cafe := ⟨⟨c,a⟩,⟨f,e⟩⟩ : 𝟚¹⁶
We can turn a type into an optional type by summing it with a unit type. This can be used to indicate an optional parameter of a function or for adding a failure case to the result type of a function.
S A := 𝟙 + A
We allow special notation for values of these option types.
∅ := σᴸ⟨⟩ : S A
and when a : A
then
η(a) := σᴿ(a) : S A
All types in Simplicity are finite.
The notation for types can be used to count the number of different values that a type has.
For example, the type 𝟚 + 𝟚⁴
has 18 different values in total.
⚠ Be aware that for types, parentheses matter.
The type (𝟚 + 𝟚) + 𝟚
and 𝟚 + (𝟚 + 𝟚)
are distinct, even though both types contain 6 values.
You may have noticed that the various values in Simplicity are hard to type on your keyboard. This is okay because values are never directly used in Simplicity. Simplicity expressions always denote functions.
We write e : A ⊢ B
(or alternatively e : A |- B
) to mean that an expression e
denotes a function from type A
to type B
.
⚠ Below we write the semantics of expressions as ⟦e⟧(a) := b
for values a : A
and b : B
.
However, this is only used to define the meaning of expressions.
The notation ⟦e⟧(a)
is not part of the Simplicity language.
Again, only expressions themselves make up the Simplicity language.
There are two basic expressions in Simplicity.
iden : A ⊢ A
The identity expression is a function that returns its argument.
⟦iden⟧(a) := a
unit : A ⊢ 𝟙
The unit expression is a function that ignores its argument and returns the unique value of the unit type.
⟦unit⟧(a) := ⟨⟩
There rest of the core Simplicity expression language is formed from combinators, which takes subexpressions and combine them in to new expressions.
We define the type rules for these combinators below, separating the typing requirements of their subexpression with a horizontal line.
s : A ⊢ B t : B ⊢ C ---------------- comp s t : A ⊢ C
The composition combinator pipes the output of one subexpession into another.
⟦comp s t⟧(a) := ⟦t⟧(⟦s⟧(a))
(Alternatively written as s;t
or s >>> t
.)
s : A ⊢ B t : A ⊢ C -------------------- pair s t : A ⊢ B × C
The pair combinator runs two subexpressions on the same input, combining their results into a pair.
⟦pair s t⟧(a) := ⟨⟦s⟧(a),⟦t⟧(a)⟩
(Alternatively written as s ▵ t
or s &&& t
.)
s : A × C ⊢ D t : B × C ⊢ D -------------------------- case s t : (A + B) × C ⊢ D
The case combinator runs one of two subexpressions depending on whether the value in the first component of a pair is tagged left or right.
⟦case s t⟧⟨σᴸ(a), c⟩ := ⟦s⟧⟨a, c⟩ ⟦case s t⟧⟨σᴿ(b), c⟩ := ⟦t⟧⟨b, c⟩
(In the Haskell implementation match
is used in place of case
because case
is a reserved word in Haskell.)
The C
type provides “context” for the case expression that is data to be shared with both branches.
s : A ⊢ C ------------------ take s : A × B ⊢ C
s : B ⊢ C ------------------ drop s : A × B ⊢ C
These combinators take one of the two values from a pair and runs the subexpression on it.
⟦take s⟧⟨a, b⟩ := ⟦s⟧(a) ⟦drop s⟧⟨a, b⟩ := ⟦s⟧(b)
s : A ⊢ B ------------------ injl s : A ⊢ B + C
s : A ⊢ C ------------------ injr s : A ⊢ B + C
These combinators run their subexpression and wrap its result in either a left tag or a right tag.
⟦injl s⟧(a) := σᴸ(⟦s⟧(a)) ⟦injr s⟧(a) := σᴿ(⟦s⟧(a))
Simplicity expressions always denote functions of a single input type. If you wish to define a function of multiple arguments, it is necessary to form a product of all those argument types.
When working with such a function, you will typically use combinations of take
and drop
to extract those arguments from a nested tuple.
For that reason we define special notation for this sort of composition
H := iden OH := take iden IH := drop iden OOH := take (take iden) OIH := take (drop iden) …
This notation is designed to resemble a binary number indicating the argument’s position. While not a perfect analogy, these behave much like De Bruijn indices.
As mentioned before, the Simplicity expression language only denotes functions and not values.
However, it is possible to use values in Simplicity by treating them as functions from the 𝟙
type.
We use scribe
to express such functions.
scribe⟨⟩ := unit : 𝟙 ⊢ 𝟙 scribe(σᴸ(a)) := injl scribe(a) : 𝟙 ⊢ A + B scribe(σᴿ(b)) := injr scribe(b) : 𝟙 ⊢ A + B scribe⟨a,b⟩ := scribe(a) ▵ scribe(b) : 𝟙 ⊢ A × B
Theorem:
⟦scribe(a)⟧⟨⟩ = a
We define specialization of the case
expression applied to Bits.
s : C ⊢ D t : C ⊢ D ---------------------------------------------- cond s t := case (drop t) (drop s) : 𝟚 × C ⊢ D
⚠ The arguments to cond
are the reverse of those to case
.
Typically in programming languages, we expect the “then” branch to come before the “else” branch in a conditional, whereas the case
combinator expects the branch for “left” values before the branch for “right” values.
⚠ The following examples are for illustration purposes, and are not necessarily part of any standard library of Simplicity expressions as written.
The following expression uses cond
to negate a bit.
Because there is no “context” for this conditional, we have to explicitly add an empty context of type 𝟙
.
not := (iden &&& unit) >>> (cond scribe(0) scribe(1)) : 𝟚 ⊢ 𝟚
The following expression uses a bit to select between two values of the same type.
ch := cond OH IH : 𝟚 × (A × A) ⊢ A
The following expression defines the xor
of two bits.
xor := cond not iden : 𝟚 × 𝟚 ⊢ 𝟚
Given three bits, return true if and only if the majority of the inputs are true.
maj := cond (cond (unit >>> scribe(1)) iden) (cond iden (unit >>> scribe(0))) : 𝟚 × (𝟚 × 𝟚) ⊢ 𝟚
While, in principle, you can write any function just using Simplicity’s basic combinators, In practice we use Jets to provide functions for integer arithmetic and cryptographic operations and the like.
Jets are like language intrinsics. Every jet is formally specified by a Simplicity expression that precisely defines the jet’s intended behaviour.