In this section, we explain how to navigate the semantics, why it is verified and how to execute bytecode.
The semantics of opcodes is written in a functional style in bytecode.dfy.
For instance, the function Add
below gives the semantics of opcode ADD
:
/**
* Unsigned integer addition with modulo arithmetic.
*
* @param st A state.
* @returns The state after executing an `ADD` or an `Error` state.
*/
function method Add(st: State): (st': State)
requires st.IsExecuting()
ensures st'.OK? || st' == INVALID(STACK_UNDERFLOW)
ensures st'.OK? <==> st.Operands() >= 2
ensures st'.OK? ==> st'.Operands() == st.Operands() - 1
{
if st.Operands() >= 2
then
var lhs := st.Peek(0) as int;
var rhs := st.Peek(1) as int;
var res := (lhs + rhs) % TWO_256;
st.Pop().Pop().Push(res as u256).Next()
else
State.INVALID(STACK_UNDERFLOW)
}
Given x: State
:
x.OK?
holds whenx
(a state) is of typeOK
(non-failure),x.Operands()
is the size of the stack inx
,x.Peek(k)
is the value of the k+1-th element in the stack ofx
,0
being thetop
,x.Pop()
andx.Push()
respectively applies pop and push tox
,x.Next()
increments the program counter inx
.
The complete semantics of ADD
is given by the body of the function Add
.
Note that it is defined a pure function of type State -> State
with no side effects.
Moreover, some interesting properties of this function are captured by the pre- and postconditions:
- the precondition (
requires
) indicates that we can only apply this function to non-failure states; as a result every time we use theAdd
function in our code, we have to ensure that the statest
satisfiesIsExecuting()
(which is essentially a non-failure state). - the body of the function defines the next state as a composition of functions applied in sequence
Pop(), Pop(), Push(), Next()
. Note that some functions likePop(), Push()
(see below) have preconditions. - the postconditions (
ensures
) specify some properties ofAdd
that are not obvious from the code in its body:- First, the result of
Add
is either anOK
state or anINVALID(STACK_UNDERFLOW)
state. - Second,
Add
succeeds if and only ifst.Operands() >= 2
. - Third, if
Add
succeeds the stack size in the result statest'
is the size inst
minus one.
- First, the result of
This type of specifications is easy to read, and some essential properties (postconditions) can be formally proved. For instance the postcondition st'.OK? <==> st.Operands() >= 2
ensures that
the computation st.Pop().Pop().Push(res as u256).Next()
in the body of Add
does not result in a failure state (see Verifying the Semantics below).
This specification above may be contrasted to existing specifications and implementations of Add
:
The Dafny specification of Add
above is self-contained and clearly identifies the cases where the Add
function results in an INVALID
state.
On top of that, the preconditions we provide (requires
) must hold every time the function Add
is called. This guarantees that in our semantics defined as function composition, we never apply Add
on a non-executing (INVALID
, RETURNS
, etc) state.
Apart from clarity and simplicity, the Dafny-EVM semantics provides a high degree of security.
To illustrate this point, it is useful to look at the code of Pop
and Push
:
/**
* Pop word from stack.
*/
function method Pop(st: State): (st': State)
requires st.IsExecuting()
ensures st'.OK? || st' == INVALID(STACK_UNDERFLOW)
ensures st'.OK? <==> st.Operands() >= 1
ensures st'.OK? ==> st'.Operands() == st.Operands() - 1
{
//
if st.Operands() >= 1
then
st.Pop().Next()
else
State.INVALID(STACK_UNDERFLOW)
}
/**
* Push word onto stack.
*/
function method Push(st: State, k: nat): (st': State)
requires k > 0 && k <= 32
requires st.IsExecuting()
ensures st'.OK? || st' == INVALID(STACK_OVERFLOW)
ensures st'.OK? <==> st.Capacity() >= 1
ensures st'.OK? ==> st'.Operands() == st.Operands() + 1
{
if st.Capacity() >= 1
then
var bytes := Code.Slice(st.evm.code, (st.evm.pc+1), k);
assert 0 < |bytes| <= 32;
var k := Bytes.ConvertBytesTo256(bytes);
st.Push(k).Skip(|bytes|+1)
else
State.INVALID(STACK_OVERFLOW)
}
The definitions of Pop
and Push
requires as preconditions a non-failure state (satisfying IsExecuting()
), and some constraints on the stack size or capacity.
In the EVM the stack has a maximum size of 1024, and if the current size is k
the capacity is 1024 - k
.
What roles do preconditions play and how do they help?
In the definition of Add
above, we define the next state by st.Pop().Pop().Push(...).Next()
where st
is the current state.
Using a verification-friendly language like Dafny triggers several checks at compile time and among them:
every time we call a function f
, the preconditions of f
must hold.
This is not a runtime check that is going to be performed on each call of f
but a compile-time check that must provably always hold.
This is part of what Dafny does when it verifies the Add
function: it checks that:
st
has more than one operand and the call tost.Pop()
satisfies the preconditions ofPop
;- it also checks that
st.Pop()
satisfies the preconditions ofPop
as we applyPop
to the previous result. This is the reason why we have theif st.Operands() >= 2
in the definition ofAdd
and this is what enables Dafny to prove that the preconditions of the second call toPop
are satisfied. If you download the code and have Dafny installed, you may replaceif st.Operands() >= 2
byif st.Operands() >= 1
inAdd
. Dafny will highlight the error by pointing to the linevar rhs := st.Peek(1) as int
indicating that the precondition ofPeek(1)
does not hold (Peek(1)
also requires a stack size larger than 1.) - another check is performed on
st.Pop().Pop()
to ensure that stack capacity satisfies the preconditions ofPush
. The proof of this fact follows from the semantics ofPop
that removes elements from the stack; so the capcity of the stack afterst.Pop().Pop()
is at least one (which is worst-case of the stack capacity isst
is zero).
All the previous checks provide a very high degree of security of the code, ensuring that we never call by mistake Add
on a stack with one element.
The verification-friendliness of Dafny does not stop at preconditions, but also enforces the following checks by default:
- type checks: if a variable
v
is declared with type sayu8
, any assignment tov
requires a proof that0 <= v <= 255
. The same holds for more complex types. For instance the type of the stack is aseq<u256>
of size<= 1024
. This means that every time we compute a state there must be a compile-time proof that the stack size is<= 1024
. Thankfully Dafny can automatically prove this most of the time with out help from the programmer. - bound checks: we use unbounded lists,
seq<.>
in Dafny to represent several components (e.g. the stack). Any accesss[i]
to an element at a given indexi
in aseq<.>
s
must be provably valid, i.e.0 <= i < size(s)
. This rules out any programming errorindex-out-of-bound
in our code base. - termination checks: every loop or recursive function must provably terminate. Dafny checks termination for every piece of code, and this may require the programmer to add
ranking functions
.
By using a verification-friendly language, we can provide machine-checkable guarantees about our code.
An EVM is in a given state s: State
. The effect of an opcode op
on the current state
is described in two separate functions OpSem
and OpGas
in evm.dfy:
/** The semantics of opcodes.
*
* @param op The opcode to look up.
* @param s The state to apply the opcode to.
* @returns The new state obtained after applying the semantics
* of the opcode.
* @note If an opcode is not supported, or there is not enough gas
* the returned state is INVALID.
*/
function method OpSem(op: u8, s: State): State
/** The gas cost semantics of an opcode.
*
* @param op The opcode to look up.
* @param s A state.
* @returns The new state obtained having consumed the gas that corresponds to
* the cost of `opcode` is `s`.
*/
function method OpGas(op: u8, s: State): State
OpGas
charges some gas to perform the computation associated to an opcode op
in a given state s
.
Most of the time the cost is fixed but sometimes the cost depends on the state, for instance if there is memory expansion.
OpSem
defines the semantics of opcodes but does not take into account the gas consumption.
For instance, the result of OpSem(ADD, s)
is a new state s'
which is equal to s
except for the stack.
OpSem
uses the functions defined in bytecode.dfy to compute the new state.
For instance a concrete definition of OpSem
in module EvmBerlin defines the semantics of the ADD
opcode as follows:
function method OpSem(op: u8, s: State): State {
match s
case OK(st) =>
(
match op
...
case ADD => Bytecode.Add(s)
...
)
}
where Add
is the function that gives the semantics of opcode ADD
.
Overall the new state of the EVM after executing an opcode op
in state s
is s' = OpSem(opcode, OpGas(opcode, s))
.
The opcode to execute is part of the state s
as s
has the code to execute and the program counter.
The Execute
function below determines the next state: it extracts the opcode from the state st
; if the opcode is valid, it computes the new state according to the semantics of the opcode, and otherwise we end up in an INVALID
state.
/**
* Execute the next instruction.
* return
* @note If the opcode semantics/gas is not implemented, the next
* state is INVALID.
*/
function method Execute(st: State): State
{
match st.OpDecode()
case Some(opcode) => OpSem(opcode, OpGas(opcode, st))
case None => State.INVALID(INVALID_OPCODE)
}
Executing some bytecode can be achieved by creating a fresh EVM, and then applying the function Execute
a number of times.
In practice we first have to define a concrete module with concrete functions OpSem
and OpGas
.
The module EvmBerlin is an example of such a concrete module.
Once we have a concrete module, we can execute some bytecode.
The example below is taken from test.dfy.
method Test_EVM_01(x: u8)
{
// Bytecode to execute
var vm := EvmBerlin.InitEmpty(
gas := 300,
code := [
PUSH1, x,
PUSH1, 0x0,
MSTORE,
PUSH1, 0x1,
PUSH1, 0x1F,
RETURN
]);
// PUSH1 x
vm := EvmBerlin.Execute(vm);
// PUSH1 0x0
vm := EvmBerlin.Execute(vm);
// MSTORE
vm := EvmBerlin.Execute(vm);
// PUSH ... RETURN -- Execute 3 steps
vm := EvmBerlin.ExecuteN(vm,3);
//
assert vm.RETURNS?;
}
Several examples can be found in this test folder.