Commit e4c7ba1

Implement control flow specification (#3)
1 parent faaebf7 commit e4c7ba1

3 files changed

+1447
-53
lines changed

README.md

Lines changed: 187 additions & 42 deletions
@@ -17,64 +17,121 @@ Based on [instructions.csv](https://github.com/ton-community/ton-docs/blob/main/
1717
| doc | ✅ Implemented | Provides human-readable information about instructions. Useful for showing integrated documentation to the user, for example in a disassembler.
1818
| bytecode | ✅ Implemented | Describes instruction encoding. It contains the information needed to determine which instruction is being decoded and how to parse its operands.
1919
| value_flow | ✅ Implemented | Describes how an instruction changes the current stack. This part of the specification makes it possible to analyze how instructions interact with each other, enabling high-level tools such as decompilers.
20-
| control_flow | ❌ Not implemented | Describes code flow. It helps to reconstruct a control flow graph. This part mainly contains semantics of cont_* category instructions. For example, both JMPX and CALLX transfers execution to continuation on stack, but only CALLX returns and JMPX is not.
20+
| control_flow | ✅ Implemented | Describes code flow (operations on the cc register). It helps reconstruct a control flow graph. This part mainly contains the semantics of cont_* category instructions. For example, both JMPX and CALLX transfer execution to a continuation on the stack, but only CALLX returns afterwards; JMPX does not.
2121
| aliases | ✅ Implemented | Specifies instruction aliases. Can be used to inform the user about special cases (for example, SWAP is a special case of XCHG_0i with i = 1).
2222

## Usage
A convenient way to use the spec is to add this repository as a submodule of your tool, which greatly simplifies debugging and upgrading.
```bash
git submodule add https://github.com/hacker-volodya/tvm-spec
```
Alternatively, you can simply copy `cp0.json` (and `schema.json` if necessary) into your project.
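
As a rough sketch of consuming the copied file, the Python snippet below loads a spec file and indexes its instructions by mnemonic. It assumes the top level of the file is a JSON array of instruction objects shaped like the example in the Instruction Specification section; `load_instructions` is a hypothetical helper, so adjust the lookup if your copy of `cp0.json` nests the instructions under a key.

```python
import json
from pathlib import Path


def load_instructions(path):
    """Load a spec file and index its instructions by mnemonic.

    Assumption: the file's top level is a JSON array of instruction
    objects, each carrying a "mnemonic" key.
    """
    instructions = json.loads(Path(path).read_text(encoding="utf-8"))
    return {instruction["mnemonic"]: instruction for instruction in instructions}
```

With the submodule checked out next to your tool, a call might look like `load_instructions("tvm-spec/cp0.json")`; that path is only an example.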
29+
2330
## Projects based on tvm-spec
2431
1. [tvm-spec-example](https://github.com/hacker-volodya/tvm-spec-example), tiny TVM disassembler
32+
2. [tvm-research](https://github.com/hacker-volodya/tvm-research), a collection of tool prototypes built on tvm-spec
2533

2634
## Instruction Specification
2735
### Example
2836
```json
[
    {
        "mnemonic": "LDU",
        "doc": {
            "category": "cell_parse",
            "description": "Loads an unsigned `cc+1`-bit integer `x` from _Slice_ `s`.",
            "gas": "26",
            "fift": "[cc+1] LDU"
        },
        "bytecode": {
            "doc_opcode": "D3cc",
            "tlb": "#D3 cc:uint8",
            "prefix": "D3",
            "operands": [
                {
                    "name": "c",
                    "loader": "uint",
                    "loader_args": {
                        "size": 8
                    }
                }
            ]
        },
        "value_flow": {
            "doc_stack": "s - x s'",
            "inputs": {
                "stack": [
                    {
                        "type": "simple",
                        "name": "s",
                        "value_types": ["Slice"]
                    }
                ]
            },
            "outputs": {
                "stack": [
                    {
                        "type": "simple",
                        "name": "x",
                        "value_types": ["Integer"]
                    },
                    {
                        "type": "simple",
                        "name": "s2",
                        "value_types": ["Slice"]
                    }
                ]
            }
        }
    },
    {
        "mnemonic": "EXECUTE",
        "doc": {
            "category": "cont_basic",
            "description": "_Calls_, or _executes_, continuation `c`.",
            "gas": "18",
            "fift": "EXECUTE\nCALLX"
        },
        "bytecode": {
            "doc_opcode": "D8",
            "tlb": "#D8",
            "prefix": "D8",
            "operands": []
        },
        "value_flow": {
            "doc_stack": "c - ",
            "inputs": {
                "stack": [
                    {
                        "type": "simple",
                        "name": "c",
                        "value_types": ["Continuation"]
                    }
                ]
            },
            "outputs": {
                "stack": [
                ]
            }
        },
        "control_flow": {
            "branches": [
                {
                    "type": "variable",
                    "var_name": "c",
                    "save": {
                        "c0": {
                            "type": "cc",
                            "save": {
                                "c0": { "type": "register", "index": 0 }
                            }
                        }
                    }
                }
            ]
        }
    }
]
```
79136

80137
### Documentation
@@ -104,6 +161,9 @@ Based on [instructions.csv](https://github.com/ton-community/ton-docs/blob/main/
104161
| value_flow.inputs.stack[i].type | Type of stack entry. Can be one of "simple", "const", "conditional", "array". Required.
105162
| value_flow.inputs.stack[i].* | Properties for stack entries of each type are described below.
106163
| value_flow.outputs | Constraints on outgoing values. Output is unconstrained if absent. Same structure as value_flow.inputs.
| control_flow | Describes how the instruction modifies the current continuation (cc). Optional.
| control_flow.branches | Array of the instruction's possible branches. Specifies all possible values of cc after the instruction executes. Empty by default (the instruction takes no branches). Each branch is a `Continuation` object, described below.
| control_flow.nobranch | Whether the instruction may, in some cases, take none of the specified branches (leaving cc unmodified). False by default.
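
The defaults for `control_flow` can be made concrete with a small Python sketch. The helper name `possible_successors` is hypothetical, the instruction objects are trimmed-down stand-ins, and treating a missing `control_flow` section the same as an empty `branches` array is an assumption of this sketch, not something the spec states outright.

```python
def possible_successors(instruction):
    """Return (branches, may_keep_cc) for an instruction object.

    Applies the documented defaults: `branches` is empty and `nobranch`
    is False when they are not given.
    Assumption: a missing `control_flow` section means "no branches".
    """
    control_flow = instruction.get("control_flow", {})
    branches = control_flow.get("branches", [])
    nobranch = control_flow.get("nobranch", False)
    # cc stays unmodified when there is nothing to branch to, or when the
    # instruction may skip every listed branch.
    may_keep_cc = nobranch or not branches
    return branches, may_keep_cc


# Trimmed-down instruction objects, for illustration only.
plain_instruction = {"mnemonic": "LDU"}
call_instruction = {
    "mnemonic": "EXECUTE",
    "control_flow": {"branches": [{"type": "variable", "var_name": "c"}]},
}
```

A control flow graph builder could add one edge per returned branch, plus an edge to the next instruction whenever `may_keep_cc` is true.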
107167

108168
### Loaders Specification and Examples
109169
#### uint
@@ -290,9 +350,94 @@ _Stack notation: `pk_1 msg_1 ... pk_n msg_n`_
290350
Specifies a sequence of stack entries whose length is given by the variable `length_var`, usually written as `x_1 ... x_n`. Each element of the array, such as `x_i` or `x_i y_i`, is described in `array_entry`. Used in tuple, continuation argument and crypto operations.
291351

292352
#### Notes
293-
1. Each variable name is unique across `operands` and `stack` sections of each instruction. Assumed that variables are immutable, so if variable `x` is defined both in inputs and outputs, it goes to output without any modification.
353+
1. Each variable name is unique across the `operands`, `value_flow`, and `control_flow` sections of each instruction. Variables are assumed to be immutable, so if variable `x` is defined in both inputs and outputs, it passes to the output without modification.
294354
2. Value flow describes only the `cc` stack usage before the actual jump or call. Subsequent continuations may have a separate stack, so this will be defined in the control flow section of this spec.
295355

### Continuations Specification and Examples
Each object represents a continuation, which can be constructed in several ways:
* from a variable (operand or stack)
* from the cc register
* from the c0-c3 registers
* by "extraordinary continuation" constructors (as in [tvm.pdf](https://docs.ton.org/tvm.pdf), p.50: "4.1.5. Extraordinary continuations.")

A savelist can be defined using the `save` property. Keys are `c0`, `c1`, `c2`, `c3` and values are continuation objects (in fact, a continuation is a recursive type, representing a tree of continuations). Note that a savelist defined here does not override continuation registers that are already saved (this is the standard TVM behaviour when saving registers).
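
The non-overriding rule can be illustrated with a short Python sketch. Here a continuation's saved registers are modelled as a plain dict, and `apply_savelist` is a hypothetical helper that is not part of tvm-spec or of the TVM implementation.

```python
def apply_savelist(saved_registers, save):
    """Merge a `save` mapping into the already-saved registers.

    Registers that are already saved keep their value; only the missing
    ones are filled in from `save`. The input dict is left untouched.
    """
    merged = dict(saved_registers)
    for register, continuation in save.items():
        merged.setdefault(register, continuation)
    return merged


# Illustrative data: c0 is already saved, c1 is not.
already_saved = {"c0": {"type": "register", "index": 0}}
requested_save = {
    "c0": {"type": "cc"},
    "c1": {"type": "register", "index": 1},
}
merged_registers = apply_savelist(already_saved, requested_save)
```

In this example `c0` keeps its earlier value while `c1` is taken from the new savelist.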

#### variable
```json
{
    "type": "variable",
    "var_name": "c",
    "save": {
        "c0": {
            "type": "cc",
            "save": {
                "c0": { "type": "register", "index": 0 }
            }
        }
    }
}
```
Specifies a variable-backed continuation. The variable need not be referenced earlier (in the `operands` or `value_flow` sections); in that case it is assumed to be defined somewhere in the actual implementation of the instruction. An example of such an instruction is `DICTIGETJMP`: internally it performs a dictionary lookup to retrieve the continuation, so `x` in its control flow is expected to be defined by the implementation of `DICTIGETJMP`.

#### cc
```json
{
    "type": "cc",
    "save": {
        "c0": { "type": "register", "index": 0 }
    }
}
```
Specifies a continuation constructed from the cc code and the `save` registers. In the C++ implementation, this operation is performed by [VmState::extract_cc](https://github.com/ton-blockchain/ton/blob/17c3477f7191fe6e5db22b71631b5c7472046c2f/crypto/vm/vm.cpp#L356).

#### register
```json
{
    "type": "register",
    "index": 3,
    "save": {
        "c0": {
            "type": "cc",
            "save": {
                "c0": { "type": "register", "index": 0 }
            }
        }
    }
}
```
Specifies a continuation taken from the register with the given `index` (`"index": 3` means take the continuation from `c3`).
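
Because the `variable`, `cc` and `register` forms can nest through `save`, tools typically walk a continuation object recursively. The sketch below collects every variable name referenced in such a tree. `referenced_variables` is a hypothetical helper, and it deliberately ignores the `args` of `special` continuations.

```python
def referenced_variables(continuation):
    """Collect variable names used anywhere in a continuation tree.

    Handles the `variable`, `cc` and `register` forms; `args` of
    `special` continuations are not inspected in this sketch.
    """
    names = set()
    if continuation.get("type") == "variable":
        names.add(continuation["var_name"])
    # Recurse into the savelist: keys are c0..c3, values are continuations.
    for saved in continuation.get("save", {}).values():
        names |= referenced_variables(saved)
    return names


# Illustrative tree: a register continuation whose c0 is a variable.
example_continuation = {
    "type": "register",
    "index": 3,
    "save": {
        "c0": {
            "type": "variable",
            "var_name": "c",
            "save": {"c0": {"type": "register", "index": 0}},
        }
    },
}
variable_names = referenced_variables(example_continuation)
```

Such a walk is one way to check which names a `control_flow` section expects from `operands`, `value_flow` or the instruction's implementation.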

#### special
```json
{
    "type": "special",
    "name": "repeat",
    "args": {
        "count": "n",
        "body": {
            "type": "variable",
            "var_name": "c"
        },
        "after": {
            "type": "cc",
            "save": {
                "c0": { "type": "register", "index": 0 }
            }
        }
    }
}
```

Specifies an extraordinary continuation. The `save` property does not apply here.

| name | args | description
| ---- | ---- | -----------
| repeat | string count; continuation body, after | `count` is the name of the variable holding the iteration count. Jumps to `body` with `c0 = repeat(count - 1, body, after)`, or jumps to `after` if `count <= 0`.
| until | continuation body, after | Pops a boolean from the stack; jumps to `after` if it is `true`, otherwise jumps to `body` with `c0 = until(body, after)`.
| while | continuation cond, body, after | Pops a boolean from the stack; jumps to `after` if it is `false`, otherwise jumps to `body` with `c0 = cond` and `cond.c0 = while(cond, body, after)`.
| again | continuation body | Jumps to `body` with `c0 = again(body)`.
| pushint | continuation next; integer value | Pushes `value` onto the stack, then jumps to `next`.
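
To make the `repeat` row concrete, here is a toy Python model of its semantics. Continuations are modelled as plain callables, and "jumping to `body` with `c0` set" is simplified to "call `body`, then invoke the next `repeat` continuation". This is an assumption-laden illustration of the table entry, not how TVM implements it.

```python
def repeat(count, body, after):
    """Model the extraordinary continuation `repeat(count, body, after)`."""
    def continuation():
        if count <= 0:
            return after()
        # Jump to `body`; when it returns, control passes to
        # c0 = repeat(count - 1, body, after).
        body()
        return repeat(count - 1, body, after)()
    return continuation


# Record the order in which the continuations run.
trace = []
loop = repeat(3, lambda: trace.append("body"), lambda: trace.append("after"))
loop()
```

After the call, `trace` holds three `"body"` entries followed by a single `"after"`.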
440+
296441
## Alias Specification
297442
### Example
298443
```json
