\documentclass[12pt,oneside]{article}
\usepackage[T1]{fontenc}
%\usepackage{euler}
\usepackage{amssymb, amsmath, amsfonts, stmaryrd}
\usepackage[mathscr]{euscript}
\usepackage{mathrsfs}
\usepackage{theorem}
\usepackage[english]{babel}
\usepackage{bm}
\usepackage[all]{xy}
\usepackage{array}
\usepackage{multirow}
%\usepackage{chngcntr}
%\CompileMatrices
\usepackage[bookmarks=false,pdfauthor={Nikolai Durov},pdftitle={Telegram Open Network Blockchain}]{hyperref}
\usepackage{fancyhdr}
\usepackage{caption}
%
\setlength{\headheight}{15.2pt}
\pagestyle{fancy}
\renewcommand{\headrulewidth}{0.5pt}
%
\def\makepoint#1{\medbreak\noindent{\bf #1.\ }}
\def\zeropoint{\setcounter{subsection}{-1}}
\def\zerosubpoint{\setcounter{subsubsection}{-1}}
\def\nxpoint{\refstepcounter{subsection}%
\smallbreak\makepoint{\thesubsection}}
\def\nxsubpoint{\refstepcounter{subsubsection}%
\smallbreak\makepoint{\thesubsubsection}}
\def\nxsubsubpoint{\refstepcounter{paragraph}%
\makepoint{\theparagraph}}
%\setcounter{secnumdepth}{4}
%\counterwithin{paragraph}{subsubsection}
\def\refpoint#1{{\rm\textbf{\ref{#1}}}}
\let\ptref=\refpoint
\def\embt(#1.){\textbf{#1.}}
\def\embtx(#1){\textbf{#1}}
\def\emb#1{\textbf{#1.}}
\long\def\nodo#1{}
%
%\def\markbothsame#1{\markboth{#1}{#1}}
\fancyhf{}
\fancyfoot[C]{\thepage}
\def\markbothsame#1{\fancyhead[C]{#1}}
\def\mysection#1{\section{#1}\fancyhead[C]{\textsc{Chapter \textbf{\thesection.} #1}}}
\def\mysubsection#1{\subsection{#1}\fancyhead[C]{\small{\textsc{\textrm{\thesubsection.} #1}}}}
\def\myappendix#1{\section{#1}\fancyhead[C]{\textsc{Appendix \textbf{\thesection.} #1}}}
%
\let\tp=\textit
\let\vr=\textit
\def\workchainid{\vr{workchain\_id\/}}
\def\shardpfx{\vr{shard\_prefix}}
\def\accountid{\vr{account\_id\/}}
\def\currencyid{\vr{currency\_id\/}}
\def\uint{\tp{uint}}
\def\opsc#1{\operatorname{\textsc{#1}}}
\def\CellRepr{\opsc{CellRepr}}
\def\blkseqno{\opsc{blk-seqno}}
\def\blkprev{\opsc{blk-prev}}
\def\blkhash{\opsc{blk-hash}}
\def\Hash{\opsc{Hash}}
\def\Sha{\opsc{sha256}}
\def\SHA#1{\opsc{sha#1}}
\def\Int{\opsc{int}}
\def\height{\opsc{height}}
\def\len{\opsc{len}}
\def\leaf{\opsc{Leaf}}
\def\node{\opsc{Node}}
\def\Seqno{\opsc{SeqNo}}
\def\LT{\opsc{Lt}}
\def\NextHop{\opsc{NextHop}}
\def\root{\opsc{Root}}
\def\emptyroot{\opsc{EmptyRoot}}
\def\code{\opsc{code}}
\def\Ping{\opsc{Ping}}
\def\Store{\opsc{Store}}
\def\FindNode{\opsc{Find\_Node}}
\def\FindValue{\opsc{Find\_Value}}
\def\Bytes{\tp{Bytes}}
\def\Transaction{\tp{Transaction}}
\def\Account{\tp{Account}}
\def\State{\tp{State}}
\def\Maybe{\opsc{Maybe}}
\def\List{\opsc{List}}
\def\Block{\tp{Block}}
\def\Blockchain{\tp{Blockchain}}
\def\isValidBc{\tp{isValidBc}}
\def\evtrans{\vr{ev\_trans}}
\def\evblock{\vr{ev\_block}}
\def\Hashmap{\tp{Hashmap}}
\def\HashmapE{\tp{HashmapE}}
\def\Type{\tp{Type}}
\def\nat{\tp{nat\/}}
\def\hget{\vr{hget\/}}
\def\bbB{{\mathbb{B}}}
\def\bbP{{\mathbb{P}}}
\def\bbF{{\mathbb{F}}}
\def\bbZ{{\mathbb{Z}}}
\def\st#1{{\mathbf{#1}}}
\def\sgn{\operatorname{sgn}}
\def\charact{\operatorname{char}}
\def\caret{\^{}}
\def\cF{\mathscr{F}}
%
\hfuzz=0.8pt
\title{Telegram Open Network Blockchain}
\author{Nikolai Durov}
\begin{document}
%\pagestyle{myheadings}
\maketitle
\begin{abstract}
The aim of this text is to provide a detailed description of the
Telegram Open Network (TON) Blockchain.
\end{abstract}
\section*{Introduction}
\markbothsame{Introduction}
This document provides a detailed description of the TON Blockchain, including its precise block format, validity conditions, TON Virtual Machine (TVM) invocation details, smart-contract creation process, and cryptographic signatures. In this respect it is a continuation of the TON whitepaper (cf.~\cite{TON}), so we freely use the terminology introduced in that document.
Chapter~\ptref{sect:overview} provides a general overview of the TON Blockchain and its design principles, with particular attention to the introduction of compatibility and validity conditions and the implementation of message delivery guarantees. More detailed information, such as the TL-B schemes that describe the serialization of all required data structures into trees or collections (``bags'') of cells, is provided in subsequent chapters, culminating in a complete description of the TON Blockchain (shardchain and masterchain) block layout in Chapter~\ptref{sect:block.layout}.
A detailed description of the elliptic curve cryptography used for signing blocks and messages, also accessible through TVM primitives, is provided in Appendix~\ptref{app:ecc}. TVM itself is described in a separate document (cf.~\cite{TVM}).
Some subjects have intentionally been left out of this document. One is the Byzantine Fault Tolerant (BFT) protocol used by the validators to determine the next block of the masterchain or a shardchain; that subject is left for a forthcoming document dedicated to the TON Network. And although this document describes the precise format of TON Blockchain blocks, and discusses the blockchain's validity conditions and serialized invalidity proofs,\footnote{As of August 2018, this document does not include a detailed description of serialized invalidity proofs, because they are likely to change significantly during the development of the validator software. Only the general design principles for consistency conditions and serialized invalidity proofs are discussed.} it provides no details about the network protocols used to propagate these blocks, block candidates, collated blocks, and invalidity proofs.
Similarly, this document does not provide the complete source code of the masterchain smart contracts used to elect the validators, change the configurable parameters or get their current values, or punish the validators for their misbehavior, even though these smart contracts form an important part of the total blockchain state and of the masterchain block zero. Instead, this document describes the location of these smart contracts and their formal interfaces.\footnote{This is not included in the present version of this document, but will be provided in a separate appendix to a future revision.} The source code of these smart contracts will be provided separately as downloadable files with comments.
Please note that the current version of this document describes a preliminary test version of the TON Blockchain; some minor details are likely to change prior to launch during the development, testing, and deployment phases.
\clearpage
\tableofcontents
\clearpage
\mysection{Overview}\label{sect:overview}
This chapter provides an overview of the main features and design principles of the TON Blockchain. More detail on each topic is provided in subsequent chapters.
\mysubsection{Everything is a bag of cells}
All data in the blocks and state of the TON Blockchain is represented as a collection of {\em cells\/} (cf.~\cite[2.5]{TON}). Therefore, this chapter begins with a general discussion of cells.
\nxsubpoint\emb{TVM cells}
Recall that the TON Blockchain, as well as the TON Virtual Machine (TVM; cf.~\cite{TVM}), represents all permanently stored data as a {\em collection\/} or {\em bag\/} of so-called {\em cells}. Each cell consists of up to 1023 data bits and up to four references to other cells. Cyclic cell references are not allowed, so the cells are usually organized into {\em trees of cells\/}, or rather {\em directed acyclic graphs (DAGs) of cells}.\footnote{Completely identical cells are often identified in memory and in disk storage; this is the reason why trees of cells are transparently transformed into DAGs of cells. From this perspective, a DAG is just a storage optimization of the underlying tree of cells, irrelevant for most considerations.} Any value of an abstract algebraic (dependent) data type may be represented (serialized) as a tree of cells. The precise way of representing values of an abstract data type as a tree of cells is expressed by means of a {\em TL-B scheme\/}.\footnote{Cf.~\cite[3.3.3--4]{TVM}, where an example is given and explained, pending a more complete reference.} A more thorough discussion of different kinds of cells may be found in~\cite[3.1]{TVM}.
\nxsubpoint\emb{Application to TON Blockchain blocks and state}
The above is particularly applicable to the blocks and state of the TON Blockchain, which also are values of certain (quite convoluted) dependent algebraic data types. Therefore, they are serialized according to various TL-B schemes (which are gradually presented throughout this document), and are represented as a collection or bag of cells.
\nxsubpoint\label{sp:data.cell.layout}\emb{The layout of a single cell}
Each single cell consists of up to 1023 data bits and up to four references to other cells. When a cell is kept in memory, its exact representation is implementation-dependent. However, there is a standard representation of cells, useful, for instance, for serializing cells for file storage or network transmission. This ``standard representation'' or ``standard layout'' $\CellRepr(c)$ of a cell $c$ consists of the following:
\begin{itemize}
\item Two {\em descriptor bytes} come first, sometimes denoted by $d_1$ and $d_2$. The first of these bytes $d_1$ equals (in the simplest case) the number of references $0\leq r\leq 4$ in the cell. The second descriptor byte $d_2$ encodes the bit length $l$ of the data part of the cell as follows: the first seven bits of $d_2$ equal $\lfloor l/8\rfloor$, the number of complete data bytes present in the cell, while the last bit of $d_2$ is the {\em completion tag}, equal to one if $l$ is not divisible by eight. Therefore,
\begin{equation}
d_2=2\lfloor l/8\rfloor+[l\bmod 8\neq0]=\lfloor l/8\rfloor+\lceil l/8\rceil
\end{equation}
where $[A]$ equals one when condition $A$ is true, and zero otherwise.
\item Next, $\lceil l/8\rceil$ data bytes follow. This means that the $l$ data bits of the cell are split into groups of eight, and each group is interpreted as a big-endian 8-bit integer and stored into a byte. If $l$ is not divisible by eight, a single binary one and a suitable number of binary zeroes (up to six) are appended to the data bits, and the completion tag (the least significant bit of the descriptor byte $d_2$) is set.
\item Finally, $r$ references to other cells follow. Each reference is normally represented by 32 bytes containing the $\Sha$ hash of the referenced cell, computed as explained below in~\ptref{sp:sha.cell.hash}.
\end{itemize}
In this way, the standard representation $\CellRepr(c)$ of a cell $c$ with $l$ data bits and $r$ references is $2+\lceil l/8\rceil+32r$ bytes long: two descriptor bytes, $\lceil l/8\rceil$ data bytes, and 32 bytes for each of the $r$ references.
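As an illustrative sketch (the bit string and the reference count below are assumptions chosen for this example, not values taken from an actual block), consider an ordinary cell $c$ with the $l=13$ data bits \texttt{1011001011101} and $r=1$ reference to a cell $c'$. Then $\lfloor l/8\rfloor=1$ and $\lceil l/8\rceil=2$, so in the simplest case
\begin{equation*}
d_1=r=1,\qquad d_2=2\cdot 1+[13\bmod 8\neq0]=3.
\end{equation*}
The first eight data bits \texttt{10110010} form the byte \texttt{B2}; the remaining five bits \texttt{11101} are completed by a binary one and two binary zeroes to \texttt{11101100}, i.e., the byte \texttt{EC}. Hence
\begin{equation*}
\CellRepr(c)=\texttt{01}\;\texttt{03}\;\texttt{B2}\;\texttt{EC}\;\Sha(c'),
\end{equation*}
which occupies $2+2+32=36$ bytes.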
\nxsubpoint\label{sp:sha.cell.hash}\emb{The $\Sha$ hash of a cell}
The $\Sha$ hash of a cell $c$ is recursively defined as the $\Sha$ of the standard representation $\CellRepr(c)$ of the cell in question:
\begin{equation}
\Hash(c):=\Sha(c):=\Sha\bigl(\CellRepr(c)\bigr)
\end{equation}
Because cyclic cell references are not allowed (the relationships among all cells must constitute a directed acyclic graph, or DAG), the $\Sha$ hash of a cell is always well-defined.
Furthermore, because $\Sha$ is tacitly assumed to be collision-resistant, we assume that all the cells that we encounter are completely determined by their hashes. In particular, the cell references of a cell $c$ are completely determined by the hashes of the referenced cells, contained in the standard representation $\CellRepr(c)$.
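To illustrate the recursion with a minimal sketch (the cells $c_0$ and $c_1$ are hypothetical, and $d_1$ is taken to equal the number of references, as in the simplest case of~\ptref{sp:data.cell.layout}), let $c_0$ be an ordinary cell with no data bits and no references, and let $c_1$ be an ordinary cell with the eight data bits \texttt{11111111} and a single reference to $c_0$. The hash of $c_0$ must be computed first, because it is a part of the standard representation of $c_1$:
\begin{align*}
\CellRepr(c_0)&=\texttt{00}\;\texttt{00}, & \Hash(c_0)&=\Sha(\texttt{00}\;\texttt{00}),\\
\CellRepr(c_1)&=\texttt{01}\;\texttt{02}\;\texttt{FF}\;\Hash(c_0), & \Hash(c_1)&=\Sha\bigl(\CellRepr(c_1)\bigr).
\end{align*}
Here $\CellRepr(c_0)$ is 2 bytes long, and $\CellRepr(c_1)$ is $2+1+32=35$ bytes long.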
\nxsubpoint\emb{Exotic cells}
Apart from the {\em ordinary\/} cells (also called {\em simple\/} or {\em data\/} cells) considered so far, cells of other types, called {\em exotic cells}, sometimes appear in the actual representations of TON Blockchain blocks and other data structures. Their representation is somewhat different; they are distinguished by having the first descriptor byte $d_1\geq 5$ (cf.~\cite[3.1]{TVM}).
\nxsubpoint\emb{External reference cells}
{\em (External) reference cells}, which contain the 32-byte $\Sha(c)$ of a ``true'' data cell $c$ instead of the data cell itself, are one example of exotic cells. These cells can be used in the serialization of a bag of cells corresponding to a TON Blockchain block in order to refer to data cells absent in the serialization of the block itself, but assumed to be present somewhere else (e.g., in the previous state of the blockchain).
\nxsubpoint\emb{Transparency of reference cells with respect to most operations}
Most cell operations do not observe any reference cells or other ``exotic'' kinds of cells; they see only data cells, with any reference cell transparently replaced by the cell referred to. For example, when the {\em transparent\/} cell hash $\Hash^\flat(c)$ is recursively computed, the hash of a reference cell is set to be equal to the hash of the cell referred to, not the hash of the standard representation of the reference cell.
\nxsubpoint\emb{Transparent hash and representation hash of a cell}
In this way, $\Sha^\flat(c)=\Hash^\flat(c)$ is the {\em transparent hash} of a cell $c$ (or the tree of cells rooted in $c$).
However, sometimes we need to reason about the exact representation of a tree of cells present in a block. To this end, a {\em representation hash\/} $\Hash^\sharp(c)$ is defined, which is not transparent with respect to reference cells and other exotic types of cells. We often say that the representation hash of~$c$ is ``the'' hash of~$c$, because it is the most frequently used hash of a cell.
\nxsubpoint\label{sp:sign.repr.hash}\emb{Use of representation hashes for signatures}
Signatures are an excellent example of the application of representation hashes. For instance:
\begin{itemize}
\item Validators sign the representation hash of a block, not just its transparent hash, because they need to certify that the block does contain the required data, not just some external references to them.
\item When external messages are signed and sent by off-chain parties (e.g., human clients using an application to initiate blockchain transactions), if external references may be present in some of these messages, it is the representation hashes of the messages that must be signed.
\end{itemize}
\nxsubpoint\emb{Higher hashes of a cell}
In addition to the transparent and representation hashes of a cell~$c$, a sequence of {\em higher hashes\/} $\Hash_i(c)$, $i=1,2,\dots$ may be defined, which eventually stabilizes at $\Hash_\infty(c)$. (More detail may be found in~\cite[3.1]{TVM}.)
\mysubsection{Principal components of a block and the blockchain state}
This section briefly describes the principal components of a block and of the blockchain state, without delving too much into the details.
\nxsubpoint\label{sp:isp.blk.state}\emb{The Infinite Sharding Paradigm (ISP) applied to blockchain block and state}
Recall that according to the Infinite Sharding Paradigm, each account can be considered as lying in its separate ``accountchain'', and the (virtual) blocks of these accountchains are then grouped into shardchain blocks for efficiency purposes. Specifically, the state of a shardchain consists, roughly speaking, of the states of all its ``accountchains'' (i.e., of all accounts assigned to it); similarly, a block of a shardchain essentially consists of a collection of virtual ``blocks'' for some accounts assigned to the shardchain.\footnote{If there are no transactions related to an account, the corresponding virtual block is empty and is omitted from the shardchain block.}
We can summarize this as follows:
\begin{align}\label{eq:sstate.approx}
\textit{ShardState}&\approx\Hashmap(n,\textit{AccountState})\\
\textit{ShardBlock}&\approx\Hashmap(n,\textit{AccountBlock})
\end{align}
where $n$ is the bit length of the $\accountid$, and $\Hashmap(n,X)$ describes a partial map $\st2^n\dashrightarrow X$ from bitstrings of length $n$ into values of type~$X$.
Recall that each shardchain---or, more precisely, each shardchain block\footnote{Recall that TON Blockchain supports {\em dynamic\/} sharding, so the shard configuration may change from block to block because of shard merge and split events. Therefore, we cannot simply say that each shardchain corresponds to a fixed set of accountchains.}---corresponds to all accountchains that belong to the same ``workchain'' (i.e., have the same $\workchainid=w$) and have an $\accountid$ beginning with the same binary prefix $s$, so that $(w,s)$ completely determines a shard. Therefore, the above hashmaps must contain only keys beginning with prefix~$s$.
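As a small worked example (the prefix and the sample identifiers are chosen arbitrarily for illustration), let $s=\texttt{01}$, so that $|s|=2$. An account with $\workchainid=w$ and an $\accountid$ beginning with \texttt{01} belongs to the shard $(w,s)$, while accounts of the same workchain with identifiers beginning with \texttt{00}, \texttt{10}, or \texttt{11} do not. In general, the keys admissible in the hashmaps of~\eqref{eq:sstate.approx} for the shard $(w,s)$ form the set
\begin{equation*}
\{\,s\mathbin{\|}t : t\in\st2^{n-|s|}\,\}\subset\st2^n,
\end{equation*}
which contains $2^{n-|s|}$ of the $2^n$ possible values of $\accountid$; for $s=\texttt{01}$ this is exactly one quarter of them.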
We will see in a moment that the above description is only an approximation: the state and block of the shardchain need to contain some extra data that are not split according to the $\accountid$ as suggested by~\eqref{eq:sstate.approx}.
\nxsubpoint\label{sp:split.blk.part}\emb{Split and non-split part of the shardchain block and state}
A shardchain block and its state may each be classified into two distinct parts. The parts with the ISP-dictated form of \eqref{eq:sstate.approx} will be called the {\em split\/} parts of the block and its state, while the remainder will be called the {\em non-split\/} parts.
\nxsubpoint\label{sp:blk.inter.1}\emb{Interaction with other blocks and the outside world. Global and local consistency conditions}
The non-split parts of the shardchain block and its state are mostly related to the interaction of this block with some other ``neighboring'' blocks. The global consistency conditions of the blockchain as a whole are reduced to internal consistency conditions of separate blocks by themselves as well as external local consistency conditions between certain blocks (cf.~\ptref{p:cons.cond}).
Most of these local consistency conditions are related to message forwarding between different shardchains, transactions involving more than one shardchain, and message delivery guarantees. However, another group of local consistency conditions relates a block with its immediate antecessors and successors inside a shardchain; for instance, the initial state of a block usually must coincide with the final state of its immediate antecessor.\footnote{This condition applies if there is exactly one immediate antecessor (i.e., if a shardchain merge event did not occur immediately before the block in question); otherwise, this condition becomes more convoluted.}
\nxsubpoint\emb{Inbound and outbound messages of a block}
The most important components of the non-split part of a shardchain block are the following:
\begin{itemize}
\item {\em InMsgDescr} --- The description of all messages ``imported'' into this block (i.e., either processed by a transaction included in the block, or forwarded to an output queue, in the case of a transit message travelling along the path dictated by Hypercube Routing).
\item {\em OutMsgDescr} --- The description of all messages ``exported'' or ``generated'' by the block (i.e., either messages generated by a transaction included in the block, or transit messages with destination not belonging to the current shardchain, forwarded from {\em InMsgDescr}).
\end{itemize}
\nxsubpoint\label{sp:blk.hdr}\emb{Block header}
Another non-split component of a shardchain block is the {\em block header}, which contains general information such as $(w,s)$ (i.e., the $\workchainid$ and the common binary prefix of all $\accountid$s assigned to the current shardchain), the block's {\em sequence number\/} (defined to be the smallest non-negative integer larger than the sequence numbers of its predecessors), {\em logical time}, and {\em generation unixtime}. It also contains the hash of the immediate antecessor of the block (or of its two immediate antecessors in the case of a preceding shardchain merge event), the hashes of its initial and final states (i.e., of the states of the shardchain immediately before and immediately after processing the current block), and the hash of the most recent masterchain block known when the shardchain block was generated.
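The definition of the sequence number can be spelled out as follows (a sketch only; the notation $\blkseqno(B)$ for the sequence number of a block $B$ and the block names $B$, $B'$, $B''$ are introduced here for illustration, and sequence numbers are assumed to increase along each chain):
\begin{equation*}
\blkseqno(B)=\begin{cases}
0 & \text{if $B$ has no antecessors,}\\
1+\blkseqno(B') & \text{if $B'$ is the only immediate antecessor,}\\
1+\max\bigl(\blkseqno(B'),\blkseqno(B'')\bigr) & \text{if $B'$, $B''$ are the two immediate antecessors.}
\end{cases}
\end{equation*}
For example, if two shardchain blocks with sequence numbers 17 and 20 are merged, the resulting block has sequence number 21.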
\nxsubpoint\label{sp:val.sign}\emb{Validator signatures, signed and unsigned blocks}
The block described so far is an {\em unsigned block}; it is generated in its entirety and considered as a whole by the validators. When the validators ultimately sign it, the {\em signed block} is created, consisting of the unsigned block along with a list of validator signatures (of a certain representation hash of the unsigned block, cf.~\ptref{sp:sign.repr.hash}). This list of signatures is also a non-split component of the (signed) block; however, since it lies outside the unsigned block, it is somewhat different from the other data kept in a block.
\nxsubpoint\label{sp:outmsgq}\emb{Outbound message queue of a shardchain}
Similarly, the most important non-split part of the shardchain state is {\em OutMsgQueue}, the outbound message queue. It contains {\em undelivered\/} messages included into {\em OutMsgDescr\/}, either by the last shardchain block leading to this state or by one of its antecessors.
Initially, each outbound message is included into {\em OutMsgQueue}; it is removed from the queue only after it has either been included into the {\em InMsgDescr\/} of a block of a ``neighboring'' shardchain (the next one with respect to Hypercube Routing), or has been delivered to (i.e., has appeared in the {\em InMsgDescr\/} of) its ultimate destination shardchain via Instant Hypercube Routing. In both cases, the {\em reason\/} for the removal of a message from the {\em OutMsgQueue\/} is made explicit in the {\em OutMsgDescr\/} of the block in which such a state transformation has occurred.
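Schematically, and treating the queue as a set of messages rather than as a hashmap (a simplification made for this sketch only; the symbols $Q$, $Q'$, $E$, and $D$ are introduced here for illustration), the effect of one block on the outbound message queue can be written as
\begin{equation*}
Q'=(Q\cup E)\setminus D,
\end{equation*}
where $Q$ and $Q'$ are the contents of {\em OutMsgQueue\/} immediately before and immediately after the block, $E$ is the set of outbound messages newly included into the queue by the block, and $D$ is the set of messages removed from the queue because they have been imported by a neighboring shardchain or delivered to their destination shardchain.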
\nxsubpoint\emb{Layout of {\em InMsgDescr}, {\em OutMsgDescr} and {\em OutMsgQueue}}
All of the most important non-split shardchain data structures related to messages are organized as {\em hashmaps\/} or {\em dictionaries\/} (implemented by means of Patricia trees serialized into a tree of cells as described in \cite[3.3]{TVM}), with the following keys:
\begin{itemize}
\item The inbound message description {\em InMsgDescr\/} uses the 256-bit message hash as a key.
\item The outbound message description {\em OutMsgDescr\/} uses the 256-bit message hash as a key.
\item The outbound message queue {\em OutMsgQueue\/} uses the 352-bit concatenation of the 32-bit destination $\workchainid$, the first 64 bits of destination address $\accountid$, and the 256-bit message hash as a key.
\end{itemize}
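The key length of {\em OutMsgQueue\/} is simply the sum of the lengths of its three components. Writing $m$ for a message, $w'$ for its 32-bit destination $\workchainid$, and $a'_{0..63}$ for the first 64 bits of its destination $\accountid$ (this notation is introduced here only for illustration), the key is
\begin{equation*}
w'\mathbin{\|}a'_{0..63}\mathbin{\|}\Hash(m),\qquad 32+64+256=352\text{ bits.}
\end{equation*}
Messages with the same destination workchain and the same 64-bit address prefix therefore share the first 96 bits of their keys, so they lie in a common subtree of the Patricia tree representing the queue.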
\nxsubpoint\emb{The split part of the block: transaction chains}
The split part of a shardchain block consists of a hashmap mapping some of the accounts assigned to the shardchain to ``virtual accountchain blocks'' {\em AccountBlock}, cf.~\eqref{eq:sstate.approx}. Such a virtual accountchain block consists of a sequential list of {\em transactions} related to that account.
\nxsubpoint\emb{Transaction description}
Each transaction is described in the block by an instance of the {\em Transaction\/} type, which contains in particular the following information:
\begin{itemize}
\item A reference to exactly one {\em inbound message\/} (which must be present in {\em InMsgDescr\/} as well) that has been {\em processed\/} by the transaction.
\item References to several (maybe zero) {\em outbound messages\/} (also present in {\em OutMsgDescr\/} and most likely included in {\em OutMsgQueue}) that have been {\em generated\/} by the transaction.
\end{itemize}
The transaction consists of an invocation of TVM (cf.~\cite{TVM}) with the code of the smart contract corresponding to the account in question loaded into the virtual machine, and with the data root cell of the smart contract loaded into the virtual machine's register \texttt{c4}. The inbound message itself is passed on the stack as an argument to the smart contract's {\tt main()} function, along with some other important data, such as the amount of TON Grams and other defined currencies attached to the message, the sender account address, the current balance of the smart contract, and so on.
In addition to the information listed above, a {\em Transaction\/} instance also contains the original and final states of the account (i.e., of the smart contract), as well as some of the TVM running statistics (gas consumed, gas price, instructions performed, cells created/destroyed, virtual machine termination code, etc.).
\nxsubpoint\emb{The split part of the shardchain state: account states}
Recall that, according to \eqref{eq:sstate.approx}, the split part of the shardchain state consists of a hashmap mapping each ``defined'' account identifier (belonging to the shardchain in question) to the {\em state\/} of the corresponding account, given by an instance of the {\em AccountState\/} type.
\nxsubpoint\emb{Account state}
The account state itself approximately consists of the following data:
\begin{itemize}
\item Its {\em balance} in Grams and (optionally) in some other defined cryptocurrencies/tokens.
\item The {\em smart-contract code}, or the hash of the smart-contract code if it will be provided (uploaded) later by a separate message.
\item The persistent {\em smart-contract data}, which can be empty for simple smart contracts. It is a tree of cells, the root of which is loaded into register {\tt c4} during smart-contract execution.
\item Its {\em storage usage statistics}, including the number of cells and bytes kept in the persistent storage of the smart contract (i.e., inside the blockchain state) and the last time a storage usage payment was exacted from this account.
\item An optional {\em formal interface description} (intended for smart contracts) and/or {\em user public information} (intended mostly for human users and organizations).
\end{itemize}
Notice that there is no distinction between ``smart contract'' and ``account'' in the TON Blockchain. Instead, ``simple'' or ``wallet'' accounts, typically employed by human users and their cryptocurrency wallet applications for simple cryptocurrency transfers, are just simple smart contracts with standard (shared) code and with persistent data consisting of the public key of the wallet (or several public keys in the case of a multi-signature wallet; cf.~\ptref{sp:ex.simple.wallet} for more detail).
\nxsubpoint\emb{Masterchain blocks}
In addition to shardchain blocks and their states, the TON Blockchain contains {\em masterchain blocks\/} and the {\em masterchain state\/} (also called the {\em global state}). The masterchain blocks and state are quite similar to the shardchain blocks and state considered so far, with some notable differences:
\begin{itemize}
\item The masterchain cannot be split or merged, so a masterchain block usually has exactly one immediate antecessor. The sole exception is the ``masterchain block zero'', distinguished by having a sequence number equal to zero; it has no antecessors at all, and contains the initial configuration of the whole TON Blockchain (e.g., the original set of validators).
\item The masterchain blocks contain another important non-split structure: {\em ShardHashes}, a binary tree with a list of all defined shardchains along with the hashes of the latest block inside each of the listed shardchains. It is the inclusion of a shardchain block into this structure that makes a shardchain block ``canonical'', and enables other shardchains' blocks to refer to data (e.g., outbound messages) contained in the shardchain block.
\item The state of the masterchain contains global configuration parameters of the whole TON Blockchain, such as the minimum and maximum gas prices, the supported versions of TVM, the minimum stake for the validator candidates, the list of alternative cryptocurrencies supported in addition to Grams, the total amount of Grams issued so far, and the current set of validators responsible for creating and signing new blocks, along with their public keys.
\item The state of the masterchain also contains the code of the smart contracts used to elect the subsequent sets of validators and to modify the global configuration parameters. The code of these smart contracts itself is a part of the global configuration parameters and can be modified accordingly. In this respect, this code (along with the current values of these parameters) functions like a ``constitution'' for the TON Blockchain. It is initially established in masterchain block zero.
\item There are no transit messages through the masterchain: each inbound message must have a destination inside the masterchain, and each outbound message must have a source inside the masterchain.
\end{itemize}
\mysubsection{Consistency conditions}\label{p:cons.cond}
In addition to the data structures contained in the block and in the blockchain state, which are serialized into bags of cells according to certain TL-B schemes explained in detail later (cf. Chapters \ptref{sect:msg}--\ptref{sect:block.layout}), an important component of the blockchain layout is the {\em consistency conditions\/} between data kept inside one or in different blocks (as mentioned in~\ptref{sp:blk.inter.1}). This section describes in detail the function of consistency conditions in the blockchain.
\nxsubpoint\emb{Expressing consistency conditions}
In principle, dependent data types (such as those used in TL-B) could be used not only to describe the serialization of block data, but also to express conditions imposed on the components of such data types. (For instance, one could define a data type \textit{OrderedIntPair} whose values are pairs of integers $(x,y)$ such that $x<y$.) However, TL-B is currently not expressive enough to encode all the consistency conditions we need, so we opt for a semi-formalized approach in this text. In the future, we may present a complete formalization in a suitable proof assistant such as Coq.
\nxsubpoint\emb{Importance of consistency conditions}
The consistency conditions ultimately are at least as important as the ``unrestricted'' data structures on which they are imposed, especially in the blockchain context. For instance, the consistency conditions ensure that the state of an account does not change between blocks, and that it can change within a block only as a result of a transaction. In this way, the consistency conditions ensure the safe storage of cryptocurrency balances and other information inside the blockchain.
\nxsubpoint\emb{Kinds of consistency conditions}
There are several kinds of consistency conditions imposed on the TON Blockchain:
\begin{itemize}
\item {\em Global conditions} --- Express the invariants throughout the entire TON Blockchain. For instance, the {\em message delivery guarantees}, which assert that each message generated must be delivered to its destination account and delivered exactly once, are part of the global conditions.
\item {\em Internal (local) conditions} --- Express the conditions imposed on the data kept inside one block. For example, each transaction included in the block (i.e., present in the transaction list of some account) processes exactly one inbound message; this inbound message must be listed in the {\em InMsgDescr\/} structure of the block as well.
\item {\em External (local) conditions} --- Express the conditions imposed on the data of different blocks, usually belonging to the same or to neighboring shardchains (with respect to Hypercube Routing). Therefore, the external conditions come in several flavors:
\begin{itemize}
\item {\em Antecessor/successor conditions} --- Express the conditions imposed on the data of some block and of its immediate antecessor or (in the case of a preceding shardchain merge event) two immediate antecessors. The most important of these conditions is the one stating that the initial state for a shardchain block must coincide with the final shardchain state of the immediate antecessor block, provided no shardchain split/merge event happened in between.
\item {\em Masterchain/shardchain conditions} --- Express the conditions imposed on a shardchain block and on the masterchain block that refers to it in its {\em ShardHashes\/} list or is referred to in the header of the shardchain block.
\item {\em Neighbor (block) conditions} --- Express the relations between the blocks of neighboring shardchains with respect to Hypercube Routing. The most important of these conditions express the relation between the {\em InMsgDescr\/} of a block and the {\em OutMsgQueue\/} of the state of a neighboring block.
\end{itemize}
\end{itemize}
\nxsubpoint\emb{Decomposition of global and local conditions into simpler local conditions}
The {\em global\/} consistency conditions, such as the message delivery guarantees, are truly necessary for the blockchain to work properly; however, they are hard to enforce and verify directly. Therefore, we instead introduce many simpler {\em local\/} consistency conditions, which are easier to enforce and verify since they involve only one block, or perhaps two adjacent blocks. These local conditions are chosen in such a fashion that the desired global conditions are logical consequences of (the conjunction of) all the local conditions. In this respect, we say that the global conditions have been ``decomposed'' into simpler local conditions.
Sometimes a local condition still turns out to be too cumbersome to enforce or verify. In that case it is decomposed further, into even simpler local conditions.
\nxsubpoint\emb{Decomposition may require additional data structures and additional internal consistency conditions}
The decomposition of a condition into simpler local consistency conditions sometimes requires the introduction of additional data structures. For example, the {\em InMsgDescr\/} explicitly lists all inbound messages processed in a block, even if this list might have been obtained by scanning the list of all the transactions present in the block. However, {\em InMsgDescr\/} greatly simplifies the neighbor conditions related to message forwarding and routing, which ultimately add up to the global message delivery guarantees.
Notice that the introduction of such additional data structures is a sort of ``database denormalization'' (i.e., it leads to some redundancy, or to some data being present more than once), and therefore more internal consistency conditions need to be imposed (e.g., if some data are now present in two copies, we must require that these two copies coincide). For instance, once we introduce {\em InMsgDescr\/} to facilitate message forwarding between shardchains, we need to introduce internal consistency conditions relating {\em InMsgDescr\/} to the transaction list of the same block.
\nxsubpoint\emb{Correct serialization conditions}
Apart from the high-level internal consistency conditions, which treat the contents of a block as a value of an abstract data type, there are some lower-level internal consistency conditions, called ``(correct) serialization conditions'', which ensure that the tree of cells present in the block is indeed a valid serialization of a value of the expected abstract data type. Such serialization conditions can be automatically generated from the TL-B scheme describing the abstract data type and its serialization into a tree of cells.
Notice that the serialization conditions are a set of mutually recursive predicates on cells or cell slices. For example, if a value of type $A$ consists of a 32-bit magic number $m_A$, a 64-bit integer $l$, and two references to cells containing values of types $B$ and $C$, respectively, then the correct serialization condition for values of type $A$ will require a cell or a cell slice to contain exactly 96 data bits and two cell references $r_1$ and~$r_2$, with the additional requirements that the first 32 data bits contain $m_A$, and the two cells referred to by $r_1$ and $r_2$ satisfy the serialization conditions for values of types $B$ and~$C$, respectively.
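The example above can be sketched as a small predicate on cells. This is only an illustration under stated assumptions: the class \texttt{Cell}, the magic numbers, and the trivial layouts chosen for types $B$ and $C$ are hypothetical and do not reflect the actual TON cell format or any real TL-B scheme.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Cell:
    """Toy cell: data bits as a '0'/'1' string plus references (not the real TON layout)."""
    bits: str
    refs: List["Cell"] = field(default_factory=list)


# Hypothetical 32-bit magic numbers for the types A, B and C.
MAGIC_A = format(0xA0A0A0A0, "032b")
MAGIC_B = format(0xB0B0B0B0, "032b")
MAGIC_C = format(0xC0C0C0C0, "032b")


def is_valid_B(cell: Cell) -> bool:
    # Assumed layout of B: only its magic number, no references.
    return cell.bits == MAGIC_B and not cell.refs


def is_valid_C(cell: Cell) -> bool:
    # Assumed layout of C: only its magic number, no references.
    return cell.bits == MAGIC_C and not cell.refs


def is_valid_A(cell: Cell) -> bool:
    # Exactly 96 data bits (32-bit magic m_A + 64-bit integer l) and two references.
    if len(cell.bits) != 96 or len(cell.refs) != 2:
        return False
    # The first 32 data bits must contain the magic number m_A.
    if cell.bits[:32] != MAGIC_A:
        return False
    # The referenced cells must satisfy the conditions for B and C, in this order.
    r1, r2 = cell.refs
    return is_valid_B(r1) and is_valid_C(r2)
```

In a real scheme the predicates for $B$ and $C$ would themselves inspect further references, which is what makes the family of predicates mutually recursive.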
\nxsubpoint\label{sp:c.exist.elim}
\emb{Constructive elimination of existence quantifiers}
The local conditions one might want to impose sometimes are {\em non-constructible}, meaning that they do not necessarily contain an explanation of why they are true. A typical example of such a condition $C$ is given by
\begin{equation}\label{eq:nonc.sample}
C:\equiv\forall_{(x:X)}\exists_{(y:Y)}A(x,y)\quad,
\end{equation}
``for any $x$ from $X$, there is a $y$ from $Y$ such that condition $A(x,y)$ holds''. Even if we know $C$ to be true, we do not have a way of quickly finding a $y:Y$, such that $A(x,y)$, for a given $x:X$. As a consequence, the verification of $C$ may be quite time-consuming.
In order to simplify the verification of local conditions, they are made {\em constructible\/} (i.e., verifiable in bounded time) by adding some {\em witness\/} data structures. For instance, condition $C$ of \eqref{eq:nonc.sample} may be transformed by adding a new data structure $f:X\to Y$ (a map $f$ from $X$ to $Y$) and imposing the following condition $C'$ instead:
\begin{equation}
C':\equiv\forall_{(x:X)}A\bigl(x,f(x)\bigr)\quad.
\end{equation}
Of course, the ``witness'' value $f(x):Y$ may be included inside the (modified) data type $X$ instead of being kept in a separate table~$f$.
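The difference between verifying $C$ and verifying $C'$ can be illustrated by the following minimal sketch. The function names and the sample predicate \texttt{is\_square\_root} are hypothetical and serve only to show the change in verification cost; finite sets are assumed for $X$ and $Y$.

```python
from typing import Callable, Dict, Iterable


def verify_by_search(xs: Iterable[int], ys: Iterable[int],
                     A: Callable[[int, int], bool]) -> bool:
    """Check C: for every x there is some y with A(x, y).

    Without a witness we may need up to |X| * |Y| evaluations of A.
    """
    candidates = list(ys)
    return all(any(A(x, y) for y in candidates) for x in xs)


def verify_with_witness(xs: Iterable[int], f: Dict[int, int],
                        A: Callable[[int, int], bool]) -> bool:
    """Check C': A(x, f(x)) holds for every x.

    The witness map f reduces the work to |X| evaluations of A.
    """
    return all(x in f and A(x, f[x]) for x in xs)


def is_square_root(x: int, y: int) -> bool:
    # Sample condition A(x, y): y is a square root of x.
    return y * y == x
```

For example, with $X=\{1,4,9\}$ the witness map \texttt{\{1: 1, 4: 2, 9: 3\}} lets a verifier confirm the condition with three evaluations of $A$, whereas the search variant has to scan candidate values of $y$ for each $x$.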
\nxsubpoint\label{sp:ex.exist.elim}\emb{Example: consistency condition for {\em InMsgDescr}}
For instance, the consistency condition between $X:=\textit{InMsgDescr}$, the list of all inbound messages processed in a block, and $Y:=\textit{Transactions}$, the list of all transactions present in a block, is of the above sort: ``For any inbound message $x$ present in \textit{InMsgDescr}, a transaction $y$ must be present in the block such that $y$ processes $x$''.\footnote{This example is a bit simplified since it does not take into account the presence of transit messages in \textit{InMsgDescr}, which are not processed by any explicit transaction.} The procedure of $\exists$-elimination described in \ptref{sp:c.exist.elim} leads us to introduce an additional field in the inbound message descriptors of \textit{InMsgDescr}, containing a reference to the transaction in which the message is actually processed.
\nxsubpoint\label{sp:c.disj.elim}
\emb{Constructive elimination of logical disjunctions}
Similarly to the transformation described in~\ptref{sp:c.exist.elim}, condition
\begin{equation}
D:\equiv\forall_{(x:X)}\bigl(A_1(x)\vee A_2(x)\bigr)\quad,
\end{equation}
``for all $x$ from $X$, at least one of $A_1(x)$ and $A_2(x)$ holds'', may be transformed into a function $i:X\to\st2=\{1,2\}$ and a new condition
\begin{equation}
D':\equiv\forall_{(x:X)}A_{i(x)}(x)\quad.
\end{equation}
This is a special case of the existential quantifier elimination considered before for $Y=\st2=\{1,2\}$. It is useful when $A_1(x)$ and $A_2(x)$ are complicated conditions that cannot be verified quickly, because the verifier then knows in advance which of the two is claimed to hold and needs to check only that one.
For instance, \textit{InMsgDescr\/}, as considered in~\ptref{sp:ex.exist.elim}, can contain both messages processed in the block and transit messages. We might introduce a field in the inbound message descriptor to indicate whether the message is transit or not, and, in the latter case, include a witness field for the transaction processing the message.
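A descriptor carrying both the selector and the witness might be checked as in the sketch below. The record \texttt{InMsgEntry}, the map \texttt{tx\_inbound} (transaction identifier to the inbound message it processes), and the set \texttt{forwarded} (messages the block forwards further) are hypothetical simplifications and not the actual block layout.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class InMsgEntry:
    msg_id: str
    is_transit: bool               # selector i(x): which disjunct is claimed to hold
    tx_id: Optional[str] = None    # witness: processing transaction (non-transit only)


def check_in_msg_descr(entries: List[InMsgEntry],
                       tx_inbound: Dict[str, str],
                       forwarded: Set[str]) -> bool:
    """Verify each entry by checking only the disjunct selected by its flag."""
    for entry in entries:
        if entry.is_transit:
            # Claimed disjunct: the message is a transit message forwarded by the block.
            if entry.msg_id not in forwarded:
                return False
        else:
            # Claimed disjunct: the referenced transaction processes this message.
            if entry.tx_id is None or tx_inbound.get(entry.tx_id) != entry.msg_id:
                return False
    return True
```

Each entry costs a single lookup, since the verifier never has to try both alternatives.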
\nxsubpoint\label{sp:cond.cvize}\emb{Constructivization of conditions}
This process of eliminating the non-constructible logical binders $\exists$ (existence quantifier) and (sometimes) $\vee$ (logical disjunction) by introducing additional data structures and fields---that is, the process of making a condition constructible---will be called {\em constructivization}. If taken to its theoretical limit, this process leads to logical formulas containing only universal quantifiers and logical conjunctions, at the expense of adding some witness fields into certain data structures.
\nxsubpoint\emb{Validity conditions for a block}
Ultimately, all of the internal conditions for a block, along with the local antecessor and neighbor conditions involving this block and another previously generated block, constitute the {\em validity conditions\/} for a shardchain or masterchain block. A block is {\em valid\/} if it satisfies the validity conditions. It is the responsibility of validators to generate valid blocks, as well as check the validity of blocks generated by other validators.
\nxsubpoint\emb{Witnesses of the invalidity of a block}
If a block does not satisfy all of the validity conditions $C_1$, \dots, $C_n$ (i.e., the conjunction $V:\equiv\bigwedge_i C_i$ of the validity conditions), it is {\em invalid}. This means that it satisfies the ``invalidity condition'' $\neg V=\bigvee_i\neg C_i$. If all of the $C_i$---and hence, also $V$---have been ``constructivized'' in the sense described in~\ptref{sp:cond.cvize}, so that they contain only logical conjunctions and universal quantifiers (and simple atomic propositions), then $\neg V$ contains only logical disjunctions and existential quantifiers. Then a constructivization of $\neg V$ may be defined, which would involve an {\em invalidity witness}, starting with an index $i$ of the specific validity condition $C_i$ which fails.
Such invalidity witnesses may also be serialized and presented to other validators or committed into the masterchain to prove that a specific block or block candidate is in fact invalid. Therefore, the construction and serialization of invalidity witnesses is an important part of a Proof-of-Stake (PoS) blockchain design.\footnote{It is interesting to note that this part of the work can be done almost automatically.}
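As an illustration, suppose every constructivized condition $C_i$ has the form ``for all items $x$ of the block, an atomic proposition holds''. An invalidity witness is then a pair consisting of the index $i$ of the failing condition and the position of an offending item. The following sketch uses hypothetical names and models block items as plain integers.

```python
from typing import Callable, Optional, Sequence, Tuple

# An atomic proposition evaluated on a single item of the block.
Atom = Callable[[int], bool]


def find_invalidity_witness(items: Sequence[int],
                            conditions: Sequence[Atom]) -> Optional[Tuple[int, int]]:
    """Return (i, k) such that conditions[i] fails on items[k], or None if all hold."""
    for i, cond in enumerate(conditions):
        for k, item in enumerate(items):
            if not cond(item):
                return (i, k)
    return None


def check_invalidity_witness(items: Sequence[int],
                             conditions: Sequence[Atom],
                             witness: Tuple[int, int]) -> bool:
    """Confirm a claimed witness by evaluating one atomic proposition on one item."""
    i, k = witness
    if not (0 <= i < len(conditions) and 0 <= k < len(items)):
        return False
    return not conditions[i](items[k])
```

Finding a witness may require scanning the whole block, but checking a presented witness touches one condition and one item only, which is what makes such witnesses convenient to pass to other validators.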
\nxsubpoint\emb{Minimizing the size of witnesses}
An important consideration for the design of the local conditions, their decomposition into simpler conditions, and their constructivization is to make the verification of each condition as simple as possible. However, another requirement is that we should minimize the size of witnesses both for a condition (so that block size does not grow too much during the constructivization process) and for its negation (so that the invalidity proofs have bounded size, which simplifies their verification, transmission, and inclusion into the masterchain). These two design principles are sometimes at odds, and a compromise must then be sought.
\nxsubpoint\emb{Minimizing the size of Merkle proofs}
The consistency conditions are originally intended to be processed by a party who already has all the relevant data (e.g., all the blocks mentioned in the condition). On some occasions, however, they must be verified by a party who does not have all the blocks in question, but knows only their hashes. For example, suppose that a block invalidity proof were augmented by the signature of a validator that had signed an invalid block (and therefore would have to be punished). In this case, the signature would contain only the hash of the wrongly signed block; the block itself would have to be recovered from a different place before verifying the block invalidity proof.
A compromise between providing only the hash of the supposedly invalid block and providing the entire invalid block along with the invalidity witness is to augment the invalidity witness by a Merkle proof starting from the hash of the block (i.e., of the root cell of the block). Such a proof would include all the cells referred to in the invalidity witness, along with all the cells on the paths from these cells to the root cell and the hashes of their siblings. Then an invalidity proof becomes self-contained enough to provide sufficient justification on its own for punishing a validator. For example, the invalidity proof suggested above might be presented to a smart contract residing in the masterchain that punishes the validators for incorrect behavior.
Since such an invalidity proof must be augmented by a Merkle proof, it makes sense to write the consistency conditions so that the Merkle proofs for their negations would be as small as possible. In particular, each individual condition must be as ``local'' as possible (i.e., involve a minimal number of cells). This also optimizes the verification time of the invalidity proof.
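The shape of such a Merkle proof can be sketched as follows. The hashing rule used here (SHA-256 over the length-prefixed data followed by the hashes of the referenced cells) is a deliberate simplification chosen for this illustration and is not the actual TON cell hash; the names \texttt{Cell}, \texttt{Pruned}, \texttt{make\_proof}, and \texttt{verify\_proof} are hypothetical.

```python
import hashlib
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Cell:
    data: bytes
    refs: List["Node"] = field(default_factory=list)


@dataclass
class Pruned:
    """Placeholder for an omitted subtree; only its hash is kept in the proof."""
    hash: bytes


Node = Union[Cell, Pruned]


def cell_hash(node: Node) -> bytes:
    # Simplified hash: length-prefixed data, then the hashes of all children.
    if isinstance(node, Pruned):
        return node.hash
    h = hashlib.sha256()
    h.update(len(node.data).to_bytes(2, "big"))
    h.update(node.data)
    for ref in node.refs:
        h.update(cell_hash(ref))
    return h.digest()


def make_proof(root: Cell, path: List[int]) -> Cell:
    """Keep the cells along `path` (child indices from the root); prune all siblings."""
    refs: List[Node] = []
    for i, ref in enumerate(root.refs):
        if path and i == path[0]:
            refs.append(make_proof(ref, path[1:]))
        else:
            refs.append(Pruned(cell_hash(ref)))
    return Cell(root.data, refs)


def verify_proof(proof: Node, block_hash: bytes, path: List[int]) -> bytes:
    """Return the data of the cell at `path` if the proof hashes to `block_hash`."""
    if cell_hash(proof) != block_hash:
        raise ValueError("proof does not match the block hash")
    node = proof
    for i in path:
        if isinstance(node, Pruned):
            raise ValueError("path runs into a pruned subtree")
        node = node.refs[i]
    if isinstance(node, Pruned):
        raise ValueError("target cell is not included in the proof")
    return node.data
```

The proof contains one cell per level of the path plus one hash per sibling, so a condition that touches few cells yields a small proof.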
\nxsubpoint\emb{Collated data for the external conditions}
When a validator suggests an unsigned block to the other validators of a shardchain, these other validators must check the validity of this block candidate---i.e., verify that it satisfies all of the internal and external local consistency conditions. While the internal conditions do not require any extra data in addition to the block candidate itself, the external conditions need some other blocks, or at least some information out of those blocks. Such additional information may be extracted from those blocks, along with all cells on the paths from the cells containing the required additional information to the root cell of the corresponding blocks and the hashes of the siblings of the cells on these paths, to present a Merkle proof that can be processed without knowledge of the referred blocks themselves.
This additional information, called {\em collated data}, is serialized as a bag of cells and presented by the validator along with the unsigned block candidate itself. The block candidate along with the collated data is called a {\em collated block}.
\nxsubpoint\emb{Conditions for a collated block}
The {\em external\/} consistency conditions for a block candidate are thus (automatically) transformed into {\em internal\/} consistency conditions for a collated block, which greatly simplifies and speeds up their verification by the other validators. However, some data---such as the final state of the immediate antecessor of the block being validated---is not collated. Instead, all validators are supposed to keep a local copy of this data.
\nxsubpoint\emb{Representation conditions and representation hashes}
Notice that once Merkle proofs are included into a collated block, the consistency conditions must take into account which data (i.e., which cells) are actually present in the collated block, and not just referred to by their hashes. This leads to a new group of conditions, called {\em representation conditions}, which must be able to distinguish an external cell reference (usually represented by its 256-bit hash) from the cell itself. A validator can be punished for suggesting a collated block that does not contain all of the expected collated data inside, even if the block candidate itself is valid.
This also leads to the utilization of {\em representation hashes} instead of {\em transparent hashes} for collated blocks.
\nxsubpoint\emb{Verification in the absence of the collated data}
Notice that a block must still be verifiable in the absence of the collated data; otherwise, no party except the validators would be able to check a previously committed block by its own means. In particular, witnesses cannot be included into the collated data: they must reside in the block itself. The collated data must contain only some portions of neighboring blocks referred to in the principal block along with suitable Merkle proofs, which can be reconstructed by anybody who has the referenced blocks themselves.
\nxsubpoint\emb{Inclusion of Merkle proofs in the block itself}
Notice that on some occasions Merkle proofs must be embedded into the block itself, and not just into collated data. For instance:
\begin{itemize}
\item During Instant Hypercube Routing (IHR), a message may be included directly into the \textit{InMsgDescr\/} of a block of the destination shardchain, without travelling all the way along the edges of the hypercube. In this case, a Merkle proof of the existence of the message in the \textit{OutMsgDescr\/} of a block of the originating shardchain must be included into \textit{InMsgDescr\/} along with the message itself.
\item An invalidity proof, or another proof of validator misbehavior, may be committed into the masterchain by including it in the body of a message sent to a special smart contract. In this case, the invalidity proof must include some cells along with a Merkle proof, which must therefore be contained in a message body.
\item Similarly, a smart contract defining a payment channel, or another kind of side-chain, may accept finalization messages or misbehavior proof messages that contain suitable Merkle proofs.
\item The final state of a shardchain is not included into a shardchain block. Instead, only the cells that have been modified are included; those cells that are inherited from the old state are referred to by their hashes, along with suitable Merkle proofs consisting of the cells on the path from the root of the old state to the cells of the old state referred to.
\end{itemize}
\nxsubpoint\emb{Provisions for handling incomplete data}
As we have seen, it is necessary to include incomplete data and Merkle proofs into the body of a block, into the body of some messages contained in a block, and into the state. This necessity is reflected by some extra representation conditions, as well as provisions for the messages (and by extension, the cell trees processed by TVM) to contain incomplete data (external cell references and Merkle proofs). In most cases, such external cell references contain only the 256-bit $\Sha$ hash of a cell along with a flag; if a smart contract attempts to inspect the contents of such a cell by a {\tt CTOS} primitive (e.g., for deserialization), an exception is triggered. However, an external reference to such a cell can be stored into the smart contract's persistent storage, and both the transparent and the representation hashes of such a cell can be computed.
\mysubsection{Logical time and logical time intervals}
This section takes a closer look at so-called {\em logical time}, extensively used in the TON Blockchain for message forwarding and message delivery guarantees, among other purposes.
\nxsubpoint\label{sp:logic.time}\emb{Logical time}
A component of the TON Blockchain that also plays an important role in message delivery is the {\em logical time}, usually denoted by $\LT$. It is a non-negative 64-bit integer, assigned to certain events roughly as follows:
\begin{quote}
If an event $e$ logically depends on events $e_1$, \dots, $e_n$, then $\LT(e)$ is the smallest non-negative integer greater than all $\LT(e_i)$.
\end{quote}
In particular, if $n=0$ (i.e., if $e$ does not depend on any prior events), then $\LT(e)=0$.
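The rule above amounts to computing, for each event, one plus the maximum logical time of its dependencies, with $0$ for events that have none. A minimal sketch, assuming the dependency relation is given as a finite acyclic map from event names to lists of prerequisite events (the function name and the representation are hypothetical):

```python
from typing import Dict, List


def strict_logical_times(deps: Dict[str, List[str]]) -> Dict[str, int]:
    """Assign strict logical time to every event.

    deps[e] lists the events that e logically depends on. The relation must be
    acyclic; a cycle would make the recursion below fail to terminate normally.
    """
    lt: Dict[str, int] = {}

    def visit(event: str) -> int:
        if event not in lt:
            # Smallest non-negative integer greater than all LT(e_i);
            # with no dependencies this is 1 + (-1) = 0.
            lt[event] = 1 + max((visit(d) for d in deps[event]), default=-1)
        return lt[event]

    for event in deps:
        visit(event)
    return lt
```

For instance, if $c$ depends on $a$, and $d$ depends on $b$ and $c$, while $a$ and $b$ have no dependencies, the function assigns $\LT(a)=\LT(b)=0$, $\LT(c)=1$, and $\LT(d)=2$.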
\nxsubpoint\label{sp:logic.time.relaxed}\emb{A relaxed variant of logical time}
On some occasions we relax the definition of logical time, requesting only that
\begin{equation}\label{eq:lt.fund.ineq}
\LT(e)>\LT(e')\quad\text{whenever $e\succ e'$ (i.e., $e$ logically depends on $e'$),}
\end{equation}
without insisting that $\LT(e)$ be the smallest non-negative integer with this property. In such cases we can speak about {\em relaxed\/} logical time, as opposed to the {\em strict\/} logical time defined above (cf.~\ptref{sp:logic.time}). Notice, however, that the condition~\eqref{eq:lt.fund.ineq} is a fundamental property of logical time and cannot be relaxed further.
\nxsubpoint\label{sp:logic.time.interval}\emb{Logical time intervals}
It makes sense to assign to some events or collections of events $C$ an {\em interval\/} of logical times $\LT^\bullet(C)=[\LT^-(C),\LT^+(C))$, meaning that the collection of events $C$ took place in the specified ``interval'' of logical times, where $\LT^-(C)<\LT^+(C)$ are some integers (64-bit integers in practice). In this case, we can say that $C$ {\em begins\/} at logical time $\LT^-(C)$, and {\em ends\/} at logical time $\LT^+(C)$.
By default, we assume $\LT^+(e)=\LT(e)+1$ and $\LT^-(e)=\LT(e)$ for simple or ``atomic'' events, assuming that they last exactly one unit of logical time. In general, if we have a single value $\LT(C)$ as well as a logical time interval $\LT^\bullet(C)=[\LT^-(C),\LT^+(C))$, we always require that
\begin{equation}
\LT(C)\in[\LT^-(C),\LT^+(C))
\end{equation}
or, equivalently,
\begin{equation}
\LT^-(C)\leq\LT(C)<\LT^+(C)
\end{equation}
In most cases, we choose $\LT(C)=\LT^-(C)$.
\nxsubpoint\label{sp:lt.int.cond}\emb{Requirements for logical time intervals}
The three principal requirements for logical time intervals are:
\begin{itemize}
\item $0\leq\LT^-(C)<\LT^+(C)$ are non-negative integers for any collection of events~$C$.
\item If $e'\prec e$ (i.e., if an atomic event $e$ logically depends on another atomic event $e'$), then $\LT^\bullet(e')<\LT^\bullet(e)$ (i.e., $\LT^+(e')\leq\LT^-(e)$).
\item If $C\supset D$ (i.e., if a collection of events $C$ contains another collection of events $D$), then $\LT^\bullet(C)\supset\LT^\bullet(D)$, i.e.,
\begin{equation}
\LT^-(C)\leq\LT^-(D)<\LT^+(D)\leq\LT^+(C)
\end{equation}
In particular, if $C$ consists of atomic events $e_1$, \dots, $e_n$, then $\LT^-(C)\leq\inf_i\LT^-(e_i)\leq\inf_i\LT(e_i)$ and $\LT^+(C)\geq\sup_i\LT^+(e_i)\geq 1+\sup_i\LT(e_i)$.
\end{itemize}
\nxsubpoint\emb{Strict, or minimal, logical time intervals}
One can assign to any finite collection of atomic events $E=\{e\}$ related by a causality relation (partial order) $\prec$, and all subsets $C\subset E$, {\em minimal\/} logical time intervals. That is, among all assignments of logical time intervals satisfying the conditions listed in \ptref{sp:lt.int.cond}, we choose the one having all $\LT^+(C)-\LT^-(C)$ as small as possible, and if several assignments with this property exist, we choose the one that has the minimum $\LT^-(C)$ as well.
Such an assignment can be achieved by first assigning logical time $\LT(e)$ to all atomic events $e\in E$ as described in \ptref{sp:logic.time}, then setting $\LT^-(C):=\inf_{e\in C}\LT(e)$ and $\LT^+(C):=1+\sup_{e\in C}\LT(e)$ for any $C\subset E$.
In most cases when we need to assign logical time intervals, we use the minimal logical time intervals just described.
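The construction of minimal intervals can be written down directly. In the sketch below the logical times of the atomic events are assumed to be given already as a map from (hypothetical) event names to integers; the helper \texttt{contains} checks the nesting requirement for intervals of nested collections.

```python
from typing import Dict, Iterable, Tuple

Interval = Tuple[int, int]  # half-open interval [lt_minus, lt_plus)


def minimal_interval(lt: Dict[str, int], collection: Iterable[str]) -> Interval:
    """Return [inf LT(e), 1 + sup LT(e)) over the events e of a non-empty collection."""
    times = [lt[e] for e in collection]
    if not times:
        raise ValueError("a collection must contain at least one event")
    return (min(times), 1 + max(times))


def contains(outer: Interval, inner: Interval) -> bool:
    """Check LT^-(C) <= LT^-(D) < LT^+(D) <= LT^+(C) for C = outer, D = inner."""
    return outer[0] <= inner[0] < inner[1] <= outer[1]
```

With $\LT(a)=\LT(b)=0$, $\LT(c)=1$, and $\LT(d)=2$, the collection $\{c,d\}$ receives the interval $[1,3)$, the single event $c$ receives $[1,2)$, and the whole set receives $[0,3)$, which contains both.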
\nxsubpoint\label{sp:lt.ton.blkch}\emb{Logical time in the TON Blockchain}
The TON Blockchain assigns logical time and logical time intervals to several of its components.
For instance, each outbound message created in a transaction is assigned its {\em logical creation time}; for this purpose, the creation of an outbound message is considered an atomic event, logically dependent on the previous message created by the same transaction, as well as on the previous transaction of the same account, on the inbound message processed by the same transaction, and on all events contained in the blocks referred to by hashes contained in the block with the same transaction. As a consequence, {\em outbound messages created by the same smart contract have strictly increasing logical creation times.} The transaction itself is considered a collection of atomic events, and is assigned a logical time interval (cf.~\ptref{sp:trans.lt} for a more precise description).
Each block is a collection of transaction and message creation events, so it is assigned a logical time interval, explicitly mentioned in the header of the block.
\mysubsection{Total blockchain state}
This section discusses the total state of the TON Blockchain, as well as the states of separate shardchains and the masterchain. For example, the precise definition of the state of the neighboring shardchains becomes crucial for correctly formalizing the consistency condition asserting that the validators for a shardchain must import the oldest messages from the union of {\em OutMsgQueue\/}s taken from the states of all neighboring shardchains (cf.~\ptref{sp:monot.import}).
\nxsubpoint\emb{Total state defined by a masterchain block}
Every masterchain block contains a list of all currently active shards and of the latest blocks for each of them. In this respect, {\em every masterchain block defines the corresponding total state of the TON Blockchain, since it fixes the state of every shardchain, and of the masterchain as well.}
An important requirement imposed on this list of the latest blocks of all shardchains is that, if a masterchain block $B$ lists $S$ as the latest block of some shardchain, and a newer masterchain block $B'$, with $B$ as one of its antecessors, lists $S'$ as the latest block of the same shardchain, then $S$ must be one of the antecessors of $S'$.\footnote{In order to express this condition correctly in the presence of dynamic sharding, one should fix some account $\xi$, and consider the latest blocks $S$ and $S'$ of the shardchains containing $\xi$ in the shard configurations of both $B$ and $B'$, since the shards containing $\xi$ might be different in $B$ and $B'$.} This condition makes the total state of the TON Blockchain defined by a subsequent masterchain block $B'$ compatible with the total state defined by a previous block $B$.
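The compatibility requirement can be checked by walking back along the chain of antecessors. The sketch below makes several simplifying assumptions that are not part of the text: the set of shards is fixed (no splits or merges, so the footnote's refinement is ignored), each shardchain block has a single immediate antecessor recorded in a hypothetical map \texttt{prev}, and the case $S=S'$ (no new shardchain block between $B$ and $B'$) is treated as acceptable.

```python
from typing import Dict, Optional


def is_antecessor_or_same(prev: Dict[str, Optional[str]],
                          older: str, newer: str) -> bool:
    """Walk back from `newer` along immediate antecessors, looking for `older`."""
    cur: Optional[str] = newer
    while cur is not None:
        if cur == older:
            return True
        cur = prev[cur]
    return False


def compatible(prev: Dict[str, Optional[str]],
               latest_in_b: Dict[str, str],
               latest_in_b_new: Dict[str, str]) -> bool:
    """Check that every shard's latest block in B is an antecessor of (or equal to)
    that shard's latest block in the newer masterchain block B'."""
    return all(
        shard in latest_in_b_new
        and is_antecessor_or_same(prev, block, latest_in_b_new[shard])
        for shard, block in latest_in_b.items()
    )
```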
\nxsubpoint\label{sp:shard.total.state}\emb{Total state defined by a shardchain block}
Every shardchain block contains the hash of the most recent masterchain block in its header. Consequently, all the blocks referred to in that masterchain block, along with their antecessors, are considered ``known'' or ``visible'' to the shardchain block, and no other blocks are visible to it, with the sole exception of its antecessors inside its proper shardchain.
In particular, when we say that a block {\em must\/} import in its {\em InMsgDescr\/} the messages from the {\em OutMsgQueue\/} of the states of all neighboring shardchains, it means that precisely the blocks of other shardchains visible to that block must be taken into account, and at the same time the block cannot contain messages from ``invisible'' blocks, even if they are otherwise correct.
\mysubsection{Configurable parameters and smart contracts}\label{p:conf.params}
Recall that the TON Blockchain has several so-called ``configurable parameters'' (cf.~\cite{TON}), which are either certain values or certain smart contracts residing in the masterchain. This section discusses the storage of and access to these configurable parameters.
\nxsubpoint\emb{Examples of configurable parameters}
The properties of the blockchain controlled by configurable parameters include:
\begin{itemize}
\item The minimum stake for validators.
\item The maximum size of the group of elected validators.
\item The maximum number of blocks for which the same group of validators is responsible.
\item The validator election process.
\item The validator punishing process.
\item The currently active and the next elected set of validators.
\item The process of changing configurable parameters, and the address of the smart contract $\gamma$ responsible for holding the values of the configurable parameters and for modifying their values.
\end{itemize}
\nxsubpoint\emb{Location of the values of configurable parameters}
The configurable parameters are kept in the persistent data of a special configuration smart contract $\gamma$ residing in the masterchain of the TON Blockchain. More precisely, the first reference of the root cell of the persistent data of that smart contract is a dictionary mapping 64-bit keys (parameter numbers) to the values of the corresponding parameters; each value is serialized into a cell slice according to the type of that value. If a value is a ``smart contract'' (necessarily residing in the masterchain), its 256-bit account address is used instead.
\nxsubpoint\label{sp:conf.par.qa}\emb{Quick access through the header of masterchain blocks}
To simplify access to the current values of configurable parameters, and to shorten the Merkle proofs containing references to them, the header of each masterchain block contains the address of smart contract $\gamma$. It also contains a direct cell reference to the dictionary containing all values of configurable parameters, which lies in the persistent data of~$\gamma$. Additional consistency conditions ensure that this reference coincides with the one obtained by inspecting the final state of smart contract~$\gamma$.
\nxsubpoint\emb{Getting values of configurable parameters by get methods}
The configuration smart contract $\gamma$ provides access to some of the configurable parameters by means of ``get methods''. These special methods of the smart contract do not change its state, but instead return the required data on the TVM stack.
\nxsubpoint\emb{Getting values of configurable parameters by get messages}
Similarly, the configuration smart contract $\gamma$ may define some ``ordinary'' methods (i.e., special inbound messages) to request the values of certain configuration parameters, which will be sent in the outbound messages generated by the transaction processing such an inbound message. This may be useful for some other fundamental smart contracts that need to know the values of certain configuration parameters.
\nxsubpoint\emb{Values obtained by get methods may be different from those obtained through the block header}
Notice that the state of the configuration smart contract~$\gamma$, including the values of configurable parameters, may change several times inside a masterchain block, if there are several transactions processed by~$\gamma$ in that block. As a consequence, the values obtained by invoking get methods of~$\gamma$, or sending get messages to $\gamma$, may be different from those obtained by inspecting the reference in the block header (cf.~\ptref{sp:conf.par.qa}), which refers to the {\em final\/} state of the configurable parameters in the block.
\nxsubpoint\label{sp:conf.par.change}\emb{Changing the values of configurable parameters}
The procedure for changing the values of configurable parameters is defined in the code of smart contract~$\gamma$. For most configurable parameters, called {\em ordinary}, any validator may suggest a new value by sending a special message with the number of the parameter and its proposed value to~$\gamma$. If the suggested value is valid, further voting messages from the validators are collected by the smart contract, and if more than two-thirds of each of the current and next sets of validators support the proposal, the value is changed.
Some parameters, such as the current set of validators, cannot be changed in this way. Instead, the current configuration contains a parameter with the address of smart contract $\nu$ responsible for electing the next set of validators, and smart contract $\gamma$ accepts messages only from this smart contract $\nu$ to modify the value of the configuration parameter containing the current set of validators.
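The acceptance rule for ordinary parameters described above can be sketched as follows. This is only an illustration under stated assumptions: validators are counted by head (the text does not say whether votes are weighted), and the names \texttt{proposal\_accepted} and \texttt{supermajority} are hypothetical.

```python
from typing import Set


def proposal_accepted(supporters: Set[str],
                      current_validators: Set[str],
                      next_validators: Set[str]) -> bool:
    """Return True if more than two-thirds of EACH validator set supports the proposal.

    ASSUMPTION: one validator, one vote; stake weighting is not modelled.
    """
    def supermajority(validators: Set[str]) -> bool:
        # Only votes cast by members of this particular set are counted.
        votes = len(supporters & validators)
        # Strict "more than two-thirds", compared in integers to avoid rounding.
        return 3 * votes > 2 * len(validators)

    return supermajority(current_validators) and supermajority(next_validators)
```

Note that a proposal backed by the whole current set may still fail if it lacks the required support in the next set.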
\nxsubpoint\emb{Changing the validator election procedure}
If the validator election procedure ever needs to be changed, this can be accomplished by first committing a new validator election smart contract into the masterchain, and then changing the ordinary configurable parameter containing the address $\nu$ of the validator election smart contract. This will require two-thirds of the validators to accept the proposal in a vote as described above in~\ptref{sp:conf.par.change}.
\nxsubpoint\emb{Changing the procedure of changing configurable parameters}
Similarly, the address of the configuration smart contract itself is a configurable parameter and may be changed in this fashion. In this way, most fundamental parameters and smart contracts of the TON Blockchain may be modified in any direction agreed upon by the qualified majority of the validators.
\nxsubpoint\emb{Initial values of the configurable parameters}
The initial values of most configurable parameters appear in block zero of the masterchain as part of the masterchain's initial state, which is explicitly present with no omissions in this block. The code of all fundamental smart contracts is also present in the initial state. In this way, the original ``constitution'' and configuration of the TON Blockchain, including the original set of validators, is made explicit in block zero.
\mysubsection{New smart contracts and their addresses}\label{p:acc.create}
This section discusses the creation and initialization of new smart contracts---in particular, the origin of their initial code, persistent data, and balance. It also discusses the assignment of account addresses to new smart contracts.
\nxsubpoint\emb{Description valid only for masterchain and basic workchain}
The mechanisms for creating new smart contracts and assigning their addresses described in this section are valid only for the basic workchain and the masterchain. Other workchains may define their own mechanisms for dealing with these problems.
\nxsubpoint\label{sp:crypto.to.uninit}\emb{Transferring cryptocurrency to uninitialized accounts}
First of all, {\em it is possible to send messages, including value-bearing messages, to previously unmentioned accounts.} If an inbound message arrives at a shardchain with a destination address $\eta$ corresponding to an undefined account, it is processed by a transaction as if the code of the smart contract were empty (i.e., consisting of an implicit \texttt{RET}). If the message is value-bearing, this leads to the creation of an ``uninitialized account'', which may have a non-zero balance (if value-bearing messages have been sent to it),\footnote{Value-bearing messages with the {\tt bounce} flag set will not be accepted by an uninitialized account, but will be ``bounced'' back.} but has no code and no data. Because even an uninitialized account occupies some persistent storage (needed to hold its balance), some small persistent-storage payments will be exacted from time to time from the account's balance, until that balance becomes negative.
\nxsubpoint\label{sp:constr.msg}\emb{Initializing smart contracts by constructor messages}
An account, or smart contract, is created by sending a special {\em constructor message\/} $M$ to its address $\eta$. The body of such a message contains the tree of cells with the initial code of the smart contract (which may be replaced by its hash in some situations), and the initial data of the smart contract (possibly empty; it may also be replaced by its hash). The hash of the code and of the data contained in the constructor message must coincide with the address $\eta$ of the smart contract; otherwise, the constructor message is rejected.
After the code and data of the smart contract are initialized from the body of the constructor message, the remainder of the constructor message is processed by a transaction (the {\em creating transaction} for smart contract $\eta$) by invoking TVM in a manner similar to that used for processing ordinary inbound messages.
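The address check performed on a constructor message can be sketched as follows. This is a toy model under loud assumptions: SHA-256 over length-prefixed byte strings stands in for the hashing of the actual cell trees, which is not specified in this section, and the function names are hypothetical.

```python
import hashlib


def account_address(code: bytes, data: bytes) -> bytes:
    """Toy 256-bit address derived from the initial code and data.

    ASSUMPTION: SHA-256 over length-prefixed byte strings replaces the
    real cell-tree hash; only the shape of the check is illustrated.
    """
    h = hashlib.sha256()
    for part in (code, data):
        # The length prefix keeps (b"ab", b"c") distinct from (b"a", b"bc").
        h.update(len(part).to_bytes(4, "big"))
        h.update(part)
    return h.digest()


def accept_constructor(dest: bytes, code: bytes, data: bytes) -> bool:
    """Accept the constructor message only if the hash matches the destination."""
    return account_address(code, data) == dest
```

Because the address commits to both the code and the data, a constructor message carrying different code or data for the same address is rejected.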
\nxsubpoint\emb{Initial balance of a smart contract}
Notice that the constructor message usually must bear some value, which will be transferred to the balance of the newly-created smart contract; otherwise, the new smart contract would have a balance of zero and would not be able to pay for storing its code and data in the blockchain. The minimum balance required from a newly-created smart contract is a linear (more precisely, affine) function of the storage it uses. The coefficients of this function may depend on the workchain; in particular, they are higher in the masterchain than in the basic workchain.
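In symbols, and with hypothetical notation that is not used elsewhere in this text, the minimum balance required from a smart contract occupying $s$ units of persistent storage in workchain $w$ may be written as
\begin{equation}
B_{\min}(w,s)=a_w+b_w\cdot s,
\end{equation}
where $a_w\geq0$ is a fixed per-account component and $b_w>0$ is the price of one unit of storage. Under this reading, the statement above says that the coefficients for the masterchain ($w=-1$) are higher than those for the basic workchain ($w=0$).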
\nxsubpoint\emb{Creating smart contracts by external constructor messages}
In some cases, it is necessary to create a smart contract by a constructor message that cannot bear any value---for instance, by a constructor message ``from nowhere'' (an external inbound message). Then one should first transfer a sufficient amount of funds to the uninitialized smart contract as explained in~\ptref{sp:crypto.to.uninit}, and only then send a constructor message ``from nowhere''.
\nxsubpoint\label{sp:ex.simple.wallet}\emb{Example: creating a cryptocurrency wallet smart contract}
An example of the above situation is provided by cryptocurrency wallet applications for human users, which must create a special wallet smart contract in the blockchain in which to keep the user's funds. This can be achieved as follows:
\begin{itemize}
\item The cryptocurrency wallet application generates a new cryptographic public/private key pair (typically for Ed25519 elliptic curve cryptography, supported by special TVM primitives) for signing the user's future transactions.
\item The cryptocurrency wallet application knows the code of the smart contract to be created (which typically is the same for all users), as well as the data, which typically consists of the public key of the wallet (or of its hash) and is generated at the very beginning. The hash of this information is the address~$\xi$ of the wallet smart contract to be created.
\item The wallet application may display the user's address $\xi$, and the user may start to receive funds to her uninitialized account $\xi$---for example, by buying some cryptocurrency at an exchange, or by asking a friend to transfer a small sum.
\item The wallet application can inspect the shardchain containing account $\xi$ (in the case of a basic workchain account) or the masterchain (in the case of a masterchain account), either by itself or using a blockchain explorer, and check the balance of~$\xi$.
\item If the balance is sufficient, the wallet application may create and sign (with the user's private key) the constructor message (``from nowhere''), and submit it for inclusion to the validators or the collators for the corresponding blockchain.
\item Once the constructor message is included into a block of the blockchain and processed by a transaction, the wallet smart contract is finally created.
\item When the user wants to transfer some funds to some other user or smart contract $\eta$, or wants to send a value-bearing message to $\eta$, she uses her wallet application to create the message $m$ that she wants her wallet smart contract $\xi$ to send to $\eta$, envelop $m$ in a special ``message from nowhere'' $m'$ with destination $\xi$, and sign $m'$ with her private key. Some provisions against replay attacks must be made, as explained in~\ptref{sp:msg.uniq}.
\item The wallet smart contract receives message $m'$ and checks the validity of the signature with the aid of the public key stored in its persistent data. If the signature is correct, it extracts embedded message $m$ from $m'$ and sends it to its intended destination $\eta$, with the indicated amount of funds attached to it.
\item If the user does not need to immediately start transferring funds, but only wants to passively receive some funds, she may keep her account uninitialized as long as she wants (provided the persistent storage payments do not lead to the exhaustion of its balance), thus minimizing the storage profile and persistent storage payments of the account.
\item Notice that the wallet application may create for the human user the illusion that the funds are kept in the application itself, and provide an interface to transfer funds or send arbitrary messages ``directly'' from the user's account~$\xi$. In reality, all these operations will be performed by the user's wallet smart contract, which effectively acts as a proxy for such requests. We see that a cryptocurrency wallet is a simple example of a {\em mixed\/} application, having an on-chain part (the wallet smart contract, used as a proxy for outbound messages) and an off-chain part (the external wallet application running on a user's device and keeping the private account key).
\end{itemize}
Of course, this is just one way of dealing with the simplest user wallet smart contracts. One can create multi-signature wallet smart contracts, or create a shared wallet with internal balances kept inside it for each of its individual users, and so on.
\nxsubpoint\emb{Smart contracts may be created by other smart contracts}
Notice that a smart contract may generate and send a constructor message while processing any transaction. In this way, smart contracts may automatically create new smart contracts, if they need to, without any human intervention.
\nxsubpoint\emb{Smart contracts may be created by wallet smart contracts}
On the other hand, a user may compile the code for her new smart contract~$\nu$, generate the corresponding constructor message~$m$, and use the wallet application to force her wallet smart contract~$\xi$ to send message $m$ to~$\nu$ with an adequate amount of funds, thus creating the new smart contract~$\nu$.
\mysubsection{Modification and removal of smart contracts}
This section explains how the code and state of a smart contract may be changed, and how and when a smart contract may be destroyed.
\nxsubpoint\emb{Modification of the data of a smart contract}
The persistent data of a smart contract is usually modified as a result of executing the code of the smart contract in TVM while processing a transaction, triggered by an inbound message to the smart contract. More specifically, the code of the smart contract has access to the old persistent storage of the smart contract via TVM control register \texttt{c4}, and may modify the persistent storage by storing another value into~\texttt{c4} before normal termination.
Normally, there are no other ways to modify the data of an existing smart contract. If the code of the smart contract does not provide any ways to modify the persistent data (e.g., if it is a simple wallet smart contract as described in~\ptref{sp:ex.simple.wallet}, which initializes the persistent data with the user's public key and does not intend to ever change it), then it will be effectively immutable---unless the code of the smart contract is modified first.
\nxsubpoint\emb{Modification of the code of a smart contract}
Similarly, the code of an existing smart contract may be modified only if some provisions for such an upgrade are present in the current code. The code is modified by invoking TVM primitive \texttt{SETCODE}, which sets the root of the code for the current smart contract from the top value in the TVM stack. The modification is applied only after the normal termination of the current transaction.
Typically, if the developer of a smart contract wants to be able to upgrade its code in the future, she provides a special ``code upgrade method'' in the original code of the smart contract, which invokes \texttt{SETCODE} in response to certain inbound ``code upgrade'' messages, using the new code sent in the message itself as an argument to \texttt{SETCODE}. Some provisions must be made to protect the smart contract from unauthorized replacement of the code; otherwise, control of the smart contract and the funds on its balance could be lost. For example, code upgrade messages might be accepted only from a trusted source address, or they might be protected by requiring a valid cryptographic signature and a correct sequence number.
\nxsubpoint\emb{Keeping the code or data of the smart contract outside the blockchain}
The code or data of the smart contract may be kept outside the blockchain and be represented only by their hashes. In such cases, only empty inbound messages may be processed, as well as messages carrying a correct copy of the smart-contract code (or its portion relevant for processing the specific message) and its data inside special fields. An example of such a situation is given by the uninitialized smart contracts and constructor messages described in~\ptref{p:acc.create}.
\nxsubpoint\emb{Using code libraries}
Some smart contracts may share the same code, but use different data. One example of this is wallet smart contracts (cf.~\ptref{sp:ex.simple.wallet}), which are likely to use the same code (throughout all wallets created by the same software), but with different data (because each wallet must use its own pair of cryptographic keys). In this case, the code for all the wallet smart contracts is best committed by the developer into a shared {\em library}; this library would reside in the masterchain, and be referred to by its hash using a special ``external library cell reference'' as the root of the code of each wallet smart contract (or as a subtree inside that code).
Notice that even if the library code becomes unavailable---for example, because its developer stops paying for its storage in the masterchain---it is still possible to use the smart contracts referring to this library, either by committing the library again into the masterchain, or by including its relevant parts inside a message sent to the smart contract. This external cell reference resolution mechanism is discussed in more detail later in~\ptref{sp:lib.env}.
\nxsubpoint\emb{Destroying smart contracts}
Notice that a smart contract cannot really be destroyed until its balance becomes zero or negative. It may become negative as a result of collecting persistent storage payments, or after sending a value-bearing outbound message transferring almost all of its previous balance.
For example, a user may decide to transfer all remaining funds from her wallet to another wallet or smart contract. This may be useful, for instance, if one wants to upgrade the wallet, but the wallet smart contract does not have any provisions for future upgrades; then one can simply create a new wallet and transfer all funds to it.
\nxsubpoint\emb{Frozen accounts}
When the balance of an account becomes non-positive after a transaction, or smaller than a certain workchain-dependent minimum, the account is {\em frozen\/} by replacing all its code and data by a single 32-byte hash. This hash is kept afterwards for some time (e.g., a couple of months) to prevent recreation of the smart contract by its original creating transaction (which still has the correct hash, equal to the account address), and to allow its owner to recreate the account by transferring some funds and sending a message containing the account's code and data, to be reinstated in the blockchain. In this respect, frozen accounts are similar to uninitialized accounts; however, the hash of the correct code and data for a frozen account is not necessarily equal to the account address, but is kept separately.
Notice that frozen accounts may have a negative balance, indicating that persistent storage payments are due. An account cannot be unfrozen until its balance becomes positive and larger than a prescribed minimum value.
\clearpage
\mysection{Message forwarding and delivery guarantees}
This chapter discusses the forwarding of messages inside the TON Blockchain, including the Hypercube Routing (HR) and Instant Hypercube Routing (IHR) protocols. It also describes the provisions required to implement the message delivery guarantees and the FIFO ordering guarantee.
\mysubsection{Message addresses and next-hop computation}
This section explains the computation of transit and next-hop addresses by the variant of the hypercube routing algorithm employed in TON Blockchain. The hypercube routing protocol itself, which uses the concepts and next-hop address computation algorithm introduced in this section, is presented in the next section.
\nxsubpoint\emb{Account addresses}
The {\em source address\/} and {\em destination address\/} are always present in any message. Normally, they are {\em (full) account addresses}. A full account address consists of a $\workchainid$ (a signed 32-bit big-endian integer defining a workchain), followed by a (usually) 256-bit {\em internal address\/} or {\em account identifier\/} $\accountid$ (which may also be interpreted as an unsigned big-endian integer) defining the account within the chosen workchain.
Different workchains may use account identifiers that are shorter or longer than the ``standard'' 256 bits used in the masterchain ($\workchainid=-1$) and in the basic workchain ($\workchainid=0$). To this end, the masterchain state contains a list of all workchains defined so far, along with their account identifier lengths. An important restriction is that the $\accountid$ for any workchain must be at least 64 bits long.
In what follows, we often consider only the case of 256-bit account addresses for simplicity. Only the first 64 bits of the $\accountid$ are relevant for the purposes of message routing and shardchain splitting.
\nxsubpoint\emb{Source and destination addresses of a message}
Any message has both a {\em source address\/} and a {\em destination address}. Its source address is the address of the account (smart contract) that has created the message while processing some transaction; the source address cannot be changed or set arbitrarily, and smart contracts heavily rely on this property. By contrast, when a message is created, any well-formed destination address may be chosen; after that, the destination address cannot be changed.
\nxsubpoint\emb{External messages with no source or destination address}
Some messages can have no source or no destination address (though at least one of them must be present), as indicated by special flags in the message header. Such messages are the {\em external messages} intended for the interaction of the TON Blockchain with the outside world---human users and their cryptowallet applications, off-chain and mixed applications and services, other blockchains, and so on.
External messages are never routed inside the TON Blockchain. Instead, ``messages from nowhere'' (i.e., with no source address) are directly included into the \textit{InMsgDescr\/} of a destination shardchain block (provided some conditions are met) and processed by a transaction in that very block. Similarly, ``messages to nowhere'' (i.e., with no TON Blockchain destination address), also known as {\em log messages}, are also present only in the block containing the transaction that generated such a message.\footnote{``Messages to nowhere'' may have some special fields in their body indicating their destination outside the TON Blockchain---for instance, an account in some other blockchain, or an IP address and port---which may be interpreted by the third-party software appropriately. Such fields are ignored by the TON Blockchain.}
Therefore, external messages are almost irrelevant for the discussion of message routing and message delivery guarantees. In fact, the message delivery guarantees for outbound external messages are trivial (at most, the message must be included into the \textit{LogMsg} part of the block), and for inbound external messages there are none, since the validators of a shardchain block are free to include or ignore suggested inbound external messages at their discretion (e.g., according to the processing fee offered by the message).\footnote{The problem of bypassing possible validator censorship---which could happen, for instance, if all validators conspire not to include external messages sent to accounts belonging to some set of blacklisted accounts---is dealt with separately elsewhere. The main idea is that the validators may be forced to promise to include a message with a known hash in a future block, without knowing anything about the identity of the sender or the receiver; they will have to keep this promise afterwards when the message itself with pre-agreed hash is presented.}
In what follows, we focus on ``usual'' or ``internal'' messages, which have both a source and a destination address.
\nxsubpoint\emb{Transit and next-hop addresses}
When a message needs to be routed through intermediate shardchains before reaching its intended destination, it is assigned a {\em transit address\/} and a {\em next-hop address\/} in addition to the (immutable) source and destination addresses. When a copy of the message resides inside a transit shardchain awaiting its relay to its next hop, the {\em transit address\/} is its intermediate address lying in the transit shardchain, as if belonging to a special message-relay smart contract whose only job is to relay the unchanged message to the next shardchain on the route. The {\em next-hop address\/} is the address in a neighboring shardchain (or, on some rare occasions, in the same shardchain) to which the message needs to be relayed. After the message is relayed, the next-hop address usually becomes the transit address of the copy of the message included in the next shardchain.
Immediately after an outbound message is created in a shardchain (or in the masterchain), its transit address is set to its source address.\footnote{However, the internal routing process described in~\ptref{sp:hr.int.route} is applied immediately after that, which may further modify the transit address.}
\nxsubpoint\label{sp:hr.next.hop}\emb{Computation of the next-hop address for hypercube routing}
The TON Blockchain employs a variant of hypercube routing. This means that the next-hop address is computed from the transit address (originally equal to the source address) as follows:
\begin{enumerate}
\item The (big-endian signed) 32-bit $\workchainid$ components of both the transit address and destination address are split into groups of $n_1$ bits (currently, $n_1=32$), and they are scanned from the left (i.e., the most significant bits) to the right. If one of the groups in the transit address differs from the corresponding group in the destination address, then the value of this group in the transit address is replaced by its value in the destination address to compute the next-hop address.
\item If the $\workchainid$ parts of the transit and destination addresses match, then a similar process is applied to the $\accountid$ parts of the addresses: The $\accountid$ parts, or rather their first (most significant) 64 bits, are split into groups of $n_2$ bits (currently, $n_2=4$ bit groups are used, corresponding to the hexadecimal digits of the address) starting from the most significant bit, and are compared starting from the left. The first group that differs is replaced in the transit address with its value in the destination address to compute the next-hop address.
\item If the first 64 bits of the $\accountid$ parts of the transit and destination addresses match as well, then the destination account belongs to the current shardchain, and the message should not be forwarded outside the current shardchain at all. Instead, it must be processed by a transaction inside it.
\end{enumerate}
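The three rules above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a reference implementation: an address is modelled as a pair consisting of the signed 32-bit workchain identifier and the first 64 bits of the account identifier, and the names \texttt{Address}, \texttt{next\_hop} and the helper are hypothetical.

```python
from typing import Tuple

# ASSUMPTION: (workchain_id, first 64 bits of account_id as an unsigned int).
Address = Tuple[int, int]


def _replace_first_differing_group(cur: int, dst: int, width: int, group: int) -> int:
    """Scan `width`-bit values in `group`-bit chunks, most significant first,
    and copy the first chunk that differs from `dst` into `cur`."""
    for shift in range(width - group, -1, -group):
        mask = ((1 << group) - 1) << shift
        if (cur ^ dst) & mask:
            return (cur & ~mask) | (dst & mask)
    return cur


def next_hop(transit: Address, dest: Address, n1: int = 32, n2: int = 4) -> Address:
    t_wc, t_acc = transit
    d_wc, d_acc = dest
    # Rule 1: compare workchain identifiers as 32-bit two's-complement patterns.
    t_bits, d_bits = t_wc & 0xFFFFFFFF, d_wc & 0xFFFFFFFF
    if t_bits != d_bits:
        new_bits = _replace_first_differing_group(t_bits, d_bits, 32, n1)
        # Convert the 32-bit pattern back to a signed integer.
        new_wc = new_bits - (1 << 32) if new_bits >> 31 else new_bits
        return (new_wc, t_acc)
    # Rule 2: compare the first 64 bits of the account identifiers.
    if t_acc != d_acc:
        return (t_wc, _replace_first_differing_group(t_acc, d_acc, 64, n2))
    # Rule 3: nothing differs, so the message is processed in this shardchain.
    return transit
```

With $n_1=32$ the workchain identifier is replaced in a single step, whereas with $n_2=4$ the account identifier is corrected one hexadecimal digit at a time.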
\nxsubpoint\label{sp:nh.notat}\emb{Notation for the next-hop address}
We denote by
\begin{equation}
\NextHop(\xi,\eta)
\end{equation}
the next-hop address computed for current (source or transit) address $\xi$ and destination address $\eta$.
\nxsubpoint\label{sp:nh.anycast}\emb{Support for anycast addresses}
``Large'' smart contracts, which can have separate instances in different shardchains, may be reached using {\em anycast destination addresses}. These addresses are supported as follows.
An anycast address $(\eta,d)$ consists of a usual address $\eta$ along with its ``splitting depth'' $d\leq 31$. The idea is that the message may be delivered to any address differing from $\eta$ only in the first $d$ bits of the internal address part (i.e., not including the workchain identifier, which must match exactly). This is achieved as follows:
\begin{itemize}
\item The effective destination address $\tilde\eta$ is computed from $(\eta,d)$ by replacing the first $d$ bits of the internal address part of $\eta$ with the corresponding bits taken from the source address $\xi$.
\item All computations of $\NextHop(\nu,\eta)$ are replaced by $\NextHop(\nu,\tilde\eta)$, for $\nu=\xi$ as well as for all other intermediate addresses $\nu$. In this way, Hypercube Routing or Instant Hypercube Routing will ultimately deliver the message to the shardchain containing $\tilde\eta$.
\item When the message is processed in its destination shardchain (the one containing address $\tilde\eta$), it may be processed by an account $\eta'$ of the same shardchain differing from $\eta$ and $\tilde\eta$ only in the first $d$ bits of the internal address part. More precisely, if the common shard address prefix is $s$, so that only internal addresses starting with binary string $s$ belong to the destination shard, then $\eta'$ is computed from $\eta$ by replacing the first $\min(d,|s|)$ bits of the internal address part of $\eta$ with the corresponding bits of~$s$.
\end{itemize}
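The address manipulations in the list above can be sketched as follows. The sketch works on internal addresses only (the workchain identifier is assumed to match), treats them as unsigned integers of a configurable bit width, and represents the shard prefix $s$ as a string of binary digits; all function names are hypothetical.

```python
def replace_prefix(value: int, source: int, nbits: int, width: int = 256) -> int:
    """Return `value` with its first (most significant) `nbits` bits taken from `source`."""
    if nbits == 0:
        return value
    mask = ((1 << nbits) - 1) << (width - nbits)
    return (value & ~mask) | (source & mask)


def effective_destination(eta: int, d: int, xi: int, width: int = 256) -> int:
    """Effective destination: first d bits of eta replaced by those of the source xi."""
    assert 0 <= d <= 31, "splitting depth is limited to 31"
    return replace_prefix(eta, xi, d, width)


def receiving_account(eta: int, d: int, shard_prefix: str, width: int = 256) -> int:
    """Account eta' that processes the message inside the destination shard.

    The first min(d, |s|) bits of eta are replaced by the corresponding bits
    of the shard prefix s, given here as a string of '0'/'1' characters.
    """
    k = min(d, len(shard_prefix))
    if k == 0:
        return eta
    s_bits = int(shard_prefix[:k], 2) << (width - k)
    return replace_prefix(eta, s_bits, k, width)
```

For example, with 8-bit toy addresses, \texttt{effective\_destination(0b10110000, 3, 0b01001111, width=8)} yields \texttt{0b01010000}: the first three bits come from the source and the remaining five from $\eta$.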
That said, we tacitly ignore the existence of anycast addresses and the additional processing they require in the following discussions.
\nxsubpoint\label{sp:nh.hamming.opt}\emb{Hamming optimality of the next-hop address algorithm}
Notice that the specific hypercube routing next-hop computation algorithm explained in~\ptref{sp:hr.next.hop} may potentially be replaced by another algorithm, provided it satisfies certain properties. One of these properties is the {\em Hamming optimality}, meaning that the Hamming ($L_1$) distance from $\xi$ to $\eta$ equals the sum of Hamming distances from $\xi$ to $\NextHop(\xi,\eta)$ and from $\NextHop(\xi,\eta)$ to $\eta$:
\begin{equation}\label{eq:hamm.opt}
{\|\xi-\eta\|}_1=\bigl\|\xi-\NextHop(\xi,\eta)\bigr\|_1+\bigl\|\NextHop(\xi,\eta)-\eta\bigr\|_1
\end{equation}
Here ${\|\xi-\eta\|}_1$ is the {\em Hamming distance\/} between $\xi$ and $\eta$, equal to the number of bit positions in which $\xi$ and $\eta$ differ:\footnote{When the addresses involved are of different lengths (e.g., because they belong to different workchains), one should consider only the first 96 bits of the addresses in the above formula.}
\begin{equation}
{\|\xi-\eta\|}_1=\sum_i|\xi_i-\eta_i|
\end{equation}
Notice that in general one should expect only an inequality in \eqref{eq:hamm.opt}, following from the triangle inequality for the $L_1$-metric. Hamming optimality essentially means that $\NextHop(\xi,\eta)$ lies on one of the (Hamming) shortest paths from $\xi$ to $\eta$. It can also be expressed by saying that $\nu=\NextHop(\xi,\eta)$ is always obtained from $\xi$ by changing the values of bits at some positions to their values in $\eta$: for any bit position $i$, we have $\nu_i=\xi_i$ or $\nu_i=\eta_i$.\footnote{Instead of Hamming optimality, we might have considered the equivalent property of {\em Kademlia optimality}, written for the Kademlia (or weighted $L_1$) distance as given by $\|\xi-\eta\|_K:=\sum_i2^{-i}|\xi_i-\eta_i|$ instead of the Hamming distance.}
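As a toy illustration (the 8-bit address length is an assumption made only to keep the example short), take $n_2=4$ and apply the rule of~\ptref{sp:hr.next.hop} to
\begin{equation}
\xi=\texttt{1010\,0011},\quad \eta=\texttt{0101\,1011},\quad \nu:=\NextHop(\xi,\eta)=\texttt{0101\,0011}.
\end{equation}
The first 4-bit group of $\xi$ differs from that of $\eta$, so it alone is replaced. Then ${\|\xi-\eta\|}_1=4+1=5$, ${\|\xi-\nu\|}_1=4$, and ${\|\nu-\eta\|}_1=1$, so that \eqref{eq:hamm.opt} holds with equality; moreover, each bit of $\nu$ coincides with the corresponding bit of either $\xi$ or $\eta$.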
\nxsubpoint\emb{Non-stopping of $\NextHop$}
Another important property of $\NextHop$ is its {\em non-stopping}, meaning that $\NextHop(\xi,\eta)=\xi$ is possible only when $\xi=\eta$. In other words, if we have not yet arrived at $\eta$, the next hop cannot coincide with our current position.
This property implies that the path from $\xi$ to $\eta$---i.e., the sequence of intermediate addresses $\xi^{(0)}:=\xi$, $\xi^{(n)}:=\NextHop(\xi^{(n-1)},\eta)$---will gradually stabilize at $\eta$: for some $N\geq0$, we have $\xi^{(n)}=\eta$ for all $n\geq N$. Indeed, one can always take $N:={\|\xi-\eta\|}_1$.
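For instance, with hypothetical 12-bit addresses written in hexadecimal (so that each digit is one group of $n_2=4$ bits), the path from $\xi=\texttt{A3C}$ to $\eta=\texttt{5B1}$ computed by the rule of~\ptref{sp:hr.next.hop} is
\begin{equation}
\xi^{(0)}=\texttt{A3C},\quad \xi^{(1)}=\texttt{53C},\quad \xi^{(2)}=\texttt{5BC},\quad \xi^{(3)}=\texttt{5B1}=\eta,
\end{equation}
so the path stabilizes after three steps, well within the bound $N={\|\xi-\eta\|}_1=4+1+3=8$. The bound holds in general because, by non-stopping and Hamming optimality, every step taken before reaching $\eta$ changes at least one bit of the current address to its value in $\eta$ and never changes a bit that already agrees with~$\eta$; hence the Hamming distance to $\eta$ decreases by at least one at each step.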
\nxsubpoint\label{sp:path.conv}\emb{Convexity of the HR path with respect to sharding}
A consequence of Hamming optimality property~\eqref{eq:hamm.opt} is what we call the {\em convexity\/} of the path from $\xi$ to $\eta$ with respect to sharding. Namely, if $\xi^{(0)}:=\xi$, $\xi^{(n)}:=\NextHop(\xi^{(n-1)},\eta)$ is the computed path from $\xi$ to $\eta$, and $N$ is the first index such that $\xi^{(N)}=\eta$, and $S$ is a shard of some workchain in any shard configuration, then the indices $i$ with $\xi^{(i)}$ residing in shard~$S$ constitute a subinterval in $[0,N]$. In other words, if integers $0\leq i\leq j\leq k\leq N$ are such that $\xi^{(i)}$, $\xi^{(k)}\in S$, then $\xi^{(j)}\in S$ as well.
This convexity property is important for some proofs related to message forwarding in the presence of dynamic sharding.
\nxsubpoint\label{sp:hr.int.route}\emb{Internal routing}
Notice that the next-hop address computed according to the rules defined in~\ptref{sp:hr.next.hop} may belong to the same shardchain as the current one (i.e., the one containing the transit address). In that case, the ``internal routing'' occurs immediately, the transit address is replaced by the value of the computed next-hop address, and the next-hop address computation step is repeated until a next-hop address lying outside the current shardchain is obtained. The message is then kept in the transit output queue according to its computed next-hop address, with its last computed transit address as the ``intermediate owner'' of the transit message. If the current shardchain splits into two shardchains before the message is forwarded further, it is the shardchain containing the intermediate owner that inherits this transit message.
Alternatively, we might go on computing the next-hop addresses only to find out that the destination address already belongs to the current shardchain. In that case, the message will be processed (by a transaction) inside this shardchain instead of being forwarded further.
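The internal routing loop described above can be sketched as follows. The sketch is restricted to a single workchain and to the first 64 bits of the account identifier; the shard is given by its binary prefix as a string, and the names \texttt{next\_hop64}, \texttt{in\_shard} and \texttt{route\_internally} are hypothetical.

```python
from typing import Optional, Tuple


def next_hop64(transit: int, dest: int, group: int = 4) -> int:
    """Replace the first differing `group`-bit chunk of `transit` by that of `dest`."""
    for shift in range(64 - group, -1, -group):
        mask = ((1 << group) - 1) << shift
        if (transit ^ dest) & mask:
            return (transit & ~mask) | (dest & mask)
    return transit


def in_shard(addr: int, shard_prefix: str) -> bool:
    """True if the 64-bit address starts with the given binary prefix."""
    k = len(shard_prefix)
    return k == 0 or (addr >> (64 - k)) == int(shard_prefix, 2)


def route_internally(transit: int, dest: int,
                     shard_prefix: str) -> Tuple[int, Optional[int]]:
    """Return (intermediate owner, next hop), or (dest, None) if the
    destination turns out to lie in the current shardchain."""
    assert in_shard(transit, shard_prefix)
    while transit != dest:
        hop = next_hop64(transit, dest)
        if not in_shard(hop, shard_prefix):
            # Enqueue for forwarding; `transit` is the intermediate owner.
            return transit, hop
        transit = hop  # internal routing step inside the same shardchain
    return transit, None  # to be processed by a transaction in this shardchain
```

The loop terminates because each step brings the transit address strictly closer to the destination.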
\nxsubpoint\emb{Neighboring shardchains}
Two shards in a shard configuration---or the two corresponding shardchains---are said to be {\em neighbors}, or {\em neighboring shardchains}, if one of them contains a next-hop address for at least one combination of allowed source and destination addresses, while the other contains the transit address for the same combination. In other words, two shardchains are neighbors if a message can be forwarded directly from one of them into the other via Hypercube Routing.
The masterchain is also included in this definition, as if it were the only shardchain of the workchain with $\workchainid=-1$. In this respect, it is a neighbor of all the other shardchains.
\nxsubpoint\emb{Any shard is a neighbor of itself}
Notice that a shardchain is always considered a neighbor of itself. This may seem redundant, because we always repeat the next-hop computation described in~\ptref{sp:hr.next.hop} until we obtain a next-hop address outside the current shardchain (cf.~\ptref{sp:hr.int.route}). However, there are at least two reasons for such an arrangement:
\begin{itemize}
\item Some messages have the source and the destination address inside the same shardchain, at least when the message is created. However, if such a message is not processed immediately in the same block where it has been created, it must be added to the outbound message queue of its shardchain, and be imported as an inbound message (with an entry in the {\em InMsgDescr}) in one of the subsequent blocks of the same shardchain.\footnote{Notice that the next-hop and internal-routing computations are still applied to such messages, since the current shardchain may be split before the message is processed. In this case, the new sub-shardchain containing the destination address will inherit the message.}
\item Alternatively, the next-hop address may originally be in some other shardchain that later gets merged with the current shardchain, so that the next hop ends up inside the same shardchain. Then the message will have to be imported from the outbound message queue of the merged shardchain, and forwarded or processed according to its next-hop address, even though the transit and next-hop addresses now reside inside the same shardchain.
\end{itemize}
\nxsubpoint\label{sp:isp.hr}\emb{Hypercube Routing and the ISP}
Ultimately, the Infinite Sharding Paradigm (ISP) applies here: a shardchain should be considered a provisional union of accountchains, grouped together solely to minimize the block generation and transmission overhead.
The forwarding of a message runs through several intermediate account\-chains, some of which can happen to lie in the same shard. In this case, once a message reaches an accountchain lying in this shard, it is immediately (``internally'') routed inside that shard until the last accountchain lying in the same shard is reached (cf.~\ptref{sp:hr.int.route}). Then the message is enqueued in the output queue of that last accountchain.\footnote{We may define the (virtual) output queue of an account(chain) as the subset of the {\em OutMsgQueue\/} of the shard currently containing that account that consists of messages with transit addresses equal to the address of the account.}
\nxsubpoint\label{sp:repr.interm.addr}\emb{Representation of transit and next-hop addresses}
Notice that the transit and next-hop addresses differ from the source address only in the $\workchainid$ and in the first (most significant) 64 bits of the account address. Therefore, they may be represented by 96-bit strings. Furthermore, their $\workchainid$ usually coincides with the $\workchainid$ of either the source address or the destination address; a couple of bits may be used to indicate this situation, thus further reducing the space required to represent the transit and next-hop addresses.
In fact, the required storage may be reduced even further by observing that the specific hypercube routing algorithm described in~\ptref{sp:hr.next.hop} always generates intermediate (i.e., transit and next-hop) addresses that coincide with the destination address in their first $k$ bits, and with the source address in their remaining bits. Therefore, one might use just the values $0\leq k_{\text{tr}},k_{\text{nh}}\leq 96$ to fully specify the transit and next-hop addresses. One might also notice that $k':=k_{\text{nh}}$ turns out to be a fixed function of $k:=k_{\text{tr}}$ (for instance, $k'=k+n_2=k+4$ for $k\geq32$), and therefore include only one 7-bit value of~$k$ in the serialization.
Such optimizations have the obvious disadvantage that they rely too much on the specific routing algorithm used, which can be changed in the future, so they are used in~\ptref{sp:tl.msg.env} with a provision to specify more general intermediate addresses if necessary.
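The prefix-length encoding described above can be illustrated by a short Python sketch. It assumes that addresses are written as 96-bit strings; the function names are hypothetical and do not correspond to the actual serialization of~\ptref{sp:tl.msg.env}.

```python
ADDR_BITS = 96  # workchain_id (32 bits) plus the first 64 bits of the account address


def intermediate_address(src, dst, k):
    """Address that coincides with dst in its first k bits and with src elsewhere."""
    if not 0 <= k <= ADDR_BITS:
        raise ValueError("k must satisfy 0 <= k <= 96")
    return dst[:k] + src[k:]


def prefix_length(src, dst, addr):
    """Smallest k such that addr == dst[:k] + src[k:], or None if there is none."""
    for k in range(ADDR_BITS + 1):
        if addr == dst[:k] + src[k:]:
            return k
    return None
```

Since $0\leq k\leq 96<2^7$, any such $k$ fits into seven bits, in agreement with the estimate given above.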
\nxsubpoint\label{sp:msg.env}\emb{Message envelopes}
The transit and next-hop addresses of a forwarded message are not included in the message itself, but are kept in a special {\em message envelope}, which is a cell (or a cell slice) containing the transit and next-hop addresses with the above optimizations, some other information relevant for forwarding and processing, and a reference to a cell containing the unmodified original message. In this way, a message can easily be ``extracted'' from its original envelope (e.g., the one present in the {\em InMsgDescr}) and be put into another envelope (e.g., before being included into the {\em OutMsgQueue}).
In the representation of a block as a tree, or rather a DAG, of cells, the two different envelopes will contain references to a shared cell with the original message. If the message is large, this arrangement avoids the need to keep more than one copy of the message in the block.
\mysubsection{Hypercube Routing protocol}
This section presents the details of the hypercube routing protocol employed by the TON Blockchain to achieve guaranteed delivery of messages between smart contracts residing in arbitrary shardchains. For the purposes of this document, we will refer to the variant of hypercube routing employed by the TON Blockchain as Hypercube Routing (HR).
\nxsubpoint\label{sp:msg.uniq}\emb{Message uniqueness}
Before continuing, let us observe that any (internal) message is {\em unique}. Recall that a message contains its full source address along with its logical creation time, and all outbound messages created by the same smart contract have strictly increasing logical creation times (cf.~\ptref{sp:lt.ton.blkch}); therefore, the combination of the full source address and the logical creation time uniquely defines the message. Since we assume the chosen hash function $\Sha$ to be collision resistant, {\em a message is uniquely determined by its hash}, so we can identify two messages if we know that their hashes coincide.
This does not extend to external messages ``from nowhere'', which have no source addresses. Special care must be taken to prevent replay attacks related to such messages, especially by designers of user wallet smart contracts. One possible solution is to include a sequence number in the body of such messages, and keep the count of external messages already processed inside the smart-contract persistent data, refusing to process an external message if its sequence number differs from this count.
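The sequence-number scheme suggested above may be sketched as follows. This is a toy Python model of the persistent data of a wallet, not actual smart-contract code; the class \texttt{WalletState} and its method are hypothetical.

```python
class WalletState:
    """Toy model of wallet persistent data with replay protection."""

    def __init__(self):
        self.seqno = 0        # count of external messages processed so far
        self.processed = []   # bodies of accepted messages, kept for illustration

    def accept_external(self, msg_seqno, body):
        """Process an external message only if its sequence number matches the count."""
        if msg_seqno != self.seqno:
            return False      # replayed or out-of-order message: refuse it
        self.seqno += 1
        self.processed.append(body)
        return True
```

A second copy of an already accepted message carries a stale sequence number and is therefore refused.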
\nxsubpoint\label{sp:msg.hash.ident}\emb{Identifying messages with equal hashes}
The TON Blockchain assumes that two messages with the same hashes coincide, and treats either of them as a redundant copy of the other. As explained above in~\ptref{sp:msg.uniq}, this does not lead to any unexpected effects for internal messages. However, if one sends two coinciding ``messages from nowhere'' to a smart contract, it may happen that only one of them will be delivered---or both. If their action is not supposed to be idempotent (i.e., if processing the message twice has a different effect from processing it once), some provisions should be made to distinguish the two messages, for instance by including a sequence number in them.
In particular, the {\em InMsgDescr\/} and {\em OutMsgDescr\/} use the (unenveloped) message hash as a key, tacitly assuming that distinct messages have distinct hashes. In this way, one can trace the path and the fate of a message across different shardchains by looking up the message hash in the {\em InMsgDescr\/} and {\em OutMsgDescr\/} of different blocks.
\nxsubpoint\label{sp:out.msg.q}\emb{The structure of {\em OutMsgQueue}}
Recall that the outbound messages --- both those created inside the shardchain, and transit messages previously imported from a neighboring shardchain to be relayed to the next-hop shardchain --- are accumulated in the {\em OutMsgQueue}, which is part of the {\em state\/} of the shardchain (cf.~\ptref{sp:outmsgq}). In contrast with {\em InMsgDescr\/} and {\em OutMsgDescr}, the key in {\em OutMsgQueue} is not the message hash, but its next-hop address---or at least its first 96 bits---concatenated with the message hash.
Furthermore, the {\em OutMsgQueue\/} is not just a dictionary (hashmap), mapping its keys into (enveloped) messages. Rather, it is a {\em min-augmented dictionary with respect to the logical creation time}, meaning that each node of the Patricia tree representing {\em OutMsgQueue\/} has an additional value (in this case, an unsigned 64-bit integer), and that this augmentation value in each fork node is set to be equal to the minimum of the augmentation values of its children. The augmentation value of a leaf equals the logical creation time of the message contained in that leaf; it need not be stored explicitly.
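To illustrate the idea of a min-augmented dictionary, here is a simplified Python sketch. It uses a plain binary trie without path compression instead of a genuine Patricia tree, assumes that all keys are bit strings of equal length, and omits deletion; the names \texttt{Node}, \texttt{insert}, and \texttt{min\_lt\_under} are hypothetical.

```python
class Node:
    """Trie node; min_lt is the augmentation value of the subtree rooted here."""

    def __init__(self):
        self.children = {}   # maps "0" / "1" to child nodes
        self.min_lt = None


def insert(root, key, lt):
    """Insert a message with the given key and logical creation time lt."""
    trail = [root]
    node = root
    for bit in key:
        node = node.children.setdefault(bit, Node())
        trail.append(node)
    node.min_lt = lt  # augmentation of a leaf is the creation time itself
    # Recompute the augmentation of every ancestor, from the bottom up.
    for ancestor in reversed(trail[:-1]):
        ancestor.min_lt = min(c.min_lt for c in ancestor.children.values())


def min_lt_under(root, prefix):
    """Minimum creation time among all keys starting with prefix, or None."""
    node = root
    for bit in prefix:
        node = node.children.get(bit)
        if node is None:
            return None
    return node.min_lt
```

A query for a given prefix only walks down the trie along that prefix, so its cost does not depend on the number of messages stored below it.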
\nxsubpoint\emb{Inspecting the {\em OutMsgQueue\/} of a neighbor}
Such a structure for the {\em OutMsgQueue\/} enables the validators of a neighboring shardchain to inspect it to find its part (Patricia subtree) relevant to them (i.e., consisting of messages with the next-hop address belonging to the neighboring shard in question---or having the next-hop address with a given binary prefix), as well as quickly compute the ``oldest'' (i.e., with the minimum logical creation time) message in that part.
Furthermore, the shard validators do not even need to track the total state of all their neighboring shardchains---they only need to keep and update a copy of their {\em OutMsgQueue}, or even of its subtree related to them.
\nxsubpoint\label{sp:monot.import}
\emb{Logical time monotonicity: importing the oldest message from the neighbors}
The first fundamental local condition of message forwarding, called {\em (message import) (logical time) monotonicity condition}, may be summarized as follows:
\begin{quote}
While importing messages into the {\em InMsgDescr\/} of a shardchain block from the {\em OutMsgQueue\/}s of its neighboring shardchains, the validators must import the messages in the increasing order of their logical time; in the case of a tie, the message with the smaller hash is imported first.
\end{quote}
More precisely, each shardchain block contains the hash of a masterchain block (assumed to be ``the latest'' masterchain block at the time of the shardchain block's creation), which in turn contains the hashes of the most recent shardchain blocks. In this way, each shardchain block indirectly ``knows'' the most recent state of all other shardchains, and especially its neighboring shardchains, including their {\em OutMsgQueue\/}s.\footnote{In particular, if the hash of a recent block of a neighboring shardchain is not yet reflected in the latest masterchain block, its modifications to {\em OutMsgQueue\/} must not be taken into account.}
Now an alternative equivalent formulation of the monotonicity condition is as follows:
\begin{quote}
If a message is imported into the {\em InMsgDescr\/} of the new block, its logical creation time cannot be greater than that of any message left unimported in the {\em OutMsgQueue\/} of the most recent state of any of the neighboring shardchains.
\end{quote}
It is this form of the monotonicity condition that appears in the local consistency conditions of the TON Blockchain blocks and is enforced by the validators.
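A check of this condition can be sketched in Python as follows. The sketch assumes that each message is summarized by a pair consisting of its logical creation time and its hash, and it resolves ties by the hash as in the first formulation; the function name is hypothetical.

```python
def import_respects_monotonicity(imported, left_unimported):
    """Check the import monotonicity condition for one new block.

    imported:        (lt, msg_hash) pairs of messages imported into the block
    left_unimported: (lt, msg_hash) pairs of messages addressed to this shard
                     that remain in the neighbors' OutMsgQueues
    Pairs are compared lexicographically: first by lt, then by hash.
    """
    imported = list(imported)
    left_unimported = list(left_unimported)
    if not imported or not left_unimported:
        return True
    return max(imported) < min(left_unimported)
```

Only the newest imported message and the oldest remaining message have to be compared, which is precisely the information the min-augmentation of {\em OutMsgQueue\/} makes cheap to obtain.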
\nxsubpoint\emb{Witnesses to violations of the message import logical time monotonicity condition}
Notice that if this condition is not fulfilled, a small Merkle proof witnessing its failure may be constructed. Such a proof will contain:
\begin{itemize}
\item A path in the {\em OutMsgQueue\/} of a neighbor from the root to a certain message $m$ with small logical creation time.
\item A path in the {\em InMsgDescr\/} of the block under consideration showing that the key equal to $\Hash(m)$ is absent in {\em InMsgDescr} (i.e., that $m$ has not been included in the current block).
\item A proof that $m$ has not been included in a preceding block of the same shardchain, using the block header information containing the smallest and the largest logical time of all messages imported into the block (cf. \ptref{sp:msg.deliver.chk}--\ptref{sp:hr.ihr.deliver.chk} for more information).
\item A path in {\em InMsgDescr\/} to another included message $m'$, such that either $\LT(m')>\LT(m)$, or $\LT(m')=\LT(m)$ and $\Hash(m')>\Hash(m)$.
\end{itemize}
\nxsubpoint\label{sp:omsgq.del}\emb{Deleting a message from {\em OutMsgQueue}}
A message must be deleted from {\em OutMsgQueue\/} sooner or later; otherwise, the storage used by {\em OutMsgQueue\/} would grow to infinity. To this end, several ``garbage collection rules'' are introduced. They allow the deletion of a message from {\em OutMsgQueue\/} during the evaluation of a block only if an explicit special ``delivery record'' is present in the {\em OutMsgDescr\/} of that block. This record contains either a reference to the neighboring shardchain block that has included the message into its {\em InMsgDescr\/} (the hash of the block is sufficient, but collated material for the block may contain the relevant Merkle proof), or a Merkle proof of the fact that the message has been delivered to its final destination via Instant Hypercube Routing.
\nxsubpoint\emb{Guaranteed message delivery via Hypercube Routing}
In this way, a message cannot be deleted from the outbound message queue unless it has been either relayed to its next-hop shardchain or delivered to its final destination (cf.~\ptref{sp:omsgq.del}). Meanwhile, the message import monotonicity condition (cf.~\ptref{sp:monot.import}) ensures that any message will sooner or later be relayed into the next shardchain, taking into account other conditions which require the validators to use at least half of the block's space or gas limits for importing inbound internal messages (otherwise the validators might choose to create empty blocks or import only external messages even in the presence of non-empty outbound message queues at their neighbors).
\nxsubpoint\label{sp:msg.proc.order}\emb{Message processing order}
When several imported messages are processed by transactions inside a block, the {\em message processing order conditions\/} ensure that older messages are processed first. More precisely, if a block contains two transactions $t$ and $t'$ of the same account, which process inbound messages $m$ and $m'$, respectively, and $\LT(m)<\LT(m')$, then we must have $\LT(t)<\LT(t')$.
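The condition can be verified for a block by a direct pairwise comparison, as in the following Python sketch. Each transaction is assumed to be given as a triple (account, logical creation time of the processed message, logical time of the transaction); this representation and the function name are hypothetical.

```python
def processing_order_ok(transactions):
    """Check the message processing order condition within one block.

    transactions: list of (account, lt_msg, lt_tx) triples.
    For two transactions of the same account, an older inbound message
    must be processed by a transaction with a smaller logical time.
    """
    for account_a, lt_m, lt_t in transactions:
        for account_b, lt_m2, lt_t2 in transactions:
            if account_a == account_b and lt_m < lt_m2 and not lt_t < lt_t2:
                return False
    return True
```

The quadratic loop is chosen for clarity only; sorting the transactions of each account would give the same answer faster.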
\nxsubpoint\emb{FIFO guarantees of Hypercube Routing}
The message processing order conditions (cf.~\ptref{sp:msg.proc.order}), along with the message import monotonicity conditions (cf.~\ptref{sp:monot.import}), imply the {\em FIFO guarantees for Hypercube Routing}. Namely, if a smart contract $\xi$ creates two messages $m$ and $m'$ with the same destination $\eta$, and $m'$ is generated later than $m$ (meaning that $m\prec m'$, hence $\LT(m)<\LT(m')$), then $m$ will be processed by $\eta$ before $m'$. This is so because both messages will follow the same routing steps on the path from $\xi$ to $\eta$ (the Hypercube Routing algorithm described in~\ptref{sp:hr.next.hop} is deterministic), and in all outbound queues and inbound message descriptions $m'$ will appear ``after'' $m$.\footnote{This statement is not as trivial as it seems at first, because some of the shardchains involved may split or merge during the routing. A correct proof may be obtained by adopting the ISP perspective to HR as explained in~\ptref{sp:isp.hr} and observing that $m'$ will always be behind $m$, either in terms of the intermediate accountchain reached or, if they happen to be in the same accountchain, in terms of logical creation time.
A crucial observation is that ``at any given moment of time'' (logically; a more precise description would be ``in the total state obtained after processing any causally closed subset $\cF$ of blocks''), the intermediate accountchains belonging to the same shard are contiguous on the path from $\xi$ to~$\eta$ (i.e., cannot have accountchains belonging to some other shard in between). This is a ``convexity property'' (cf.~\ptref{sp:path.conv}) of the Hypercube Routing algorithm described in~\ptref{sp:hr.next.hop}.}
If message $m'$ can be delivered to $\eta$ via Instant Hypercube Routing, this is not necessarily true anymore. Therefore, a simple way of ensuring FIFO message delivery discipline between a pair of smart contracts consists in setting a special bit in the message header preventing its delivery via IHR.
\nxsubpoint\emb{Delivery uniqueness guarantees of Hypercube Routing}
Notice that the message import monotonicity conditions also imply the {\em uniqueness\/} of the delivery of any message via Hypercube Routing---i.e., that it cannot be imported and processed by the destination smart contract more than once. We will see later in~\ptref{p:ihr.combine.deliv} that enforcing delivery uniqueness when both Hypercube Routing and Instant Hypercube Routing are active is more complicated.
\nxsubpoint\label{sp:hr.overview}\emb{An overview of Hypercube Routing}
Let us summarize all routing steps performed to deliver an internal message $m$ created by source account $\xi_0$ to destination account~$\eta$. We denote by $\xi_{k+1}:=\NextHop(\xi_k,\eta)$, $k=0,1,2,\ldots$ the intermediate addresses dictated by HR for forwarding the message $m$ to its final destination $\eta$. Let $S_k$ be the shard containing $\xi_k$.
\begin{itemize}
\item{[Birth]} --- Message $m$ with destination $\eta$ is created by a transaction $t$ belonging to an account $\xi_0$ residing in some shardchain $S_0$. The logical creation time $\LT(m)$ is fixed at this point and included into the message $m$.
\item{[ImmediateProcessing?]} --- If the destination $\eta$ resides in the same shardchain $S_0$, the message may be processed in the same block it was generated in. In this case, $m$ is included into {\em OutMsgDescr\/} with a flag indicating it has been processed in this very block and need not be forwarded further. Another copy of $m$ is included into {\em InMsgDescr}, along with the usual data describing the processing of inbound messages. (Notice that $m$ is not included into the {\em OutMsgQueue\/} of $S_0$.)
\item{[InitialInternalRouting]} --- If $m$ either has a destination outside $S_0$, or is not processed in the same block where it was generated, the internal routing procedure described in~\ptref{sp:hr.int.route} is applied, until an index $k$ is found such that $\xi_k$ lies in $S_0$, but $\xi_{k+1}=\NextHop(\xi_k,\eta)$ does not (i.e., $S_k=S_0$, but $S_{k+1}\neq S_0$). Alternatively, this process stops if $\xi_k=\eta$ or $\xi_k$ coincides with $\eta$ in its first 96 bits.
\item{[OutboundQueuing]} --- The message $m$ is included into {\em OutMsgDescr\/} (with the key equal to its hash), with an envelope containing its transit address $\xi_k$ and next-hop address $\xi_{k+1}$ as explained in \ptref{sp:msg.env} and~\ptref{sp:repr.interm.addr}. The same enveloped message is also included in the {\em OutMsgQueue\/} of the state of $S_k$, with the key equal to the concatenation of the first 96 bits of its next-hop address $\xi_{k+1}$ (which may be equal to $\eta$ if $\eta$ belongs to $S_k$) and the message hash $\Hash(m)$.
\item{[QueueWait]} --- Message $m$ waits in the {\em OutMsgQueue\/} of shardchain $S_k$ to be forwarded further. In the meantime, shardchain $S_k$ may split or merge with other shardchains; in that case, the new shard $S'_k$ containing the transit address $\xi_k$ inherits $m$ in its {\em OutMsgQueue}.
\item{[ImportInbound]} --- At some point in the future, the validators for the shardchain $S_{k+1}$ containing the next-hop address $\xi_{k+1}$ scan the {\em OutMsgQueue\/} in the state of shardchain $S_k$ and decide to import message $m$ in keeping with the monotonicity condition (cf.~\ptref{sp:monot.import}) and other conditions. A new block for shardchain $S_{k+1}$ is generated, with an enveloped copy of $m$ included in its {\em InMsgDescr}. The entry in {\em InMsgDescr\/} also contains the {\em reason\/} for importing $m$ into this block, with a hash of the most recent block of shardchain $S'_k$, and the previous transit and next-hop addresses $\xi_k$ and $\xi_{k+1}$, so that the corresponding entry in the {\em OutMsgQueue\/} of $S'_k$ can be easily located.
\item{[Confirmation]} --- This entry in the {\em InMsgDescr\/} of $S_{k+1}$ also serves as a confirmation for $S'_k$. In a later block of $S'_k$, message $m$ must be removed from the {\em OutMsgQueue\/} of $S'_k$; this modification is reflected in a special entry in the {\em OutMsgDescr\/} of the block of $S'_k$ that performs this state modification.
\item{[Forwarding?]} --- If the final destination $\eta$ of~$m$ does not reside in $S_{k+1}$, the message is {\em forwarded}. Hypercube Routing is applied until some $\xi_l$, $l>k$, and $\xi_{l+1}=\NextHop(\xi_l,\eta)$ are obtained, such that $\xi_l$ lies in $S_{k+1}$, but $\xi_{l+1}$ does not (cf.~\ptref{sp:hr.int.route}). After that, a newly-enveloped copy of $m$ with transit address set to $\xi_l$ and next-hop address $\xi_{l+1}$ is included into both the {\em OutMsgDescr\/} of the current block of $S_{k+1}$ and the {\em OutMsgQueue\/} of the new state of $S_{k+1}$. The entry of $m$ in {\em InMsgDescr\/} contains a flag indicating that the message has been forwarded; the entry in {\em OutMsgDescr\/} contains the newly-enveloped message and a flag indicating that this is a forwarded message. Then all the steps starting from [OutboundQueuing] are repeated, for $l$ instead of~$k$.
\item{[Processing?]} --- If the final destination $\eta$ of $m$ resides in $S_{k+1}$, then the block of $S_{k+1}$ that imported the message must process it by a transaction $t$ included in the same block. In this case, {\em InMsgDescr\/} contains a reference to $t$ by its logical time $\LT(t)$, and a flag indicating that the message has been processed.
\end{itemize}
The above message routing algorithm does not take into account some further modifications required to implement Instant Hypercube Routing (IHR). For instance, a message may be {\em discarded\/} after being imported (listed in {\em InMsgDescr}) into its final or intermediate shardchain block, because a proof of delivery via IHR to the final destination is presented. In this case, such a proof must be included into {\em InMsgDescr\/} to explain why the message was not forwarded further or processed.
\mysubsection{Instant Hypercube Routing and combined delivery guarantees}\label{p:ihr.combine.deliv}
This section describes the Instant Hypercube Routing protocol, normally applied by TON Blockchain in parallel to the previously discussed Hypercube Routing protocol to achieve faster message delivery. However, when both Hypercube Routing and Instant Hypercube Routing are applied to the same message in parallel, achieving delivery and unique delivery guarantees is more complicated. This topic is also discussed in this section.
\nxsubpoint\label{sp:ihr.overview}\emb{An overview of Instant Hypercube Routing}
Let us explain the major steps performed when the Instant Hypercube Routing (IHR) mechanism is applied to a message. (Notice that normally both the usual HR and IHR work in parallel for the same message; some provisions must be taken to guarantee the uniqueness of delivery of any message.)
Consider the routing and delivery of the same message $m$ with source $\xi$ and destination $\eta$ as discussed in~\ptref{sp:hr.overview}:
\begin{itemize}
\item{[NetworkSend]} --- After the validators of $S_0$ have agreed on and signed the block containing the creating transaction $t$ for $m$, and observed that the destination $\eta$ of $m$ does not reside inside $S_0$, they may send a datagram (encrypted network message), containing the message $m$ along with a Merkle proof of its inclusion into the {\em OutMsgDescr\/} of the block just generated, to the validator group of the shardchain $T$ currently owning the destination~$\eta$.
\item{[NetworkReceive]} --- If the validators of shardchain $T$ receive such a message, they check its validity starting from the most recent masterchain block and the shardchain block hashes listed in it, including the most recent ``canonical'' block of shardchain $S_0$ as well. If the message is invalid, they silently discard it. If that block of shardchain $S_0$ has a larger sequence number than the one listed in the most recent masterchain block, they may either discard it or postpone the verification until the next masterchain block appears.
\item{[InclusionConditions]} --- The validators check inclusion conditions for message $m$. In particular, they must check that this message has not been delivered before, and that the {\em OutMsgQueue\/}s of the neighbors do not have unprocessed outbound messages with destinations in $T$ with smaller logical creation times than $\LT(m)$.
\item{[Deliver]} --- The validators deliver and process the message, by including it into the {\em InMsgDescr\/} of the current shardchain block along with a bit indicating that it is an IHR message, the Merkle proof of its inclusion into the {\em OutMsgDescr\/} of the original block, and the logical time of the transaction $t'$ processing this inbound message in the block currently being generated.
\item{[Confirm]} --- Finally, the validators send encrypted datagrams to all the validator groups of the intermediate shardchains on the path from $\xi$ to $\eta$, containing a Merkle proof of the inclusion of message $m$ into the {\em InMsgDescr\/} of its final destination. The validators of an intermediate shardchain may use this proof to {\em discard\/} the copy of message $m$ travelling by the rules of HR, by importing the message into their {\em InMsgDescr\/} along with the Merkle proof of final delivery and setting a flag indicating that the message has been discarded.
\end{itemize}
The overall procedure is even simpler than that for Hypercube Routing. Notice, however, that IHR comes with no delivery or FIFO guarantees: the network datagram may be lost in transit, or the validators of the destination shardchain may decide not to act on it, or they may discard it due to buffer overflow. This is the reason why IHR is used as a complement to HR, and not as a replacement.
\nxsubpoint\emb{Overall eventual delivery guarantees}
Notice that the combination of HR and IHR guarantees the ultimate delivery of any internal message to its final destination. Indeed, the HR by itself is guaranteed to deliver any message eventually, and the HR for message $m$ can be cancelled at an intermediate stage only by a Merkle proof of delivery of $m$ to its final destination (via IHR).
\nxsubpoint\emb{Overall unique delivery guarantees}
However, the {\em uniqueness\/} of message delivery for the combination of HR and IHR is more difficult to achieve. In particular, one must check the following conditions, and, if necessary, be able to provide short Merkle proofs that they do or don't hold:
\begin{itemize}
\item When a message $m$ is imported into its next intermediate shardchain block via HR, we must check that $m$ has not already been imported via HR.
\item When $m$ is imported and processed in its final destination shardchain, we must check that $m$ has not already been processed. If it has, there are three subcases:
\begin{itemize}
\item If $m$ is being considered for import via HR, and it has already been imported via HR, it must not be imported at all.
\item If $m$ is being considered for import via HR, and it has already been imported via IHR (but not HR), then it must be imported and immediately discarded (without being processed by a transaction). This is necessary to remove $m$ from the {\em OutMsgQueue\/} of its previous intermediate shardchain.
\item If $m$ is being considered for import via IHR, and it has already been imported via either IHR or HR, it must not be imported at all.
\end{itemize}
\end{itemize}
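The case analysis above for the final destination shardchain can be summarized by a small decision function. The following Python sketch is illustrative only; the string labels and the function name are hypothetical.

```python
def import_decision(via, already_imported_via):
    """Decide what to do with message m at its final destination shardchain.

    via:                  "HR" or "IHR", the mechanism now offering m for import
    already_imported_via: set of mechanisms by which m was imported before
    """
    if via == "HR":
        if "HR" in already_imported_via:
            return "skip"                 # must not be imported at all
        if "IHR" in already_imported_via:
            return "import_and_discard"   # frees the previous OutMsgQueue entry
        return "import_and_process"
    if via == "IHR":
        if already_imported_via:
            return "skip"                 # already imported via IHR or HR
        return "import_and_process"
    raise ValueError("via must be 'HR' or 'IHR'")
```

Only the combination ``HR after IHR'' leads to an import without processing, because the copy travelling by the rules of HR still has to be removed from the outbound queue of the previous shardchain.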
\nxsubpoint\label{sp:msg.deliver.chk}\emb{Checking whether a message has already been delivered to its final destination}
Consider the following general algorithm for checking whether a message~$m$ has already been delivered to its final destination~$\eta$: One can simply scan the last several blocks belonging to the shardchain containing the destination address, starting from the latest block and working backwards through the previous block references. (If there are two previous blocks---i.e., if a shardchain merge event occurred at some point---one would follow the chain containing the destination address.) The {\em InMsgDescr\/} of each of these blocks can be checked for an entry with key $\Hash(m)$. If such an entry is found, the message $m$ has already been delivered, and we can easily construct a Merkle proof of this fact. If we do not find such an entry before arriving at a block $B$ with $\LT^+(B)<\LT(m)$, implying that $m$ could not be delivered in~$B$ or any of its predecessors, then the message $m$ definitely has not been delivered yet.
The obvious disadvantage of this algorithm is that, if message $m$ is very old (and most likely delivered a long time ago), meaning that it has a small value of $\LT(m)$, then a large number of blocks will need to be scanned before yielding an answer. Furthermore, if the answer is negative, the size of the Merkle proof of this fact will increase linearly with the number of blocks scanned.
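The general algorithm may be sketched in Python as follows. Blocks are modelled, purely for illustration, as dictionaries with the hypothetical fields \texttt{in\_msg\_descr} (the set of message hashes present in {\em InMsgDescr}) and \texttt{lt\_max} (the value $\LT^+(B)$), listed from the latest block backwards along the chain containing the destination address.

```python
def already_delivered(blocks_newest_first, msg_hash, msg_lt):
    """Scan blocks backwards to decide whether message m has been delivered.

    Returns (delivered, blocks_scanned).
    """
    scanned = 0
    for block in blocks_newest_first:
        scanned += 1
        if block["lt_max"] < msg_lt:
            # m could not be delivered in this block or any of its predecessors.
            return False, scanned
        if msg_hash in block["in_msg_descr"]:
            return True, scanned
    return False, scanned  # the whole chain was scanned without finding m
```

The second component of the result makes the cost visible: for an old message the number of scanned blocks, and hence the size of a negative proof, grows with the age of the message.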
\nxsubpoint\label{sp:ihr.deliver.chk}\emb{Checking whether an IHR message has already been delivered to its final destination}
To check whether an IHR message $m$ has already been delivered to its destination shardchain, we can apply the general algorithm described above (cf.~\ptref{sp:msg.deliver.chk}), modified to inspect only the last $c$ blocks for some small constant $c$ (say, $c=8$). If no conclusion can be reached after inspecting these blocks, then the validators for the destination shardchain may simply discard the IHR message instead of spending more resources on this check.
\nxsubpoint\emb{Checking whether an HR message has already been delivered via HR to its final destination or an intermediate shardchain}
To check whether an HR-received message $m$ (or rather, a message $m$ being considered for import via HR) has already been imported via HR, we can use the following algorithm: Let $\xi_k$ be the transit address of $m$ (belonging to a neighboring shardchain $S_k$) and $\xi_{k+1}$ be its next-hop address (belonging to the shardchain under consideration). Since we are considering the inclusion of $m$, $m$ must be present in the {\em OutMsgQueue\/} of the most recent state of shardchain $S_k$, with $\xi_k$ and $\xi_{k+1}$ indicated in its envelope. In particular, (a) the message has been included into {\em OutMsgQueue}, and we may even know when, because the entry in {\em OutMsgQueue\/} sometimes contains the logical time of the block where it has been added, and (b) it has not yet been removed from {\em OutMsgQueue}.
Now, the validators of the neighboring shardchain are required to remove a message from {\em OutMsgQueue\/} as soon as they observe that message (with transit and next-hop addresses $\xi_k$ and $\xi_{k+1}$ in its envelope) has been imported into the {\em InMsgDescr\/} of the message's next-hop shardchain. Therefore, (b) implies that the message could have been imported into the {\em InMsgDescr\/} of a preceding block only if this preceding block is very new (i.e., not yet known to the most recent neighboring shardchain block). Therefore, only a very limited number of preceding blocks (typically one or two, at most) need to be scanned by the algorithm described in~\ptref{sp:msg.deliver.chk} to conclude that the message has not yet been imported.\footnote{One must not only look up the key $\Hash(m)$ in the {\em InMsgDescr\/} of these blocks, but also check the intermediate addresses in the envelope of the corresponding entry, if found.} In fact, if this check is performed by the validators or collators for the current shardchain themselves, it can be optimized by keeping in memory the {\em InMsgDescr\/}s of the several latest blocks.
\nxsubpoint\label{sp:hr.ihr.deliver.chk}\emb{Checking whether an HR message has already been delivered via IHR to its final destination}
Finally, to check whether an HR message has already been delivered to its final destination via IHR, one can use the general algorithm described in~\ptref{sp:msg.deliver.chk}. In contrast with \ptref{sp:ihr.deliver.chk}, we cannot abort the verification process after scanning a fixed number of the latest blocks in the destination shardchain, because HR messages cannot be dropped without a reason.
Instead, we indirectly bound the number of blocks to be inspected by forbidding the inclusion of an IHR message $m$ into a block $B$ of its destination shardchain if there are already more than, say, $c=8$ blocks $B'$ in the destination shardchain with $\LT^+(B')\geq\LT(m)$.
Such a condition effectively restricts the time interval after the creation of message~$m$ in which it could have been delivered via IHR, so that only a small number of blocks of the destination shardchain (at most $c$) will need to be inspected.
Notice that this condition nicely aligns with the modified algorithm described in~\ptref{sp:ihr.deliver.chk}, effectively forbidding the validators from importing the newly-received IHR message if more than $c=8$ steps are needed to check that it had not been imported already.
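The inclusion condition can be written as a one-line predicate. The sketch below assumes that the values $\LT^+(B')$ of the existing destination-shardchain blocks are available as a list of integers; the function name \texttt{ihr\_import\_allowed} is hypothetical.

```python
def ihr_import_allowed(msg_lt, block_end_lts, c=8):
    """Decide whether an IHR message created at logical time msg_lt may be imported.

    block_end_lts: LT^+(B') for the blocks B' already present in the
    destination shardchain (assumed input format).
    The import is forbidden if more than c of them satisfy LT^+(B') >= LT(m).
    """
    later_blocks = sum(1 for end_lt in block_end_lts if end_lt >= msg_lt)
    return later_blocks <= c
```

With $c=8$, a message is rejected as soon as a ninth block with $\LT^+(B')\geq\LT(m)$ exists, so a delivery check never has to inspect more than the $c$ latest blocks.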
\clearpage
\mysection{Messages, message descriptors, and queues}\label{sect:msg}
This chapter presents the internal layout of individual messages, message descriptors (such as {\em InMsgDescr\/} or {\em OutMsgDescr}), and message queues (such as {\em OutMsgQueue}). Enveloped messages (cf.~\ptref{sp:msg.env}) are also discussed here.
Notice that most general conventions related to messages must be obeyed by all shardchains, even if they do not belong to the basic workchain; otherwise, messaging and interaction between different workchains would not be possible. It is the {\em interpretation\/} of the message contents and the {\em processing\/} of messages, usually by some transactions, that differs between workchains.
\mysubsection{Address, currency, and message layout}
This section begins with some general definitions, followed by the precise layout of addresses used for serializing source and destination addresses in a message.
\nxsubpoint\label{sp:tl.std.def}\emb{Some standard definitions}
For the reader's convenience, we reproduce here several general TL-B definitions.\footnote{A description of an older version of TL may be found at \url{https://core.telegram.org/mtproto/TL}. Alternatively, an informal introduction to TL-B schemes may be found in \cite[3.3.4]{TVM}.} These definitions are used below in the discussion of address and message layout, but otherwise are not related to the TON Blockchain.
\begin{verbatim}
unit$_ = Unit;
true$_ = True;
// EMPTY False;
bool_false$0 = Bool;
bool_true$1 = Bool;
nothing$0 {X:Type} = Maybe X;
just$1 {X:Type} value:X = Maybe X;
left$0 {X:Type} {Y:Type} value:X = Either X Y;
right$1 {X:Type} {Y:Type} value:Y = Either X Y;
pair$_ {X:Type} {Y:Type} first:X second:Y = Both X Y;
bit$_ _:(## 1) = Bit;
\end{verbatim}
\nxsubpoint\label{sp:addr.tl}\emb{TL-B scheme for addresses}
The serialization of source and destination addresses is defined by the following TL-B scheme:
\begin{verbatim}
addr_none$00 = MsgAddressExt;
addr_extern$01 len:(## 9) external_address:(len * Bit)
= MsgAddressExt;
anycast_info$_ depth:(## 5) rewrite_pfx:(depth * Bit) = Anycast;
addr_std$10 anycast:(Maybe Anycast)
workchain_id:int8 address:uint256 = MsgAddressInt;
addr_var$11 anycast:(Maybe Anycast) addr_len:(## 9)
workchain_id:int32 address:(addr_len * Bit) = MsgAddressInt;
_ MsgAddressInt = MsgAddress;
_ MsgAddressExt = MsgAddress;
\end{verbatim}
The last two lines define type \texttt{MsgAddress} to be the internal union of types \texttt{MsgAddressInt} and \texttt{MsgAddressExt} (not to be confused with their external union \texttt{Either MsgAddressInt MsgAddressExt} as defined in~\ptref{sp:tl.std.def}), as if the four preceding constructor definitions (\texttt{addr\_none}, \texttt{addr\_extern}, \texttt{addr\_std}, and \texttt{addr\_var}) had been repeated with the right-hand side replaced by \texttt{MsgAddress}. In this way, type \texttt{MsgAddress} has four constructors, and types \texttt{MsgAddressInt} and \texttt{MsgAddressExt} are both subtypes of \texttt{MsgAddress}.
\nxsubpoint\emb{External addresses}
The first two constructors, \texttt{addr\_none} and \texttt{addr\_extern}, are used for source addresses of ``messages from nowhere'' (inbound external messages), and for destination addresses of ``messages to nowhere'' (outbound external messages). The \texttt{addr\_extern} constructor defines an ``external address'', which is ignored by the TON Blockchain software altogether (which treats \texttt{addr\_extern} as a longer variant of \texttt{addr\_none}), but may be used by external software for its own purposes. For example, a special external service may inspect the destination address of all outbound external messages found in all blocks of the TON Blockchain, and, if a special magic number is present in the \texttt{external\_address} field, parse the remainder as an IP address and UDP port or a (TON Network) ADNL address, and send a datagram with a copy of the message to the network address thus obtained.
\nxsubpoint\emb{Internal addresses}
The two remaining constructors, \texttt{addr\_std} and \texttt{addr\_var}, represent internal addresses. The first of them, \texttt{addr\_std}, represents a signed 8-bit $\workchainid$ (sufficient for the masterchain and for the basic workchain) and a 256-bit internal address in the selected workchain. The second of them, \texttt{addr\_var}, represents addresses in workchains with a ``large'' $\workchainid$, or internal addresses of length not equal to 256. Both of these constructors have an optional \texttt{anycast} value, absent by default, which enables ``address rewriting'' when present.\footnote{{\em Address rewriting\/} is a feature used to implement ``anycast addresses'' employed by the so-called {\em large\/} or {\em global\/} smart contracts (cf.~\cite[2.3.18]{TON}), which can have instances in several shardchains. When address rewriting is enabled, a message may be routed to and processed by a smart contract with an address coinciding with the destination address up to the first $d$ bits, where $d\leq 32$ is the ``splitting depth'' of the smart contract indicated in the {\tt anycast.depth} field (cf.~\ptref{sp:nh.anycast}). Otherwise, the addresses must match exactly.}
The validators must use \texttt{addr\_std} instead of \texttt{addr\_var} whenever possible, but must be ready to accept \texttt{addr\_var} in inbound messages. The \texttt{addr\_var} constructor is intended for future extensions.
Notice that $\workchainid$ must be a valid workchain identifier enabled in the current masterchain configuration, and the length of the internal address must be in the range allowed for the indicated workchain. For example, one cannot use $\workchainid=0$ (basic workchain) or $\workchainid=-1$ (masterchain) with addresses that are not exactly 256 bits long.
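To make the four constructors of \texttt{MsgAddress} concrete, the following sketch parses an address from a string of \texttt{0} and \texttt{1} characters, following the TL-B scheme of~\ptref{sp:addr.tl} field by field. The bit-string representation and the function name \texttt{parse\_msg\_address} are assumptions made for illustration; the sketch does not validate $\workchainid$ against the masterchain configuration.

```python
def parse_msg_address(bits):
    """Parse a MsgAddress from a string of '0'/'1' characters.

    Returns (constructor_name, fields, remaining_bits).
    """
    pos = 0

    def take(n):
        nonlocal pos
        if pos + n > len(bits):
            raise ValueError("not enough bits")
        chunk = bits[pos:pos + n]
        pos += n
        return chunk

    def uint(n):
        return int(take(n), 2) if n else 0

    def sint(n):
        v = uint(n)
        return v - (1 << n) if v >> (n - 1) else v

    def maybe_anycast():
        # Maybe Anycast: one presence bit, then depth:(## 5) and depth bits.
        if take(1) == "0":
            return None
        depth = uint(5)
        return {"depth": depth, "rewrite_pfx": take(depth)}

    tag = take(2)
    if tag == "00":
        name, fields = "addr_none", {}
    elif tag == "01":
        length = uint(9)
        name, fields = "addr_extern", {"external_address": take(length)}
    elif tag == "10":
        anycast = maybe_anycast()
        name = "addr_std"
        fields = {"anycast": anycast, "workchain_id": sint(8), "address": uint(256)}
    else:
        anycast = maybe_anycast()
        addr_len = uint(9)
        name = "addr_var"
        fields = {"anycast": anycast, "workchain_id": sint(32), "address": take(addr_len)}
    return name, fields, bits[pos:]
```

For instance, a masterchain address without anycast information occupies $2+1+8+256=267$ bits: the tag \texttt{10}, a zero bit for the absent \texttt{anycast}, the 8-bit $\workchainid=-1$, and the 256-bit address.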
\nxsubpoint\emb{Representing Gram currency amounts}
Amounts of Grams are expressed with the aid of two types representing variable-length unsigned or signed integers, plus a type \texttt{Grams} explicitly dedicated to representing non-negative amounts of nanograms, as follows:
\begin{verbatim}
var_uint$_ {n:#} len:(#< n) value:(uint (len * 8))
= VarUInteger n;
var_int$_ {n:#} len:(#< n) value:(int (len * 8))
= VarInteger n;
nanograms$_ amount:(VarUInteger 16) = Grams;
\end{verbatim}
If one wants to represent $x$ nanograms, one selects an integer $\ell<16$ such that $x<2^{8\ell}$, and serializes first $\ell$ as an unsigned 4-bit integer, then $x$ itself as an unsigned $8\ell$-bit integer. Notice that four zero bits represent a zero amount of Grams.
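The serialization just described can be sketched in a few lines. The text allows any $\ell<16$ with $x<2^{8\ell}$; the sketch below picks the smallest such $\ell$, which is an assumption of this example. Bits are again modeled as a string of \texttt{0} and \texttt{1} characters, and the function names are hypothetical.

```python
def encode_grams(x):
    """Serialize x nanograms as VarUInteger 16: a 4-bit length l, then 8*l value bits."""
    if x < 0:
        raise ValueError("Grams amounts are non-negative")
    length = (x.bit_length() + 7) // 8  # smallest l with x < 2**(8*l)
    if length >= 16:
        raise ValueError("amount does not fit into VarUInteger 16")
    value_bits = format(x, "0{}b".format(8 * length)) if length else ""
    return format(length, "04b") + value_bits


def decode_grams(bits):
    """Inverse of encode_grams; returns (amount, remaining_bits)."""
    length = int(bits[:4], 2)
    end = 4 + 8 * length
    value = int(bits[4:end], 2) if length else 0
    return value, bits[end:]
```

Under this choice of $\ell$, one Gram ($10^9$ nanograms) takes $4+32=36$ bits, and a zero amount takes exactly four zero bits.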
Recall (cf.~\cite[A]{TON}) that the original total supply of Grams is fixed at five billion (i.e., $5\cdot10^{18}<2^{63}$ nanograms), and is expected to grow very slowly. Therefore, all the amounts of Grams encountered in practice will fit in unsigned or even signed 64-bit integers. The validators may use the 64-bit integer representation of Grams in their internal computations; however, the serialization of these values in the blockchain is another matter.
\nxsubpoint\emb{Representing collections of arbitrary currencies}\label{sp:extra.curr}
Recall that the TON Blockchain allows its users to define arbitrary cryptocurrencies or tokens apart from the Gram, provided some conditions are met. Such additional cryptocurrencies are identified by 32-bit $\currencyid$s. The list of defined additional cryptocurrencies is a part of the blockchain configuration, stored in the masterchain.
When some amounts of one or several such cryptocurrencies need to be represented, a dictionary (cf.~\cite[3.3]{TVM}) with 32-bit $\currencyid$s as keys and \texttt{VarUInteger 32} values is used:
\begin{verbatim}
extra_currencies$_ dict:(HashmapE 32 (VarUInteger 32))
= ExtraCurrencyCollection;
currencies$_ grams:Grams other:ExtraCurrencyCollection
= CurrencyCollection;
\end{verbatim}
The value attached to an internal message is represented by a value of the \texttt{CurrencyCollection} type, which may describe a certain (non-negative) amount of (nano)grams as well as some additional currencies, if needed. Notice that if no additional currencies are required, \texttt{other} reduces to just one zero bit.
\nxsubpoint\label{sp:msg.layout}\emb{Message layout}
A message consists of its {\em header\/} followed by its {\em body}, or {\em payload}. The body is essentially arbitrary, to be interpreted by the destination smart contract. The message header is standard and is organized as follows:
\begin{verbatim}
int_msg_info$0 ihr_disabled:Bool bounce:Bool
src:MsgAddressInt dest:MsgAddressInt
value:CurrencyCollection ihr_fee:Grams fwd_fee:Grams
created_lt:uint64 created_at:uint32 = CommonMsgInfo;
ext_in_msg_info$10 src:MsgAddressExt dest:MsgAddressInt
import_fee:Grams = CommonMsgInfo;
ext_out_msg_info$11 src:MsgAddressInt dest:MsgAddressExt
created_lt:uint64 created_at:uint32 = CommonMsgInfo;
tick_tock$_ tick:Bool tock:Bool = TickTock;
_ split_depth:(Maybe (## 5)) special:(Maybe TickTock)
code:(Maybe ^Cell) data:(Maybe ^Cell)
library:(Maybe ^Cell) = StateInit;
message$_ {X:Type} info:CommonMsgInfo
init:(Maybe (Either StateInit ^StateInit))
body:(Either X ^X) = Message X;
\end{verbatim}
The meaning of this scheme is as follows.
Type \texttt{Message $X$} describes a message with the body (or payload) of type $X$. Its serialization starts with \texttt{info} of type \texttt{CommonMsgInfo}, which comes in three flavors: for internal messages, inbound external messages, and outbound external messages, respectively. All of them have a source address \texttt{src} and destination address \texttt{dest}, which are external or internal according to the chosen constructor. Apart from that, an internal message may bear some \texttt{value} in Grams and other defined currencies (cf.~\ptref{sp:extra.curr}), and all messages generated inside the TON Blockchain have a logical creation time \texttt{created\_lt} (cf.~\ptref{sp:lt.ton.blkch}) and creation unixtime \texttt{created\_at}, both automatically set by the generating transaction. The creation unixtime equals the creation unixtime of the block containing the generating transaction.
\nxsubpoint\emb{Forwarding and IHR fees. Total value of an internal message}
Internal messages define an \texttt{ihr\_fee} in Grams, which is subtracted from the value attached to the message and awarded to the validators of the destination shardchain if they include the message by the IHR mechanism. The \texttt{fwd\_fee} is the original total forwarding fee paid for using the HR mechanism; it is automatically computed from some configuration parameters and the size of the message at the time the message is generated.
Notice that the total value carried by a newly-created internal outbound message equals the sum of \texttt{value}, \texttt{ihr\_fee}, and \texttt{fwd\_fee}. This sum is deducted from the balance of the source account. Of these components, only \texttt{value} is always credited to the destination account on message delivery. The \texttt{fwd\_fee} is collected by the validators on the HR path from the source to the destination, and the \texttt{ihr\_fee} is either collected by the validators of the destination shardchain (if the message is delivered via IHR), or credited to the destination account.
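The accounting rule above can be illustrated by a small sketch. It is deliberately simplified: all validator rewards are lumped into one total, and neither the split of \texttt{fwd\_fee} among the HR hops nor any unused part of it after an IHR delivery is modeled. The function name \texttt{settle\_internal\_message} and the returned dictionary keys are hypothetical.

```python
def settle_internal_message(value, ihr_fee, fwd_fee, delivered_via_ihr):
    """Split the total value of an internal message (all amounts in nanograms).

    Simplifying assumption: validator rewards are reported as one total.
    """
    source_debit = value + ihr_fee + fwd_fee
    # value always reaches the destination; ihr_fee does so only
    # when the message was NOT delivered via IHR.
    dest_credit = value + (0 if delivered_via_ihr else ihr_fee)
    return {
        "source_debit": source_debit,
        "dest_credit": dest_credit,
        "validators_total": source_debit - dest_credit,
    }
```

In either case the amount debited from the source equals the sum of what the destination and the validators receive.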
\nxsubpoint\emb{Code and data portions contained in a message}
Apart from the common message information stored in \texttt{info}, a message can contain portions of the destination smart contract's code and data. This feature is used, for instance, in the so-called {\em constructor messages\/} (cf.~\ptref{sp:constr.msg}), which are simply internal or inbound external messages with \texttt{code} and possibly \texttt{data} fields defined in their \texttt{init} portions. If the hash of these fields is correct, and the destination smart contract has no code or data, the values from the message are used instead.\footnote{More precisely, the information from the \texttt{init} field of an inbound message is used either when the receiving account is uninitialized or frozen with the hash of {\em StateInit} equal to the one expected by the account, or when the receiving account is active, and its code or data is an external hash reference matching the hash of the code or data received in the {\em StateInit} of the message.}
\nxsubpoint\emb{Using \texttt{code} and \texttt{data} for other purposes}
Workchains other than the masterchain and the basic workchain are free to use the trees of cells referred to in the \texttt{code}, \texttt{data}, and \texttt{library} fields for their own purposes. The messaging system itself makes no assumptions about their contents; they become relevant only when a message is processed by a transaction.
\nxsubpoint\emb{Absence of an explicit gas price and gas limit}
Notice that messages do not have an explicit gas price and gas limit. Instead, the gas price is set globally by the validators for each workchain (it is a special configurable parameter), and the gas limit for each transaction also has a default value, which is a configurable parameter; the smart contract itself may lower the gas limit during its execution if so desired.
For internal messages, the initial gas limit cannot exceed the Gram value of the message divided by the current gas price. For inbound external messages, the initial gas limit is very small, and the true gas limit is set by the receiving smart contract itself, when it {\em accepts\/} the inbound message by the corresponding TVM primitive.
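For internal messages, the bound above amounts to a single integer division. The sketch below assumes that the gas price is expressed in nanograms per gas unit and that the default limit acts as an upper cap; both the units and the name \texttt{initial\_gas\_limit} are assumptions of this example.

```python
def initial_gas_limit(msg_nanograms, gas_price, default_gas_limit):
    """Initial gas limit for a transaction triggered by an internal message.

    Assumed units: msg_nanograms in nanograms, gas_price in nanograms per gas unit.
    The limit cannot exceed what the message value can pay for.
    """
    if gas_price <= 0:
        raise ValueError("gas price must be positive")
    return min(default_gas_limit, msg_nanograms // gas_price)
```

A message carrying too little value therefore starts with a proportionally smaller gas limit, while a large value is still capped by the configured default.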
\nxsubpoint\emb{Deserialization of a message payload}
The payload, or body, of a message is deserialized by the receiving smart contract when executed by TVM. The messaging system itself makes no assumptions about the internal format of the payload. However, it makes sense to describe the serialization of supported inbound messages by TL or TL-B schemes with 32-bit constructor tags, so that the developers of other smart contracts will know the interface supported by a specific smart contract.
A message is always serialized inside the blockchain as the last field in a cell. Therefore, the blockchain software may assume that whatever bits and references are left unparsed after parsing the fields of a \texttt{Message} preceding \texttt{body} belong to the payload $\texttt{body}:X$, without knowing anything about the serialization of the type~$X$.
\nxsubpoint\emb{Messages with empty payloads}
The payload of a message may happen to be an empty cell slice, having no data bits and no references. By convention, such messages are used for simple value transfers. The receiving smart contract is normally expected to process such messages quietly and to terminate successfully (with a zero exit code), although some smart contracts may perform non-trivial actions even when receiving a message with empty payload. For example, a smart contract may check the resulting balance, and, if it becomes sufficient for a previously postponed action, trigger this action. Alternatively, the smart contract might want to remember in its persistent storage the amount received and the corresponding sender, in order, for instance, to distribute some tokens later to each sender proportionally to the funds transferred.
Notice that even if a smart contract makes no special provisions for messages with empty payloads and throws an exception while processing such messages, the received value (minus the gas payment) will still be added to the balance of the smart contract.
\nxsubpoint\emb{Message source address and logical creation time determine its generating block}
Notice that {\em the source address and the logical creation time of an internal or an outbound external message uniquely determine the block in which the message has been generated}. Indeed, the source address determines the source shardchain, and the blocks of this shardchain are assigned non-intersecting logical time intervals, so only one of them may contain the indicated logical creation time. This is the reason why no explicit mention of the generating block is needed in messages.
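Because the logical time intervals of the blocks of one shardchain do not intersect, the generating block can be found by a binary search. The sketch below assumes that the blocks of the source shardchain are given as a list sorted by start time, with half-open intervals; this representation and the name \texttt{find\_generating\_block} are hypothetical.

```python
from bisect import bisect_right


def find_generating_block(blocks, created_lt):
    """Find the block whose logical time interval contains created_lt.

    blocks: list of (start_lt, end_lt, block_id) for the source shardchain,
    sorted by start_lt, with non-intersecting half-open intervals
    [start_lt, end_lt) (assumed representation).
    Returns the block_id, or None if no block covers created_lt.
    """
    starts = [start for start, _, _ in blocks]
    i = bisect_right(starts, created_lt) - 1
    if i >= 0 and blocks[i][0] <= created_lt < blocks[i][1]:
        return blocks[i][2]
    return None
```

The source address selects which shardchain's block list to search; the logical creation time then selects at most one block in it.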
\nxsubpoint\label{sp:tl.msg.env}\emb{Enveloped messages}
{\em Message envelopes\/} are used for attaching routing information, such as the current (transit) address and the next-hop address, to inbound, transit, and outbound messages (cf.~\ptref{sp:msg.env}). The message itself is kept in a separate cell and referred to from the message envelope by a cell reference.
\begin{verbatim}
interm_addr_regular$0 use_src_bits:(#<= 96)
= IntermediateAddress;
interm_addr_simple$10 workchain_id:int8 addr_pfx:(64 * Bit)
= IntermediateAddress;
interm_addr_ext$11 workchain_id:int32 addr_pfx:(64 * Bit)
= IntermediateAddress;
msg_envelope cur_addr:IntermediateAddress
next_addr:IntermediateAddress fwd_fee_remaining:Grams
msg:^(Message Any) = MsgEnvelope;
\end{verbatim}
The \texttt{IntermediateAddress} type is used to describe the intermediate addresses of a message, namely its current (or transit) address \texttt{cur\_addr} and its next-hop address~\texttt{next\_addr}. The first constructor \texttt{interm\_addr\_regular} represents the intermediate address using the optimization described in~\ptref{sp:repr.interm.addr}, by storing the number of the first bits of the intermediate address that are the same as in the source address; the other two explicitly store the workchain identifier and the first 64 bits of the address inside that workchain (the remaining bits can be taken from the source address). The \texttt{fwd\_fee\_remaining} field is used to explicitly represent the maximum amount of message forwarding fees that can be deducted from the message value during the remaining HR steps; it cannot exceed the value of \texttt{fwd\_fee} indicated in the message itself.
\mysubsection{Inbound message descriptors}
This section discusses {\em InMsgDescr}, the structure containing a description of all inbound messages imported into a block.\footnote{Strictly speaking, {\em InMsgDescr\/} is the {\em type\/} of this structure; we deliberately use the same notation to describe the only instance of this type in a block.}
\nxsubpoint\label{sp:inb.msg.classes}\emb{Types and sources of inbound messages}
Each inbound message mentioned in {\em InMsgDescr\/} is described by a value of type {\em InMsg\/} (an ``inbound message descriptor''), which specifies the source of the message, the reason for its being imported into this block, and some information about its ``fate''---its processing by a transaction or forwarding inside the block.
Inbound messages may be classified as follows:
\begin{itemize}
\item {\em Inbound external messages} --- Need no additional reason for being imported into the block, but must be immediately processed by a transaction in the same block.
\item {\em Internal IHR messages with destination addresses in this block} --- The reason for their being imported into the block includes a Merkle proof of their generation (i.e., their inclusion in {\em OutMsgDescr\/} of their original block). Such a message must be immediately delivered to its final destination and processed by a transaction.
\item {\em Internal messages with destinations in this block} --- The reason for their inclusion is their presence in {\em OutMsgQueue\/} of the most recent state of a neighboring shardchain,\footnote{Recall that a shardchain is considered a neighbor of itself.} or their presence in {\em OutMsgDescr} of this very block. This neighboring shardchain is completely determined by the transit address indicated in the forwarded message envelope, which is replicated in {\em InMsg\/} as well. The ``fate'' of this message is again described by a reference to the processing transaction inside the current block.
\item {\em Immediately routed internal messages} --- Essentially a subclass of the previous class of messages. In this case, the imported message is one of the outbound messages generated in this very block.
\item {\em Transit internal messages} --- Have the same reason for inclusion as the previous class of messages. However, they are not processed inside the block, but internally forwarded into {\em OutMsgDescr\/} and {\em OutMsgQueue}. This fact, along with a reference to the new envelope of the transit message, must be registered in {\em InMsg}.
\item {\em Discarded internal messages with destinations in this block} --- An internal message with a destination in this block may be imported and immediately discarded instead of being processed by a transaction if it has already been received and processed via IHR in a preceding block of this shardchain. In this case, a reference to the previous processing transaction must be provided.
\item {\em Discarded transit internal messages} --- Similarly, a transit message may be discarded immediately after import if it has already been delivered via IHR to its final destination. In this case, a Merkle proof of its processing in the final block (as an IHR message) is required.
\end{itemize}
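The classification above can be summarized as a decision procedure. The sketch below is only a reading aid: the boolean inputs are hypothetical flags that a validator would have to derive from the message, its envelope, and the block contents, and the names \texttt{InMsgKind} and \texttt{classify\_inbound} do not correspond to actual constructors of {\em InMsg}.

```python
from enum import Enum


class InMsgKind(Enum):
    EXTERNAL = "inbound external"
    IHR = "internal IHR, destination in this block"
    FINAL = "internal, destination in this block"
    IMMEDIATE = "immediately routed internal"
    TRANSIT = "transit internal"
    DISCARDED_FINAL = "discarded internal, destination in this block"
    DISCARDED_TRANSIT = "discarded transit internal"


def classify_inbound(is_external, via_ihr, dest_in_block,
                     generated_in_block, already_delivered_via_ihr):
    """Map hypothetical boolean flags to one of the seven classes listed above."""
    if is_external:
        return InMsgKind.EXTERNAL
    if via_ihr:
        if not dest_in_block:
            raise ValueError("IHR messages are imported only at their destination")
        return InMsgKind.IHR
    if already_delivered_via_ihr:
        return InMsgKind.DISCARDED_FINAL if dest_in_block else InMsgKind.DISCARDED_TRANSIT
    if not dest_in_block:
        return InMsgKind.TRANSIT
    return InMsgKind.IMMEDIATE if generated_in_block else InMsgKind.FINAL
```

The order of the tests mirrors the list: external and IHR imports are recognized first, then the two discard cases, and only then the ordinary HR cases.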