\documentclass[12pt,oneside]{article}
\usepackage{xeCJK}
\setCJKmainfont{SimSun}
\usepackage[T1]{fontenc}
%\usepackage{euler}
\usepackage{amssymb, amsmath, amsfonts, stmaryrd}
\usepackage[mathscr]{euscript}
\usepackage{mathrsfs}
\usepackage{theorem}
\usepackage[english]{babel}
\usepackage{bm}
\usepackage[all]{xy}
\usepackage{array}
\usepackage{multirow}
%\usepackage{chngcntr}
%\CompileMatrices
\usepackage[bookmarks=false,pdfauthor={Nikolai Durov},pdftitle={Catchain Consensus: An Outline}]{hyperref}
\usepackage{fancyhdr}
\usepackage{caption}
%
\setlength{\headheight}{15.2pt}
\pagestyle{fancy}
\renewcommand{\headrulewidth}{0.5pt}
%
\def\makepoint#1{\medbreak\noindent{\bf #1.\ }}
\def\zeropoint{\setcounter{subsection}{-1}}
\def\zerosubpoint{\setcounter{subsubsection}{-1}}
\def\nxpoint{\refstepcounter{subsection}%
\smallbreak\makepoint{\thesubsection}}
\def\nxsubpoint{\refstepcounter{subsubsection}%
\smallbreak\makepoint{\thesubsubsection}}
\def\nxsubsubpoint{\refstepcounter{paragraph}%
\makepoint{\paragraph}}
%\setcounter{secnumdepth}{4}
%\counterwithin{paragraph}{subsubsection}
\def\refpoint#1{{\rm\textbf{\ref{#1}}}}
\let\ptref=\refpoint
\def\embt(#1.){\textbf{#1.}}
\def\embtx(#1){\textbf{#1}}
\def\emb#1{\textbf{#1.}}
\long\def\nodo#1{}
%
%\def\markbothsame#1{\markboth{#1}{#1}}
\fancyhf{}
\fancyfoot[C]{\thepage}
\def\markbothsame#1{\fancyhead[C]{#1}}
\def\mysection#1{\section{#1}\fancyhead[C]{\textsc{Chapter \textbf{\thesection.} #1}}}
\def\mysubsection#1{\subsection{#1}\fancyhead[C]{\small{\textsc{\textrm{\thesubsection.} #1}}}}
\def\myappendix#1{\section{#1}\fancyhead[C]{\textsc{Appendix \textbf{\thesection.} #1}}}
%
\let\tp=\textit
\let\vr=\textit
\def\opsc#1{\operatorname{\textsc{#1}}}
\def\Max{\operatorname{Max}}
\def\Sha{\opsc{sha256}}
\def\SHA#1{\opsc{sha#1}}
\def\Int{\opsc{int}}
\def\height{\opsc{height}}
\def\LT{\opsc{Lt}}
\def\VT{\opsc{Vt}}
\def\Submit{\opsc{Submit}}
\def\Approve{\opsc{Approve}}
\def\Reject{\opsc{Reject}}
\def\PreCommit{\opsc{PreCommit}}
\def\CommitSign{\opsc{CommitSign}}
\def\Vote{\opsc{Vote}}
\def\VoteFor{\opsc{VoteFor}}
\def\wround{\vr{round}}
\def\wattempt{\vr{attempt}}
\def\wcandidate{\vr{candidate}}
\def\wsignature{\vr{signature}}
\def\bbB{{\mathbb{B}}}
\def\bbP{{\mathbb{P}}}
\def\bbF{{\mathbb{F}}}
\def\bbZ{{\mathbb{Z}}}
\def\bbN{{\mathbb{N}}}
\def\st#1{{\mathbf{#1}}}
\def\sgn{\operatorname{sgn}}
\def\caret{\^{}}
\def\cF{\mathscr{F}}
%
\hfuzz=0.8pt
\title{Catchain Consensus: An Outline}
\author{Nikolai Durov\\
TL: Dr Awesome Doge}
\begin{document}
%\pagestyle{myheadings}
\maketitle
\begin{abstract}
本文的目的是提供 Catchain 共識協議的概述,它是一種特別為 TON 區塊鏈中的塊生成和驗證而製定的拜占庭容錯(BFT)協議~\cite{TON}。該協議可能也可以用於 PoS 區塊鏈中的其他目的,但目前的實現僅使用了某些僅對這個特定問題有效的優化。
\end{abstract}
\clearpage
\tableofcontents
\clearpage
\mysection{Overview}\label{sect:overview}
Catchain 共識協議建立在 TON Network (\cite{TON}) 的疊加網路構建協議和疊加網路廣播協議之上。Catchain 共識協議本身可以分解為兩個獨立的協議,一個更低層次和通用的協議({\em Catchain 協議\/}\footnote{在研究和開發階段的初期,該協議的原始名稱是{\em catch-chain}或{\em catchchain},因為它本質上是一個專門用於捕獲共識協議中所有重要事件的特殊{\em 鏈};在多次說和寫這個名字之後,它逐漸縮短為“catchain”。}),另一個是高層次的{\em 塊共識協議(BCP)},它使用了 Catchain 協議。在 TON 協議棧中,更高的層次被塊生成和驗證層佔據;然而,所有這些都基本上在一個(邏輯)機器上本地執行,而新生成的塊達成共識的問題被委派給 Catchain 協議層。
以下是 TON 區塊生成和分佈所使用的協議堆棧的近似圖表,顯示了 Catchain Consensus 協議(或其兩個組件協議)的正確位置:
\begin{itemize}
\item {\it 頂層:} 區塊生成和區塊驗證軟體,邏輯上運行在獨立的邏輯機器上,所有的輸入和輸出都由較低級的協議處理。這個軟體的工作是生成區塊鏈(shardchain 或 TON 區塊鏈的主鏈;有關 shardchain 和主鏈的討論,請參見 \cite{TON}),或檢查由他人生成的區塊的有效性。
\item {\it (TON) 區塊共識協議:} 在主鏈或 shardchain 的當前驗證者組中實現(拜占庭容錯)共識,以接受下一個區塊。這一級別利用區塊生成和驗證軟體的(抽象接口),並在較低級的 Catchain 協議之上構建。有關此協議的詳細信息,請參見第 \ptref{sect:blk.consensus} 節。
\item {\it Catchain 協議:} 在疊加網絡中提供安全的持久性廣播(例如,特定 shardchain 的驗證者任務組或用於在此 shardchain 或主鏈中生成、驗證和傳播新區塊的主鏈),並檢測一些參與者的「欺詐」(協議違規)嘗試。有關此協議的詳細信息,請參見第 \ptref{sect:catchain} 節。
\item {\it (TON Network) 疊加廣播協議:} 是 TON Network 中疊加網絡的一個簡單盡力而為的廣播協議,如 \cite{TON} 所述。只需將接收到的廣播訊息廣播到在同一個疊加網絡中且尚未收到這些訊息副本的所有鄰居,並盡最小的努力在短時間內保留未傳遞的廣播訊息的副本。
\item {\it (TON Network) 疊加協議:} 在 ADNL 協議網絡中創建疊加網絡(參見 \cite{TON}),管理這些疊加網絡的鄰居列表。疊加網絡的每個參與者在同一個疊加網絡中跟踪多個鄰居,並保持專門的 ADNL 連接(稱為「通道」),以便傳入訊息可以以最小的開銷高效地廣播到所有鄰居。
\item {\it 抽象數據報文網絡層(ADNL)協議:} 是 TON Network 的基本協議,僅使用 256 位抽象(ADNL)地址識別僅由其加密金鑰(或其哈希)識別的網絡節點之間的封包(數據報文)。
\end{itemize}
本文旨在僅描述此套件中的第二個和第三個協議,即 (TON) 區塊共識協議和 (TON) Catchain 協議。
我們在此指出,本文的作者雖然提供了此協議應該如何設計的一般指南(基於「讓我們創建一個 BFT-hardened 群組廣播訊息系統,並在此系統之上運行適當適應的簡單的兩階段或三階段提交協議」的思路),並參與了協議的開發和實現過程中的多次討論,但絕不是此協議以及其當前實現的唯一設計者。這是多個人的工作。
對於合併的 Catchain Consensus 協議的效率,我們有幾句話要說。首先,它是一個真正的拜占庭容錯 (Byzantine Fault Tolerant, BFT) 協議,即使一些參與者 (驗證者) 表現出任意惡意行為,只要這些惡意參與者少於驗證者總數的三分之一,它也能夠最終達成對於區塊鏈上下一個有效區塊的共識。已經廣泛認識到,如果參與者中至少有三分之一是惡意的,那麼實現 BFT 共識是不可能的 (參見 \cite{Byzantine}),因此,在這方面,Catchain Consensus 協議在理論上是最好的。
其次,當 Catchain Consensus 協議首次在全球分佈的多達 300 個節點上實現並進行測試時 (在 2018 年 12 月),即使一些節點未能參與或表現不正確的行為,它也能夠在 300 個節點上在 6 秒內達成新區塊的共識,在 100 個節點上則在 4-5 秒內完成 (在 10 個節點上則為 3 秒)。\footnote{當惡意、未參與或非常緩慢的驗證者的比例增長到三分之一時,協議會出現優雅的退化,區塊共識時間增長非常緩慢,最多增長半秒,直到幾乎達到三分之一的臨界值。} 由於 TON 區塊鏈任務組不會超過 100 個驗證者 (即使總共有一千或一萬個驗證者在運行,只有 100 個擁有最大股份的驗證者將生成新的主鏈區塊,其他驗證者只會參與新的分片鏈區塊的生成,每個分片鏈區塊由 10-30 個驗證者生成和驗證;當然,這裡給出的所有數字都是配置參數 (參見 \cite{TON} 和 \cite{TBC}),如果需要,可以通過驗證者的共識投票進行調整),這意味著 TON 區塊鏈能夠按照最初的計劃每 4-5 秒生成新區塊。這個承諾在幾個月後 (2019 年 3 月) 推出的 TON 區塊鏈測試網絡中得到了進一步的驗證和履行。因此,我們可以看出,Catchain Consensus 協議是實用 BFT 協議不斷增長家族中的新成員 (參見 \cite{PBFT}),即使它基於略有不同的原則。
\clearpage
\mysection{Catchain 協議}\label{sect:catchain}
在概述部分(參見\ptref{sect:overview})中,我們已經解釋了 TON 區塊鏈所使用的 BFT 共識協議,以實現對新區塊鏈塊的共識。該共識協議由兩個協議組成。在此,我們提供了對這兩個協議中的低層協議——Catchain 協議的簡要描述。該協議除了用於 BFT 共識協議外,還有可能用於其他用途。Catchain 協議的源代碼位於源代碼目錄下的 {\tt catchain} 子目錄中。
\nxpoint\emb{Prerequisites for running the Catchain protocol}\label{p:cc.prereq}
運行 Catchain 協議(一個實例)的主要前提條件是有參與(或被允許參與)該協議實例的所有節點的有序列表。該列表包含所有參與節點的公鑰和 ADNL 地址。在創建 Catchain 協議實例時,需要從外部提供此列表。
\nxpoint\emb{Nodes participating in the block consensus protocol}\label{p:cc.nodes}
對於為 TON 區塊鏈的某一個區塊鏈(即主區塊鏈或活動分片鏈)創建新區塊的特定任務,會創建一個由多個驗證者組成的特殊任務組。此任務組成員的列表既用於在 ADNL 內創建私有疊加網絡(這意味著只有在其創建期間明確列出的節點才能加入此疊加網絡),也用於運行相應的 Catchain 協議實例。
成員列表的構建是整個協議堆棧(即區塊創建和驗證軟件)的高層級責任,因此不是本文的主題(\cite{TBC} 是更適當的參考文獻)。目前只需要知道,此列表是當前(最近)主區塊鏈狀態(尤其是配置參數的當前值,例如所有選舉用於創建新區塊的驗證者的活動列表以及其各自的權重)的一個確定性函數。由於列表是確定性計算的,所有驗證者計算出相同的列表,特別是每個驗證者知道自己參與的任務組(即 Catchain 協議實例),無需進一步進行網絡通信或協商。\footnote{如果某些驗證者的主區塊鏈狀態已過時,它們可能無法計算出正確的任務組列表,也無法參與相應的 Catchain;在這方面,它們被視為惡意或故障,只要不到三分之一的所有驗證者以此方式失敗,就不會影響整個 BFT 協議的有效性。}
\nxsubpoint\emb{Catchains are created in advance}
實際上,不僅計算了上述列表的當前值,還計算了它們的立即後續(未來)值,以便提前創建 Catchain。這樣,當驗證者任務組的新實例需要創建第一個區塊時,Catchain 已經準備就緒。
\nxpoint\emb{The genesis block and the identifier of a catchain}\label{sp:cc.ident}
一個 {\em catchain}(即 Catchain 協議的一個實例)的特徵是它的 {\em 創世塊} 或 {\em 創世消息}。它是一個簡單的數據結構,包含一些魔數、catchain 的目的(例如要生成區塊的分片鏈的標識符,以及所謂的 {\em catchain 序列號},也是從主區塊鏈配置中獲得,用於區分生成“相同”的分片鏈的後續 catchain 實例,但可能具有不同的參與驗證者),最重要的是所有參與節點的列表(其 ADNL 地址和 Ed25519 公鑰,如 \ptref{p:cc.prereq} 所述)。Catchain 協議本身僅使用此列表和整個數據結構的 $\Sha$ 哈希值;該哈希值用作 catchain 的內部標識符,即該 Catchain 協議的特定實例的標識符。
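為了更直觀地說明「catchain 標識符就是創世塊序列化後的 $\Sha$ 哈希」這一點,下面給出一段簡化的 Python 示意代碼;其中的字段名、魔數與序列化方式均為示意性假設,並非實際實現所用的格式:
\begin{verbatim}
# Sketch only: field names, the magic number and the serialization are
# hypothetical; the actual implementation uses its own format.
import hashlib, json

def catchain_id(genesis: dict) -> bytes:
    data = json.dumps(genesis, sort_keys=True).encode()
    return hashlib.sha256(data).digest()      # "Sha" of the whole structure

genesis = {
    "magic": 0x0CA7C4A1,                      # hypothetical magic number
    "shard": "0:8000000000000000",            # purpose: shard to generate
    "cc_seqno": 17,                           # catchain sequence number
    "nodes": [["adnl_addr_1", "ed25519_pub_1"],   # ordered member list
              ["adnl_addr_2", "ed25519_pub_2"]],  # (placeholders)
}
print(catchain_id(genesis).hex())             # internal catchain identifier
\end{verbatim}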
\nxsubpoint\emb{Distribution of the genesis block}
請注意,創世塊不會在參與節點之間分發;相反,每個參與節點都會獨立計算,如 \ptref{p:cc.nodes} 所述。由於創世塊的哈希用作 catchain 的標識符(即 Catchain 協議的特定實例的標識符;參見\ptref{sp:cc.ident}),如果節點(意外或故意地)計算出不同的創世塊,它將被有效地鎖定,無法參與“正確”的協議實例。
\nxsubpoint\emb{List of nodes participating in a catchain}請注意,參與 Catchain 的節點(有序)列表在創世塊中固定,因此它對所有參與者都是已知的,並且可以通過創世塊的哈希(即 catchain 標識符)明確確定,前提是 $\Sha$ 沒有(已知的)碰撞。因此,在下面討論一個特定的 catchain 時,我們固定參與節點的數量 $N$,並假設節點從 $1$ 到 $N$ 編號(可以使用此範圍內的索引在參與者列表中查找它們的真實身份)。所有參與者的集合將用 $I$ 表示;我們假設 $I=\{1\ldots N\}$。
\nxpoint\emb{Messages in a catchain. Catchain as a process group}
一個觀點是,Catchain 是由 $N$ 個已知且固定的「(通信)進程」(或前面所述的「節點」)組成的「(分布式)進程組」,這些進程生成「廣播消息」,最終廣播給進程組中的所有成員。所有進程的集合由 $I$ 表示;通常假設 $I=\{1\ldots N\}$。每個進程生成的廣播從一開始編號,因此進程 $i$ 的第 $n$ 個廣播會收到「序列號」或「高度」$n$;每個廣播都應該由發起進程的身份或索引 $i$ 和其高度 $n$ 唯一確定,因此我們可以將 $(i,n)$ 視為進程組內廣播消息的自然標識符。\footnote{在 Catchain 的拜占庭環境中,這在某些情況下不一定是真的。} 由同一進程 $i$ 生成的廣播消息預計以完全相同的順序傳送到每個其他進程,即按照它們的高度遞增的順序。在這方面,Catchain 和 \cite{Birman} 或 \cite{DistrSys} 中的進程組非常相似。其主要區別在於,Catchain 是進程組的「強化」版本,能夠容忍某些參與者的可能的拜占庭(任意惡意)行為。
\nxsubpoint\emb{Dependence relation on messages}
可以在進程組中廣播的所有消息上引入一個「依賴關係」。該關係必須是一個嚴格的偏序 $\prec$,滿足 $m_{i,k}\prec m_{i,k+1}$ 的性質,其中 $m_{i,k}$ 表示具有索引 $i$ 的進程成員廣播的第 $k$ 條消息。$m\prec m'$ 的含義是 {\em $m'$ 依賴於 $m$},因此(廣播)消息 $m'$ 只能在 $m$ 被處理(由進程組的成員處理)之後才能被處理。例如,如果消息 $m'$ 表示進程組成員對另一個消息 $m$ 的反應,則將 $m\prec m'$ 設置為自然的。如果進程組的一個成員在其依賴關係中的消息 $m\prec m'$ 都被處理(或已被高級協議「交付」)之前收到了消息 $m'$,則其處理(或「交付」)被延遲,直到所有依賴項都被交付。
我們將依賴關係定義為嚴格偏序,因此它必須是可傳遞性的($m''\prec m'$ 和 $m'\prec m$ 意味著 $m''\prec m$),反對稱的(對於任何兩條消息 $m$ 和 $m'$,$m'\prec m$ 和 $m\prec m'$ 最多只能有一條成立)和反自反的($m\prec m$ 從未成立)。如果我們有一個更小的「基本依賴關係」集合 $m'\to m$,則可以構造其可傳遞性閉包 $\to^+$,並將 $\prec:=\to^+$。唯一的其他要求是每個發送者的廣播都依賴於相同發送者的所有先前廣播。嚴格來說,不一定需要假設這一點;然而,這種假設非常自然且大大簡化了進程組內部消息系統的設計,因此 Catchain 協議採用了這個假設。
\nxsubpoint\label{sp:dep.cone}\emb{Dependence set or cone of a message}
假設 $m$ 是如上所述的進程組中的(廣播)消息。我們稱集合 $D_m:=\{m'\,:\,m'\prec m\}$ 為消息 $m$ 的 {\em 依賴集合\/} 或 {\em 依賴錐\/}。換句話說,$D_m$ 是在所有消息的有限偏序集合中由 $m$ 生成的 {\em 主理想\/}。它正是在 $m$ 交付之前必須交付的所有消息的集合。
\nxsubpoint\label{sp:ext.dep.cone}\emb{Extended dependence cone of a message}
我們還通過 $D^+_m:=D_m\cup\{m\}$ 定義 $m$ 的 {\em 擴展依賴錐}。
\nxsubpoint\emb{Cones, or ideals with respect to $\prec$}
更一般地,我們稱消息的子集 $D$ 為 {\em 錐},如果它對於依賴關係 $\prec$ 是一個理想,即如果 $m\in D$ 和 $m'\prec m$ 則意味著 $m'\in D$。當然,任何消息 $m$ 的依賴錐 $D_m$ 和擴展依賴錐 $D^+_m$ 都是錐(因為在偏序集合中的任何主理想都是一個理想)。
\nxsubpoint\emb{Identification of cones with the aid of vector time}
回想一下,我們已經假設任何消息都依賴於同一發送者的所有先前消息,即對於任何 $i\in I$ 和任何 $s>0$,滿足 $m_{i,s+1}$ 存在,有 $m_{i,s}\prec m_{i,s+1}$。這意味著任何錐 $D$ 都由 $N$ 個由 $i\in I$ 索引的值 $\VT(D)_i$ 完全描述:
\begin{equation}
\VT(D)_i:=\sup\{s\in\bbN\,:\,m_{i,s}\in D\}=\inf\{s\in\bbN_0\,:\,m_{i,s+1}\not\in D\}
\end{equation}
(如果沒有消息 $m_{i,s}$ 在 $D$ 中,我們設置 $\VT(D)_i:=0$)。實際上,很明顯:
\begin{equation}
m_{i,s}\in D\Leftrightarrow s\leq\VT(D)_i
\end{equation}
We say that the vector $\VT(D)=(\VT(D)_i)_{i\in I}\in\bbN_0^I$ with non-negative components $\VT(D)_i$ is the {\em vector time\/} or {\em vector timestamp\/} corresponding to cone~$D$ (cf.~\cite{Birman} or \cite{DistrSys} for a more detailed discussion of vector time).
\nxsubpoint\emb{Partial order on vector timestamps}
我們引入一個偏序 $\leq$ 在所有可能的向量時間 $\bbN_0^I$ 上,這是 $\bbN_0$ 上的傳統順序的乘積:
\begin{equation}
{\bm x}=(x_i)_{i\in I}\leq{\bm y}=(y_i)_{i\in I}\quad\text{iff}\quad x_i\leq y_i\quad\text{for all $i\in I$}
\end{equation}
很明顯,$D\subset D'$ 當且僅當 $\VT(D)\leq\VT(D')$;因此,$\VT$ 是由所有消息中的全部錐構成的集合到 $\bbN_0^I$ 的嚴格保序嵌入。
\nxsubpoint\emb{Vector timestamp $\VT(m)$ of a message $m$}
對於任何消息 $m$,我們定義其向量時間戳 $\VT(m)$ 為 $\VT(D_m)$。換句話說,只有在傳送者 $j$ 生成的前 $\VT(m)_j$ 個消息被傳遞後,才能傳遞消息 $m$,對於所有 $j\in I$ 都是如此。
如果 $i$ 是消息 $m$ 的發送者,$s$ 是消息 $m$ 的高度,使得 $m=m_{i,s}$,那麼 $\VT(m)_i=s-1$。我們可以通過設置 $\VT^+(m)_j=\VT(m)_j$($j\neq i$),$\VT^+(m)_i=\VT(m)_i+1=s$ 定義消息 $m$ 的{\em 調整向量時間戳 $\VT^+(m)$}。或者,$\VT^+(m)=\VT(D^+_m)$,其中 $D^+_m:=D_m\cup\{m\}$ 是消息 $m$ 的{\em 擴展依賴錐}(參見\ptref{sp:ext.dep.cone})。
注意,對於 $m'\preceq m$,當且僅當 $D^+_{m'}\subset D^+_m$,當且僅當 $\VT^+(m')\leq\VT^+(m)$ 在 $\bbN_0^I$ 中成立,其中 $m'\preceq m$ 的意思是“$m'\prec m$ 或 $m'=m$”。同樣地,$m'\prec m$ 當且僅當 $D^+_{m'}\subset D_m$,當且僅當 $\VT^+(m')\leq\VT(m)$。換句話說,{\em 消息(某些或全部)之間的依賴關係 $\prec$ 完全由這些消息的調整向量時間戳確定。}
\nxsubpoint\emb{Using vector timestamps to correctly deliver broadcast messages}
在非拜占庭設置中,向量時間戳可以用於正確傳遞進程組中廣播的消息。\footnote{我們假設進程組中的所有廣播消息都是“因果廣播”或“cbcast”(如\cite{Birman}中的術語),因為我們只需要 cbcast 來實現 Catchain 協議和 Catchain 共識。}具體來說,假設每個廣播消息 $m=m_{i,s}$ 包含其發送者 $i$ 的索引和此消息的向量時間戳 $\VT(m)$。然後,每個接收者 $j$ 都知道消息是否可以被傳遞。為此,$j$ 保持了到目前為止傳遞的所有消息的錐 $C_j$,例如通過維護與 $\VT(C_j)$ 相等的 {\em 當前時間戳} $\VT(j)$。換句話說,$\VT(j)_k$ 是到目前為止 $j$ 處理的發送者 $k$ 的消息計數。如果 $\VT(m)\leq\VT(j)$,則消息 $m$ 立即傳遞,然後 $\VT(j)$ 更新為 $\sup(\VT(j),\VT^+(m))$;這相當於將消息 $m$ 的原始發送者 $i$ 的 $\VT(j)_i$ 增加一。如果不滿足此條件,則 $m$ 可能會被放入等待隊列中,直到 $\VT(j)$ 變得足夠大。除了被動地等待所需的廣播消息外,$j$ 還可以構建消息索引列表 $(i',s')$,這些消息在某個接收但未傳遞的消息 $m$ 的 $\VT(m)$ 中被隱含提到,並向從中 $j$ 得知 $m$ 和 $\VT(m)$ 的鄰居請求這些消息;另一種策略(實際上是當前 Catchain 協議實現所使用的策略)是定期從隨機選擇的鄰居那裡請求這些消息。後一種策略更簡單,因為它不需要記住所有接收到的消息的直接來源(這些來源可能無論如何都不可用)。
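作為補充說明,下面用一段極簡的 Python 示意代碼概括上述(非拜占庭情形下的)基於向量時間戳的交付規則;其中的 Message 結構與交付回調均為假設,僅用於說明「$\VT(m)\leq\VT(j)$ 時立即交付,否則進入等待隊列」的邏輯:
\begin{verbatim}
# Minimal sketch of vector-timestamp based delivery (non-byzantine case).
# The Message layout and the delivery callback are assumptions, not the
# actual Catchain implementation.
from collections import deque, namedtuple

Message = namedtuple("Message", "sender height vt")   # vt = VT(m)

class Receiver:
    def __init__(self, n):
        self.vt = [0] * n        # VT(j): per-sender count of delivered messages
        self.pending = deque()   # received but not yet deliverable messages

    def can_deliver(self, m):
        # deliverable iff VT(m) <= VT(j) componentwise
        return all(x <= y for x, y in zip(m.vt, self.vt))

    def deliver(self, m):
        # VT(j) := sup(VT(j), VT+(m)); only component m.sender can change
        self.vt[m.sender] = max(self.vt[m.sender], m.height)
        # ... hand m over to the higher-level protocol here ...

    def on_receive(self, m):
        self.pending.append(m)
        progress = True
        while progress:          # deliver everything that became ready
            progress = False
            for x in list(self.pending):
                if self.can_deliver(x):
                    self.pending.remove(x)
                    self.deliver(x)
                    progress = True

r = Receiver(3)
r.on_receive(Message(sender=0, height=1, vt=[0, 0, 0]))  # delivered at once
r.on_receive(Message(sender=1, height=1, vt=[2, 0, 0]))  # waits for m_{0,2}
\end{verbatim}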
\nxpoint\emb{Message structure in a catchain. Catchain as a multi-blockchain}在 Catchain 中,由於需要支持 BFT 協議,消息結構比上述描述複雜一些。特別是,在拜占庭設置中,向量時間戳是不夠的。它們必須通過依賴錐的極大元素的描述來補充(這些描述通常僅在過程組非常大時使用,以至於向量時間戳的大小變得不可行)。
\nxsubpoint\emb{Describing cones by means of their maximal elements}描述消息錐 $D$ 的另一種方法(與使用向量時間戳不同)是列出其所有的 {\em 極大元素} $\Max(D)$,即 $m\in D$ 且對於任何 $m'\in D$,均不滿足 $m\prec m'$。當然,為了使這種表示法實用,需要一種適合的方式來引用消息而不完全包含它們。
\nxsubpoint\emb{Message identifiers inside a catchain}Catchain 協議使用消息的(適當序列化的){\em $\Sha$} 哈希作為它們的唯一標識符。如果我們假設 $\Sha$ 沒有碰撞(可在合理的時間內計算,例如多項式時間),那麼消息 $m$ 就可以通過其哈希 $\Sha(m)$ 在進程組內被完全識別。
\nxsubpoint\emb{Message headers}\label{sp:msg.hdr}在 Catchain(即 Catchain 協議的一個實例)中,一條消息 $m=m_{i,s}$ 的標頭始終包含其發送者的索引 $i$、高度 $s$、Catchain 識別符(即起源消息的哈希,參見 \ptref{sp:cc.ident})和依賴錐的極大元素的哈希集合,即集合 $\{\Sha(m')\,:\,m'\in\Max(D_m)\}$。特別地,由於 $m_{i,s-1}\in\Max(D_m)$(如果 $s>1$),因此前一條相同發送者的消息 $m_{i,s-1}$ 的哈希 $\Sha(m_{i,s-1})$ 總是包含在內;出於性能原因,在消息標頭中還有一個單獨的字段包含 $\Sha(m_{i,s-1})$。如果 $s=1$,那麼就沒有前一條消息,因此使用起源消息的哈希(即 Catchain 識別符,參見 \ptref{sp:cc.ident})代替。
消息標頭並不包含向量時間戳 $\VT(m)$;然而,標頭隱含地決定了 $\VT(m)$,因為:
\begin{equation}
\VT(m)=\sup_{m'\in D_m}\VT^+(m')=\sup_{m'\in\Max(D_m)}\VT^+(m')
\end{equation}
請注意,消息標頭是消息的一部分,特別地,消息的哈希(即消息識別符)取決於標頭中列出的所有數據。因此,我們假設如果 $\Sha$ 沒有已知的碰撞,則消息識別符隱含地確定了相應消息的所有依賴關係。
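下面的 Python 示意代碼(字段名均為假設,與實際的序列化格式無關)概括了消息標頭的組成,以及如何按上式從 $\Max(D_m)$ 中各消息的調整向量時間戳重構 $\VT(m)$:
\begin{verbatim}
# Sketch only: hypothetical field names, unrelated to the real serialization.
import hashlib
from dataclasses import dataclass
from typing import List

@dataclass
class MsgHeader:
    sender: int              # i
    height: int              # s
    catchain_id: bytes       # hash of the genesis message
    prev_hash: bytes         # Sha(m_{i,s-1}), or catchain_id if s == 1
    dep_hashes: List[bytes]  # { Sha(m') : m' in Max(D_m) }

def msg_id(serialized_msg: bytes) -> bytes:
    # the message identifier is the Sha hash of the whole serialized message
    return hashlib.sha256(serialized_msg).digest()

def reconstruct_vt(vt_plus_of_max_deps):
    # VT(m) = componentwise sup of VT+(m') over m' in Max(D_m)
    return [max(col) for col in zip(*vt_plus_of_max_deps)]
\end{verbatim}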
\nxsubpoint\emb{Message signatures}
除此之外,在 Catchain 中的每個消息都由其創建者簽名。由於在 Catchain 中參與的節點(進程)列表事先已知,且該列表包含所有進程的公鑰,因此接收進程可以在接收到消息後立即檢查這些消息的簽名。如果簽名無效,則消息將被丟棄而不進行任何進一步的處理。
\nxsubpoint\emb{Message encryption}在從私有覆蓋網絡中的一個節點傳輸到其鄰居之前,Catchain 中的所有消息都會被加密。然而,這種加密是由低層網絡協議(如 ADNL)執行的,與本文討論的內容無關。我們想提到的是,這裡正確的加密之所以可能是因為參與進程的列表不僅包含所有進程的公鑰,還包括它們的 ADNL 地址(對於網絡傳輸,這些地址實際上是公共加密金鑰)。
需要注意的是,即使沒有加密,這也不會違反協議的 BFT 屬性,因為由於簽名,偽造來自其他發送者的消息是不可能的。然而,這可能會導致對外觀察者的信息泄露,這通常是不可取的。
\nxsubpoint\emb{Alternative perspective: a catchain as a multi-blockchain}需要注意的是,在 Catchain 中由同一發送者 $i$ 創建的所有消息都具有一個簡單的“區塊鏈結構”,因為 $m_{i,s+1}$ 的標頭包含發送者 $i$ 的前一條消息 $m_{i,s}$ 的哈希 $\Sha(m_{i,s})$(以及來自 $\Max(D_{m_{i,s+1}})$ 中其他消息的哈希)。這樣,每個進程 $i$ 生成了一個由其消息組成的簡單區塊鏈,其中此區塊鏈的每個“區塊”對應一條消息,並通過其哈希參考前一個區塊,有時還在其區塊中提及其他進程的區塊(即消息)的哈希作為參考。每個區塊都由其創建者簽名。所得到的結構非常類似於 \cite[5]{TON} 中考慮的“異步支付通道”的結構,但參與者數量為 $N$ 而不是 2。
\nxpoint\emb{Message propagation in a catchain}\label{sp:cc.msg.prop}
現在我們可以描述 Catchain 中的消息傳播。具體而言:
\begin{itemize}
\item 在 Catchain 下面的(低層次)覆蓋網路協議維護了一個鄰居列表,並為每個鄰居提供 ADNL 通道。這個私有覆蓋網路具有與 Catchain 相同的成員(進程、節點)列表,每個節點的鄰居形成了所有參與節點的(定向)子圖。這個(基本上是隨機的)子圖的強聯通概率非常接近於1。
\item 每個進程不時會生成一些新的消息(根據高層協議的需要)。這些消息按照~\ptref{sp:msg.hdr}中所述的方法增強了 Catchain 消息標頭,簽名後通過覆蓋協議建立的 ADNL 通道傳播到所有已知的鄰居。
\item 與通常的簡單覆蓋廣播協議不同,從鄰居那裡接收到的消息不會立即轉發給那些尚未知道它們的副本的所有其他鄰居。相反,首先檢查簽名,並且無效的消息被丟棄。然後,如果它所有的相依消息已經被傳遞,消息就被傳送;否則,消息就被放入等待隊列中。在後一種情況下,標頭中提到的所有所需消息(即集合 $\Max(D_m)$)都從發送此消息的鄰居拉取(此外,不時從隨機的鄰居嘗試下載這些缺失的消息)。如果必要,這個過程會遞歸地重複進行,直到一些消息可以被傳送。一旦消息準備好進行本地傳送(即它的所有相依關係都已經存在),它也會被重新發送到覆蓋網路中的所有鄰居。
\item 除了上述的遞歸“拉”機制外,還使用了一種更快的基於向量時間戳的機制,以便可以通過消息的發件人和高度(從接收到的消息的向量時間戳中學習)從鄰居查詢消息。即,每個進程不時向一個隨機選擇的鄰居發送一個包含當前向量時間戳的特殊查詢。這個點對點查詢導致其接收者回傳對發送者未知的所有或一些消息(根據它們的向量時間戳判斷)。
\item 一旦從某些發件人發出的消息被檢測到出現“分叉”(例如,在快速或慢速的“拉”過程中,從鄰居那裡學習到了具有相同發件人 $i$ 和高度 $s$,但哈希值不同的第二個消息),更快的基於向量時間戳的機制就會被禁用。一旦檢測到 $i$ 創建的分叉,所有後續向量時間戳的相應分量 $\VT_i$ 就會被設置為一個特殊值 $\infty$,以表示不再有意義比較這些分量的值。
\item 當一條消息被傳送(到高層協議)時,這條消息被添加到當前進程處理過的消息錐 $C$ 中(並相應地更新當前向量時間戳),並且假定當前進程生成的所有後續消息都依賴於到目前為止傳送的所有消息(即使從高層協議的角度來看這並不是邏輯上必要的)。
\item 如果已處理消息錐 $C$ 的極大元素集合 $\Max(C)$ 變得太大(包含的元素比 Catchain 的創世消息事先固定的某個數量還多),那麼 Catchain 協議會要求高層協議生成一條新消息(如果沒有可用的有用有效負載,則為空)。在生成此新消息(並立即將其傳遞給當前進程)後,$C$ 被更新,$\Max(C)$ 只包含一個元素(即新消息)。通過這種方式,$\Max(C)$ 的大小以及消息標頭的大小始終保持有限。
\item 一旦消息 ~$m$ 被傳送並且集合 $C$ 被修改以包含此消息,就會設置一個定時器,在一些小延遲後,會要求高層協議創建一條新消息(如果必要,則為空消息),以便這條新消息 $m^*$ 會參照新的 $C$,類似於上一項中描述的過程。這個新消息 $m^*$ 會被推送給所有鄰居;由於它的標頭包含了新 $C$ 的 $\Max(C)$ 和 $m\in C$,因此鄰居不僅會了解到新生成的消息 $m^*$,還會了解到原始接收的消息 $m$。如果某些鄰居還沒有 $m$ 的副本,它們將需要一個(從當前進程或其他進程)。
\item 在 Catchain 中接收和創建的所有(廣播)消息都被存儲在一個特殊的本地數據庫中。這對於新創建的消息尤其重要(參見~\ptref{sp:new.msg.flush}):如果一個消息被創建並發送給鄰居,但在創建過程崩潰並重新啟動之前沒有保存到數據庫(並刷新到磁盤),那麼在重新啟動後可以創建具有相同發件人和高度的另一個消息,進而導致非自願的“分叉”。
\end{itemize}
\nxpoint\emb{Forks and their prevention}
可以看出,上面概述的 Catchain 的多區塊鏈結構(帶有對其他區塊的哈希引用和簽名)對於在建立在 Catchain 上的共識協議(即使用 Catchain 作為在進程組內廣播消息的手段)中的“作弊”幾乎沒有可能性。唯一無法立即檢測到的可能性是創建兩個(或更多)不同版本的同一消息 $m_{i,s}$(比如,$m'_{i,s}$ 和 $m''_{i,s}$),並將這個消息的一個版本 $m'_{i,s}$ 發送給某些對等節點,另一個版本 $m''_{i,s}$ 發送給其他對等節點。如果 $s$ 是最小的(對於固定的 $i$),那麼這對應於區塊鏈術語中的一個{\em 分叉}:同一個前一個區塊 $m_{i,s-1}$ 的兩個不同的下一個區塊 $m'_{i,s}$ 和 $m''_{i,s}$。
Therefore, the Catchain protocol takes care to detect forks as soon as possible and prevent their propagation.
\nxsubpoint\emb{Detection of forks}
檢測分叉很簡單:如果存在兩個不同的區塊 $m'_{i,s}$ 和 $m''_{i,s}$,它們的創建者 $i\in I$ 和高度 $s\geq1$ 相同,並且具有有效的 $i$ 的簽名,那麼這就是一個分叉。
\nxsubpoint\emb{Fork proofs}\label{sp:fork.proofs}
Catchain 協議中的區塊簽名是以這樣的方式創建的,即創建{\em 分叉證明}(即證明進程 $i$ 有意創建了一個分叉)特別簡單,因為它是一個非常小的結構(包含一個魔數、$i$ 和 $s$ 的值,以及其餘消息的哈希值)的哈希值實際上被簽名。因此,分叉證明只需要兩個這樣的小結構和兩個簽名即可。
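下面用一段示意性的 Python 代碼(結構與字段名為假設)說明分叉證明的基本形態:兩個由同一發送者簽名、具有相同 $(i,s)$ 但哈希不同的小結構即構成分叉證明:
\begin{verbatim}
# Sketch: a fork proof is a pair of small signed structures with the same
# (sender, height) but different message hashes.  Field names are assumptions.
from dataclasses import dataclass

@dataclass
class SignedBrief:
    magic: int        # magic number of the signed structure
    sender: int       # i
    height: int       # s
    msg_hash: bytes   # hash of the rest of the message
    signature: bytes  # sender's Ed25519 signature over the fields above

def is_fork_proof(a: SignedBrief, b: SignedBrief) -> bool:
    # the two signatures must additionally be verified against i's public key
    return (a.sender == b.sender and a.height == b.height
            and a.msg_hash != b.msg_hash)
\end{verbatim}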
\nxsubpoint\emb{External punishment for creating forks}
需要注意的是,在 PoS 區塊鏈生成的背景下,可以使用對創建 Catchain 分叉的外部懲罰。即,分叉證明可以提交給一個特殊的智能合約(例如 TON 區塊鏈的選舉人智能合約),自動檢查,並且可能會沒收冒犯方的一部分或全部權益。
\nxsubpoint\emb{Internal processing of forks}
一旦檢測到一個分叉(由 $i$ 創建),即 $j$ 學到了 $i$ 創建的兩個不同消息 $m_{i,s}$ 和 $m'_{i,s}$,並且它們具有相同的高度 $s$(通常在遞歸下載某些其他消息的相依關係時發生),$j$ 就開始忽略 $i$ 和它的所有後續消息。它們不會被接受,也不會進一步廣播。然而,在分叉檢測之前由 $i$ 創建的消息,如果在由沒有看到這個分叉的進程創建的消息(區塊)中引用,仍然可以被下載。
\nxsubpoint\emb{Accepting messages from a ``bad'' process is bad}\label{sp:no.bad.accept}
此外,如果進程 $i$ 得知進程 $j$ 創建了一個分叉,則 $i$ 會通過創建一個包含相應分叉證明的新服務廣播消息(參見~\ptref{sp:fork.proofs})向其鄰居顯示這一信息。此後,$i$ 的所有後續消息都不能直接依賴於已知的“壞”的生產者 $j$ 的任何消息(但如果 $k$ 在創建引用消息時不知道 $j$ 的分叉,則它們仍然可以參考來自另一方 $k$ 的消息,該消息直接或間接地引用了 $j$ 的消息)。如果 $i$ 違反了此限制並創建了具有此類無效引用的消息,那麼這些消息將被該組中的所有誠實進程丟棄。
\nxsubpoint\emb{The set of ``bad'' group members is a part of the intrinsic state}\label{sp:bad.proc.set}
每個進程 $i$ 都保留了其在組中所知的“壞”進程集合的副本,即那些創建了至少一個分叉或違反了~\ptref{sp:no.bad.accept} 的進程。當 $i$ 得知由 $j$ 創建的分叉(或 $j$ 違反了~\ptref{sp:no.bad.accept})時,就會將 $j$ 加入到這個集合中;之後,會調用高級協議提供的回調函數。當一個新的廣播消息到達時,就會使用這個集合:如果發送者是壞的,則消息會被忽略並丟棄。
\clearpage
\mysection{Block Consensus Protocol}\label{sect:blk.consensus}
本節我們將解釋 TON 區塊共識協議(參見~\ptref{sect:overview})的基本運作方式,該協議基於通用的 Catchain 協議(參見~\ptref{sect:catchain})構建,提供了用於生成和驗證 TON 區塊鏈新區塊的 BFT 協議。TON 區塊共識協議的源代碼位於源代碼樹的 {\tt validator-session} 子目錄中。
\nxpoint\emb{Internal state of the Block Consensus Protocol}\label{p:cc.state}
高級協議的區塊共識協議引入了一個新的概念到 Catchain 中:區塊共識協議(BCP)的{\em 內部狀態},有時也稱為“Catchain 的內部狀態”或簡稱為{\em Catchain 狀態}。即,對於每個進程 $i\in I$,在 Catchain 協議將一個消息子集(實際上總是一個相依錐) $C_i$ 傳遞到高級協議(即在本例中傳遞到區塊共識協議)之後,進程 $i\in I$ 會有一個確定的內部狀態 $\sigma_{C_i}$。此外,這個狀態 $\sigma_{C_i}=\sigma(C_i)$ 只依賴於錐 $C_i$,而不依賴於進程 $i\in I$ 的身份,並且可以對任何相依錐 $S$(不一定是某個進程 $i$ 在某個時間點上交付的一個錐 $C_i$)進行定義。
\nxsubpoint\emb{Abstract structure of the internal state}
我們從區塊共識協議使用的內部狀態的抽象結構開始,更具體的細節將在後面提供。
\nxsubpoint\emb{Updating the internal state}
Catchain 協議對內部狀態一無所知;它只是在傳遞一條消息 $m$ 時調用高級協議(即 BCP)提供的適當回調函數。由高級協議來計算新的狀態 $\sigma_{S'}$,從先前計算的狀態 $\sigma_S$ 和消息 $m$ 開始計算,其中 $S'=S\cup\{m\}$(並且必然有 $S\supset D_m$,否則 $m$ 在此時不能被傳遞)。
\nxsubpoint\emb{Recursive formula for updating the internal state}\label{sp:abs.state.upd}
計算所有相依錐 $S$ 的 $\sigma_S$ 的抽象設置包括三個組件:
\begin{itemize}
\item 初始狀態的值 $\sigma_\emptyset$(實際上,此值取決於 catchain 的創世區塊;我們在這裡忽略了此依賴關係,因為我們此時只考慮一個 catchain)。
\item 一個函數 $f$,從先前狀態 $\sigma_{D_m}$ 和新傳遞的消息 $m$ 計算狀態 $\sigma_{D^+_m}$:
\begin{equation}\label{eq:state.rec}
\sigma_{D^+_m}=f(\sigma_{D_m},m)
\end{equation}
其中 $D_m$ 是消息 $m$ 的相依錐,$D^+_m=D_m\cup\{m\}$ 是擴展相依錐(參見 \ptref{sp:ext.dep.cone})。在大多數情況下,$f$ 實際上滿足更強的條件:
\begin{equation}\label{eq:state.rec.x}
\sigma_{S\cup\{m\}}=f(\sigma_S,m)\quad\text{if $S$ and $S\cup\{m\}$ are cones and $m\not\in S$}
\end{equation}
但這個更強的條件不是更新算法所必需的。
\item “合併函數” $g$ 從 $\sigma_S$ 和 $\sigma_T$ 計算 $\sigma_{S\cup T}$:
\begin{equation}\label{eq:state.merge}
\sigma_{S\cup T}=g(\sigma_S,\sigma_T)\quad\text{對於任何錐 $S$ 和 $T$}
\end{equation}
(兩個錐的聯集始終是一個錐)。
更新算法只在特定情況下(即 $T=D^+_m$ 且 $m\not\in S$)應用此函數 $g$。
\end{itemize}
\nxsubpoint\emb{Commutativity and associativity of $g$}\label{sp:g.assoc}
請注意,對於任意的錐 $S$ 和 $T$,\eqref{eq:state.merge} 意味著 $g$ 的結合律和交換律,至少在 $g$ 應用於可能的狀態(某個錐 $S$ 的形式為 $\sigma_S$ 的值)時是成立的。在這方面,$g$ 在集合 $\Sigma=\{\sigma_S\,:\,S\text{ 是一個錐}\}$ 上定義了交換幺半群結構。通常情況下,$g$ 在一個更大的狀態值集合 $\tilde\Sigma$ 上被定義或部分定義,並且在這個更大的集合 $\tilde\Sigma$ 上可能是可交換的和結合的,即對於 $x$、$y$、$z\in\tilde\Sigma$(只要等式的兩側都被定義了),$g(x,y)=g(y,x)$ 並且 $g(x,g(y,z))=g(g(x,y),z)$,其中 $\sigma_\emptyset$ 是一個單位元,即對於 $x\in\tilde\Sigma$(在相同的條件下),$g(x,\sigma_\emptyset)=x=g(\sigma_\emptyset,x)$。然而,這種性質對於共識算法的形式分析很有用,但對於狀態更新算法來說不是嚴格要求的,因為該算法使用 $g$ 以一種確定性的方式計算 $\sigma_S$。
\nxsubpoint\emb{Commutativity of $f$}
需要注意的是,如果函数 $f$ 滿足更強的條件 \eqref{eq:state.rec.x},則必須具有可交換性質
\begin{equation}\label{eq:step.upd.comm}
f\bigl(f(\sigma_S,m),m'\bigr)=f\bigl(f(\sigma_S,m'),m\bigr)
\end{equation}
當 $S$ 是錐體且 $m$ 和 $m'$ 是兩個消息,滿足 $D_m\subset S$、$D_{m'}\subset S$、$m\not\in S$ 且 $m'\not\in S$ 時。因為在這種情況下,$S\cup\{m\}$、$S\cup\{m'\}$ 和 $S\cup\{m,m'\}$ 也是錐體,且 \eqref{eq:state.rec.x} 意味著 \eqref{eq:step.upd.comm} 兩側均等於 $\sigma_{S\cup\{m,m'\}}$。類似於 \ptref{sp:g.assoc},$f$ 通常在更大的狀態類似值 $\tilde\Sigma$ 和消息類似值集合的乘積上被定義或部分定義;在更大的集合上,它可能展現出“可交換性”特徵 \eqref{eq:step.upd.comm},也可能不展現。如果它有這個特徵,這可能對於依賴於 $\sigma_S$ 的算法的形式分析是有用的,但這個特性不是嚴格必要的。
\nxsubpoint\emb{The state update algorithm}
Catchain(實際上是高級 BCP)使用 $\sigma_\emptyset$、$f$ 和 $g$ 來執行狀態更新算法(由每個進程 $i$ 獨立執行),其運作方式如下:
\begin{itemize}
\item 算法跟踪到目前為止已經傳遞的所有消息 $m$ 的 $\sigma_{D^+_m}$。
\item 算法跟踪 $\sigma_{C_i}$,其中 $C_i$ 是當前依賴錐體,即所有傳遞給當前進程 $i$ 的消息 $m$ 的集合。$\sigma_{C_i}$ 的初始值為 $\sigma_\emptyset$。
\item 當一個新的消息 $m$ 被傳遞時,由於 $D_m=\bigcup_{m'\in D_m}D^+_{m'}=\bigcup_{m'\in\Max(D_m)}D^+_{m'}$,所以可以通過重複應用 $g$ 來計算 $\sigma_{D_m}$。因此,如果 $\Max(D_m)=\{m'_1,\ldots,m'_k\}$,則
\begin{equation}\label{eq:merge.many}
\sigma_{D_m}=g\Bigl(\ldots g\bigl(g(\sigma_{D^+_{m'_1}},\sigma_{D^+_{m'_2}}),\sigma_{D^+_{m'_3}}\bigr),\ldots,\sigma_{D^+_{m'_k}}\Bigr)\quad.
\end{equation}
消息 $m$ 的標頭按某個固定的順序 $m'_1$,\dots,$m'_k$ 明確地列出了集合 $\Max(D_m)$,上述公式正是按此順序計算的(因此 $\sigma_{D_m}$ 的計算是確定性的)。此列表中的第一個元素始終是消息發送方的前一個消息,即如果 $m=m_{i,s+1}$,則 $m'_1=m_{i,s}$。
\item 然後,通過應用 $f$,計算 $\sigma_{D^+_m}$ 的值:$\sigma_{D^+_m}=f(\sigma_{D_m},m)$。此值被記錄下來以供將來使用。
\item 最後,當一個新的消息 $m$ 被傳遞給當前進程 $i$,從而將 $C_i$ 更新為 $C'_i:=C_i\cup\{m\}$ 時,算法使用計算出的值 $\sigma_{D^+_m}$ 來更新當前狀態
\begin{equation}
\sigma_{C'_i}=g(\sigma_{C_i},\sigma_{D^+_m})
\end{equation}
然而,這個狀態是“虛擬的”,因為稍後可能會稍微更改它(尤其是如果 $g$ 不是可交換的)。儘管如此,高級算法(BCP)仍然使用它來做出一些重要的決策。
\item 當一個新的消息 $m$ 被生成並在本地傳遞,從而 $C_i$ 變為 $D^+_m$,則先前計算出的 $\sigma_{C_i}$ 的值被丟棄並替換為根據上述通用算法計算出的 $\sigma_{D^+_m}$。如果 $g$ 不是可交換的或不是結合的(例如,可能發生 $g(x,y)$ 和 $g(y,x)$ 是不同但等效的表示同一個狀態的情況),則這可能會導致稍微更改進程 $i$ 的當前“虛擬”狀態。
\item 若低層級(catchain)協議向高層級協議報告某個進程$j\neq i$是「不良的」(即發現$j$已經建立分叉,參見\ptref{sp:bad.proc.set},或者$j$明知其他進程建立分叉而予以支持,參見\ptref{sp:no.bad.accept}),那麼當前(虛擬)狀態$\sigma_{C_i}$會使用新的集合$C'_i=\bigcup_{\text{$m\in C_i$,$m$ 由「良好」進程建立}}D^+_m$重新計算,即將「合併」函數$g$應用於良好進程最後消息集合(或該集合的極大元素集合)中各消息的$\sigma_{D^+_m}$。下一個創建的出站消息僅會取決於$C'_i$中的消息。
\end{itemize}
\nxsubpoint\emb{Necessity to know the internal state of the other processes}
公式\eqref{eq:merge.many}意味著進程$i$必須跟踪所有由該進程創建的消息$m$的$\sigma_{D^+_m}$,以及其他進程創建的消息的$\sigma_{D^+_m}$。然而,由於這些內部狀態也是通過更新算法的適當應用來計算的,因此這是可能的。因此,BCP也會計算和記憶所有的$\sigma_{D^+_m}$。
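結合 \ptref{sp:abs.state.upd} 的抽象設置,上述狀態更新算法可以用下面這段簡化的 Python 示意代碼概括;其中 $f$、$g$ 以回調形式傳入,消息佈局(msg.msg\_id、msg.max\_deps)均為假設,並非實際的 BCP 實現:
\begin{verbatim}
# Sketch of the state update algorithm.  The functions f and g, the message
# layout (msg.msg_id, msg.max_deps) and sigma_empty are assumed interfaces,
# not the actual BCP implementation.
class StateTracker:
    def __init__(self, sigma_empty, f, g):
        self.sigma_empty = sigma_empty
        self.f, self.g = f, g
        self.sigma_plus = {}            # msg_id -> sigma_{D+_m}
        self.sigma_cur = sigma_empty    # sigma_{C_i}: current (virtual) state

    def on_deliver(self, msg):
        # msg.max_deps lists Max(D_m) in the fixed order of the header,
        # the first entry being the previous message of the same sender.
        deps = [self.sigma_plus[h] for h in msg.max_deps]
        if deps:
            sigma_dm = deps[0]
            for s in deps[1:]:
                sigma_dm = self.g(sigma_dm, s)   # repeated merge for sigma_{D_m}
        else:
            sigma_dm = self.sigma_empty          # D_m is empty (first messages)
        sigma_dpm = self.f(sigma_dm, msg)        # sigma_{D+_m} = f(sigma_{D_m}, m)
        self.sigma_plus[msg.msg_id] = sigma_dpm
        self.sigma_cur = self.g(self.sigma_cur, sigma_dpm)   # update sigma_{C_i}
        return sigma_dpm

    def on_create_own(self, msg):
        # after locally delivering our own new message, C_i becomes D+_m
        self.sigma_cur = self.on_deliver(msg)
\end{verbatim}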
\nxsubpoint\emb{Function $f$ would suffice}
請注意,更新算法只在$S$是包含$D_m$但不包含$m$的錐時才應用$g$來計算$\sigma_{S\cup D^+_m}=g(\sigma_S,\sigma_{D^+_m})$。因此,每次實際應用$g$都可以被滿足擴展性質\eqref{eq:state.rec.x}的$f$所替代:
\begin{equation}
\sigma_{S\cup D^+_m}=g(\sigma_S,\sigma_{D^+_m})=f(\sigma_S,m)
\end{equation}
但是,更新算法不使用這種「優化」,因為它會禁用下面在\ptref{sp:share.substr}和\ptref{sp:memoize}中描述的更重要的優化。
\nxpoint\emb{The structure of the internal state}
內部狀態的結構被優化,以使得\eqref{eq:state.rec}中的{\em 過渡函數$f$}和\eqref{eq:state.merge}中的{\em 合併函數$g$}能夠盡可能高效地計算,最好不需要可能無限遞迴的過程(僅需要一些迴圈)。這促使將額外的組件納入內部狀態中(即使這些組件可以從內部狀態的其餘部分計算出來),這些組件也必須被存儲和更新。這個包含額外組件的過程類似於使用動態規劃求解問題時所使用的過程,或者類似於通過數學(或結構)歸納來證明陳述的過程。
\nxsubpoint\emb{The internal state is a representation of a value of an abstract algebraic data type}\label{sp:state.node.tree}
內部狀態的內部表示本質上是一棵(有向)樹(或者說是一個有向無環圖)或者一個節點集合;每個節點包含一些立即的(通常是整數)值和指向其他(之前建立的)節點的幾個指針。如果需要,可以在節點開頭添加一個額外的{\em 構造標籤\/}(一個小整數)來區分幾種可能性。這個結構非常類似於函數式編程語言(如 Haskell)中用於表示抽象代數數據類型值的結構。
\nxsubpoint\emb{The internal state is persistent}
內部狀態是{\em 持久的},意味著在catchain活動期間,用於分配內部狀態中的節點的內存永遠不會被釋放。此外,catchain的內部狀態實際上是分配在一個巨大的連續內存緩衝區內,新節點總是在該緩衝區已使用部分的末尾進行分配,並通過移動指針進行。這樣,該緩衝區內一個節點對其他節點的引用可以由該節點與緩衝區開始處的整數偏移量來表示。每個內部狀態都是由一個指向其在此緩衝區內的根節點的指針來表示的;這個指針也可以用從緩衝區開始處的整數偏移量來表示。
\nxsubpoint\emb{The internal state of a catchain is flushed to an append-only file}\label{sp:state.apponly.file}
根據上述用於存儲catchain內部狀態的緩衝區結構,其結果是只有在其末尾添加一些新數據時才會更新。這意味著catchain的內部狀態(或者說包含所有所需內部狀態的緩衝區)可以刷新到僅追加的文件中,在重新啟動後很容易恢復。在重新啟動之前唯一需要存儲的其他數據是catchain當前狀態的偏移量(從緩衝區開始處或該文件的開始處測量)。可以使用簡單的鍵值數據庫來實現這個目的。
\nxsubpoint\label{sp:share.substr}\emb{Sharing data between different states}
結果發現,表示新狀態$\sigma_{S\cup\{m\}}=f(\sigma_S,m)$的樹(或者說是dag)與先前狀態$\sigma_S$共享大的子樹,同樣地,$\sigma_{S\cup T}=g(\sigma_S,\sigma_T)$與$\sigma_S$和$\sigma_T$共享大的子樹。在BCP中用於表示狀態的持久結構使得可以重複使用相同的指針來表示這些共享的數據結構,而不是將它們複製一份。
\nxsubpoint\label{sp:memoize}\emb{Memoizing nodes}
在計算新狀態(即函數$f$的值)時,使用的另一種技術是{\em 記憶化新節點},這也是從功能編程語言中借鑒過來的。具體而言,每當構造一個新節點(在包含特定catchain所有狀態的巨大緩衝區內部),都會計算它的哈希值,並使用簡單的哈希表來查找具有相同哈希值的最新節點。如果找到具有相同內容的該哈希值的節點,則新構造的節點將被丟棄,而返回對具有相同內容的舊節點的引用。另一方面,如果沒有找到新節點的副本,那麼哈希表將被更新,緩衝區(分配)指針將被推進,並將新節點的指針返回給調用者。
這樣,即使不直接通過應用函數$f$如\ptref{sp:share.substr}中所述相關,如果不同的進程最終進行相似的計算並具有相似的狀態,那麼這些狀態的大部分將被共享。
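這種記憶化本質上就是函數式語言中常見的 hash-consing;下面是一段與實際緩衝區佈局無關的極簡 Python 示意:
\begin{verbatim}
# Minimal hash-consing sketch (the real code works inside one big append-only
# buffer; here a Python list plays that role).
import hashlib

class NodeArena:
    def __init__(self):
        self.buffer = []   # append-only arena of nodes
        self.index = {}    # node hash -> offset of an identical node

    def alloc(self, tag, values, child_offsets):
        key = hashlib.sha256(repr((tag, values, child_offsets)).encode()).digest()
        if key in self.index:          # identical node already allocated:
            return self.index[key]     # share it instead of copying
        self.buffer.append((tag, values, child_offsets))
        self.index[key] = len(self.buffer) - 1
        return self.index[key]

a = NodeArena()
leaf = a.alloc("int", (42,), ())
assert leaf == a.alloc("int", (42,), ())   # memoized: same offset returned
\end{verbatim}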
\nxsubpoint\emb{Importance of optimization techniques}
用於在同一個catchain內部共享不同內部狀態的部分的優化技術\ptref{sp:share.substr}和\ptref{sp:memoize}對於改善BCP在大型進程組中的內存占用和性能具有極其重要的作用。在$N\approx100$的進程組中,改進的幅度可以達到數個數量級。如果沒有這些優化,BCP將無法適用於其預定的目的(在TON區塊鏈中由驗證器生成新塊的BFT共識)。
\nxsubpoint\emb{Message $m$ contains a hash of state $\sigma_{D^+_m}$}
每個消息$m$都包含相應狀態$\sigma_{D^+_m}$的(Merkle)哈希值。粗略地說,使用\ptref{sp:state.node.tree}中的節點樹表示遞歸地計算此哈希值:節點內的所有節點引用都被替換為所引用節點的(遞歸計算的)哈希值,然後計算結果字節序列的簡單64位哈希值。這個哈希值也用於如\ptref{sp:memoize}所述的記憶化。
消息中此字段的目的是對不同進程(以及可能是狀態更新算法的不同實現)對$\sigma_{D^+_m}$的計算進行健全性檢查:一旦為新交付的消息$m$計算了$\sigma_{D^+_m}$,則比較計算出的$\sigma_{D^+_m}$的哈希值與$m$的標頭中存儲的值是否相等。如果這些值不相等,則在錯誤日誌中輸出錯誤消息(軟件不再執行其他操作)。可以檢查這些錯誤日誌以檢測BCP不同版本之間的錯誤或不兼容性。
\nxpoint\emb{State recovery after restart or crashes}
Catchain通常由BCP使用幾分鐘;在此期間,運行Catchain協議的程序(驗證器軟件)可能會被終止並重新啟動,無論是故意(例如,因為計劃中的軟件更新)還是意外地(該程序可能會因此或某些其他子系統中的錯誤而崩潰,然後重新啟動)。處理這種情況的一種方法是忽略所有在上次重新啟動之後未創建的catchain。但是,這將導致一些驗證器在幾分鐘內不參與任何塊的創建(直到創建下一個catchain實例),這是不希望的。因此,在每次重新啟動後,運行catchain狀態恢復協議,以便驗證器可以繼續參與同一個catchain。
\nxsubpoint\emb{Database of all delivered messages}\label{sp:msg.db}
為此,為每個活動的catchain創建一個特殊的數據庫。此數據庫包含所有已知和傳遞的消息,按其標識符(哈希值)進行索引。為此,一個簡單的鍵值數據庫就足夠了。還將存儲在此數據庫中的當前進程$i$生成的最新出站消息$m=m_{i,s}$的哈希值。重新啟動後,所有直到$m$的消息都按正確的順序遞歸性地傳遞(就像如果這些消息剛從網絡以任意順序接收到一樣),並由更高級別的協議進行處理,直到$m$最終被傳遞,從而恢復當前狀態。
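下面幾行 Python 示意代碼(鍵值數據庫接口為假設)概括了上述重啟恢復流程:從本進程最後一條出站消息出發,按依賴順序遞歸重放數據庫中的消息:
\begin{verbatim}
# Sketch of state recovery after a restart; db.get() is an assumed
# key-value lookup returning a stored message by its hash.
def recover(db, deliver, my_last_msg_hash):
    delivered = set()

    def replay(msg_hash):
        if msg_hash in delivered:
            return
        msg = db.get(msg_hash)        # messages are indexed by their hash
        for dep in msg.max_deps:      # deliver all dependencies first
            replay(dep)
        deliver(msg)                  # re-run the higher-level callbacks
        delivered.add(msg_hash)

    replay(my_last_msg_hash)          # ends with the current state restored
\end{verbatim}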
\nxsubpoint\emb{Flushing new messages to disk}\label{sp:new.msg.flush}
如\ptref{sp:cc.msg.prop}所述,新創建的消息存儲在所有已傳遞消息的數據庫中(參見\ptref{sp:msg.db}),並在將新消息發送到所有網絡鄰居之前將數據庫刷新到磁盤上。通過這種方式,我們可以確保如果系統崩潰並重新啟動,消息不會丟失,從而避免意外創建分支。
\nxsubpoint\emb{Avoiding the recomputation of states $\sigma_{D^+_m}$}
實現可以使用包含所有先前計算狀態的追加文件,如\ptref{sp:state.apponly.file}所述,以避免在重新啟動後重新計算所有狀態,以犧牲磁盤空間換取計算能力。然而,當前的實現沒有使用這種優化。
\nxpoint\emb{High-level description of Block Consensus Protocol}\label{p:bcp.descr}
現在我們準備介紹 TON 區塊鏈驗證器使用的區塊共識協議的高級描述,以生成並在新區塊鏈塊上實現共識。基本上,這是一個三階段提交協議,運行在一個 Catchain 上(Catchain 協議的一個實例),它被用作一個在一個進程組中“硬化”的消息廣播系統。
\nxsubpoint\emb{Creation of new catchain messages}
回想一下,低級 Catchain 協議不會自己創建廣播消息(唯一的例外是具有分叉證明的服務廣播,參見\ptref{sp:no.bad.accept})。相反,當需要創建新消息時,通過調用回調來請求高級協議(BCP)進行此操作。此外,新消息的創建可能會受到當前虛擬狀態的變化和定時器警報的觸發。
\nxsubpoint\emb{Payload of catchain messages}\label{sp:payload}
這樣,catchain 消息的有效負載始終由更高層級的協議(例如 BCP)確定。對於 BCP,該有效負載包括以下內容:
\begin{itemize}
\item 當前的 Unix 時間。在同一進程的後續消息中,它必須不斷增加。(如果違反此限制,處理此消息的所有進程將默認地將此 Unix 時間替換為相同發送者的先前消息中看到的最大 Unix 時間。)
\item 若干個(可能為零個){\em BCP 事件},可以是以下任一類型。
\end{itemize}
\nxsubpoint\emb{BCP events}
我們剛剛解釋了 catchain 消息的有效負載包含幾個(可能為零)BCP事件。現在我們列出所有允許的BCP事件類型。
\begin{itemize}
\item $\Submit(\wround,\wcandidate)$ --- suggest a new block candidate
\item $\Approve(\wround,\wcandidate,\wsignature)$ --- a block candidate has passed local validation
\item $\Reject(\wround,\wcandidate)$ --- a block candidate has failed local validation
\item $\CommitSign(\wround,\wcandidate,\wsignature)$ --- a block candidate has been accepted and signed
\item $\Vote(\wround,\wcandidate)$ --- a vote for a block candidate
\item $\VoteFor(\wround,\wcandidate)$ --- this block candidate must be voted for in this round (even if the current process has another opinion)
\item $\PreCommit(\wround,\wcandidate)$ --- a preliminary commitment to a block candidate (used in three-phase commit scheme)
\end{itemize}
\nxsubpoint\emb{Protocol parameters}
Several parameters of BCP must be fixed in advance (in the genesis message of the catchain, where they are initialized from the values of the configuration parameters extracted from the current masterchain state):
\begin{itemize}
\item $K$ --- duration of one attempt (in seconds). It is an integer amount of seconds in the current implementation; however, this is an implementation detail, not a restriction of the protocol
\item $Y$ --- number of {\em fast\/} attempts to accept a candidate
\item $C$ --- block candidates suggested during one round
\item $\Delta_i$ for $1\leq i\leq C$ --- delay before suggesting the block candidate with priority $i$
\item $\Delta_\infty$ --- delay before approving the null candidate
\end{itemize}
Possible values for these parameters are $K=8$, $Y=3$, $C=2$, $\Delta_i=2(i-1)$, $\Delta_\infty=2C$.
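For concreteness, these parameters (with the example values just given) could be bundled into a small configuration object as in the following sketch; the names are illustrative and do not correspond to the actual implementation:
\begin{verbatim}
# Illustrative bundle of the BCP parameters with the example values above.
from dataclasses import dataclass, field
from typing import List

@dataclass
class BCPConfig:
    K: int = 8                    # duration of one attempt, seconds
    Y: int = 3                    # number of fast attempts
    C: int = 2                    # block candidates suggested per round
    deltas: List[int] = field(    # Delta_i = 2(i-1) for 1 <= i <= C
        default_factory=lambda: [2 * (i - 1) for i in range(1, 2 + 1)])
    delta_inf: int = 2 * 2        # Delta_infinity = 2C

cfg = BCPConfig()
assert cfg.deltas == [0, 2] and cfg.delta_inf == 4
\end{verbatim}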
\nxsubpoint\emb{Protocol overview}
The BCP consists of several {\em rounds\/} that are executed inside the same catchain. More than one round may be active at one point of time, because some phases of a round may overlap with other phases of other rounds. Therefore, all BCP events contain an explicit round identifier $\wround$ (a small integer starting from zero). Every round is terminated either by (collectively) accepting a {\em block candidate\/} suggested by one of the participating processes, or by accepting a special {\em null candidate\/}---a dummy value indicating that no real block candidate was accepted, for example because no block candidates were suggested at all. After a round is terminated (from the perspective of a participating process), i.e., once a block candidate collects $\CommitSign$ signatures of more than $2/3$ of all validators, only $\CommitSign$ events may be added to that round; the process automatically starts participating in the next round (with the next identifier) and ignores all BCP events with different values of $\wround$.\footnote{This also means that each process implicitly determines the Unixtime of the start of the next round, and computes all delays, e.g., the block candidate submission delays, starting from this time.}
Each round is subdivided into several {\em attempts}. Each attempt lasts a predetermined time period of $K$ seconds (BCP uses clocks to measure time and time intervals and assumes that clocks of ``good'' processes are more or less in agreement with each other; therefore, BCP is not an asynchronous BFT protocol). Each attempt starts at Unixtime exactly divisible by $K$ and lasts for $K$ seconds. The attempt identifier $\wattempt$ is the Unixtime of its start divided by $K$. Therefore, the attempts are numbered more or less consecutively by 32-bit integers, but not starting from zero. The first $Y$ attempts of a round are {\em fast\/}; the remaining attempts are {\em slow}.
\nxsubpoint\emb{Attempt identification. Fast and slow attempts}
In contrast with rounds, BCP events do not have a parameter to indicate the attempt they belong to. Instead, this attempt is implicitly determined by the Unix time indicated in the payload of the catchain message containing the BCP event (cf.~\ptref{sp:payload}). Furthermore, the attempts are subdivided into {\em fast\/} (the first $Y$ attempts of a round in which a process takes part) and {\em slow\/} (the subsequent attempts of the same round). This subdivision is also implicit: the first BCP event sent by a process in a round belongs to a certain attempt, and $Y$ attempts starting from this one are considered fast by this process.
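The following sketch (an illustration only, not the reference implementation) shows how a process could derive the attempt identifier and the fast/slow status of an attempt from wall-clock time:
\begin{verbatim}
# Attempt identifier and fast/slow status, as described above (sketch only).
def attempt_id(unixtime: int, K: int) -> int:
    # each attempt starts at a Unixtime divisible by K and lasts K seconds
    return unixtime // K

def is_fast(attempt: int, first_attempt_of_round: int, Y: int) -> bool:
    # the first Y attempts of a round (as seen by this process) are fast
    return attempt < first_attempt_of_round + Y
\end{verbatim}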
\nxsubpoint\emb{Block producers and block candidates}
There are $C$ designated block producers (member processes) in each round. The (ordered) list of these block producers is computed by a deterministic algorithm (in the simplest case, processes $i$, $i+1$, \dots, $i+C-1$ are used in the $i$-th round, with the indices taken modulo $N$, the total number of processes in the catchain) and is known to all participants without any extra communication or negotiation. The processes are ordered in this list by decreasing priority, so the first member of the list has the highest priority (i.e., if it suggests a block candidate in time, this block candidate has a very high chance to be accepted by the protocol).
The first block producer may suggest a block candidate immediately after the round starts. Other block producers can suggest block candidates only after some delay $\Delta_i$, where $i$ is the index of the producer in the list of designated block producers, with $0=\Delta_1\leq\Delta_2\leq\ldots$. After some predetermined period of time $\Delta_\infty$ elapses from the round start, a special {\em null candidate\/} is assumed automatically suggested (even if there are no explicit BCP events to indicate this). Therefore, at most $C+1$ block candidates (including the null candidate) are suggested in a round.
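A sketch of the simplest producer-selection scheme described above, together with the submission delays, might look as follows (illustrative helper functions, not the actual code):
\begin{verbatim}
# Designated producers of round r and their submission delays (sketch of the
# simplest scheme mentioned above; helper names are illustrative).
def producers(r: int, N: int, C: int):
    # processes r, r+1, ..., r+C-1 modulo N, ordered by decreasing priority
    return [(r + k) % N for k in range(C)]

def submission_delay(priority: int, deltas, delta_inf):
    # priority is 1-based; the implicit null candidate appears after delta_inf
    if 1 <= priority <= len(deltas):
        return deltas[priority - 1]
    return delta_inf
\end{verbatim}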
\nxsubpoint\emb{Suggesting a block candidate}
A block candidate for the TON Block\-chain consists of two large ``files'' --- the block and the collated data, along with a small header containing the description of the block being generated (most importantly, the complete {\em block identifier\/} for the block candidate, containing the workchain and the shard identifier, the block sequence number, its file hash and its root hash) and the $\Sha$ hashes of the two large files. Only a part of this small header (including the hashes of the two files and other important data) is used as $\wcandidate$ in BCP events such as $\Submit$ or $\CommitSign$ to refer to a specific block candidate. The bulk of the data (most importantly, the two large files) is propagated in the overlay network associated with the catchain by the streaming broadcast protocol implemented over ADNL for this purpose (cf.~\cite[5]{TON}). This bulk data propagation mechanism is unimportant for the validity of the consensus protocol (the only important point is that the hashes of the large files are part of BCP events and hence of the catchain messages, where they are signed by the sender, and these hashes are checked after the large files are received by any participating nodes; therefore, nobody can replace or corrupt these files). A $\Submit(\wround,\wcandidate)$ BCP event is created in the catchain by the block producer in parallel with the propagation of the block candidate, indicating the submission of this specific block candidate by this block producer.
\nxsubpoint\emb{Processing block candidates}
Once a process observes a $\Submit$ BCP event in a delivered catchain message, it checks the validity of this event (for instance, its originating process must be in the list of designated producers, and current Unixtime must be at least the start of the round plus the minimum delay $\Delta_i$, where $i$ is the index of this producer in the list of designated producers), and if it is valid, remembers it in the current catchain state (cf.~\ptref{p:cc.state}). After that, when a streaming broadcast containing the files associated with this block candidate (with correct hash values) is received (or immediately, if these files are already present), the process invokes a validator instance to validate the new block candidate (even if this block candidate was suggested by this process itself!). Depending on the result of this validation, either an $\Approve(\wround,\wcandidate,\wsignature)$ or a $\Reject(\wround,\wcandidate)$ BCP event is created (and embedded into a new catchain message). Note that the $\wsignature$ used in $\Approve$ events uses the same private key that will ultimately be used to sign the accepted block, but the signature itself is different from that used in $\CommitSign$ (the hash of a structure with a different magic number is actually signed). Therefore, this interim signature cannot be used to fake the acceptance of this block by this particular validator process to an outside observer.
\nxsubpoint\emb{Overview of one round}
Each round of BCP proceeds as follows:
\begin{itemize}
\item At the beginning of a round, several processes (from the predetermined list of designated producers) submit their block candidates (with certain delays depending on their producer priority) and reflect this fact by means of $\Submit$ events (incorporated into catchain messages).
\item Once a process receives a submitted block candidate (i.e., observes a $\Submit$ event and receives all necessary files by means external to the consensus protocol), it starts the validation of this candidate and eventually creates either an $\Approve$ or a $\Reject$ event for this block candidate.
\item During each {\em fast attempt\/} (i.e., one of the first $Y$ attempts) every process votes either for a block candidate that has collected the votes of more than $2/3$ of all processes, or, if there are no such candidates yet, for the valid (i.e., $\Approve$d by more than $2/3$ of all processes) block candidate with the highest priority. The voting is performed by means of creating $\Vote$ events (embedded into new catchain messages).
\item During each {\em slow attempt\/} (i.e., any attempt except the first $Y$) every process votes either for a candidate that was $\PreCommit$ted before (by the same process), or for a candidate that was suggested by $\VoteFor$.
\item If a block candidate has received votes from more than $2/3$ of all processes during the current attempt, and the current process observes these votes (which are collected in the catchain state), a $\PreCommit$ event is created, indicating that the process will vote only for this candidate in future.
\item If a block candidate collects $\PreCommit$s from more than $2/3$ of all processes inside an attempt, then it is assumed to be accepted (by the group), and each process that observes these $\PreCommit$s creates a $\CommitSign$ event with a valid block signature. These block signatures are registered in the catchain, and are ultimately collected to create a ``block proof'' (containing signatures of more than $2/3$ of the validators for this block). This block proof is the external output of the consensus protocol (along with the block itself, but without its collated data); it is ultimately propagated in the overlay network of all full nodes that have subscribed to new blocks of this shard (or of the masterchain).
\item Once a block candidate collects $\CommitSign$ signatures from more than $2/3$ of all validators, the round is considered finished (at least from the perspective of a process that observes all these signatures). After that, only a $\CommitSign$ can be added to that round by this process, and the process automatically starts participating in the next round (and ignores all events related to other rounds).
\end{itemize}
Note that the above protocol may lead to a validator signing (in a $\CommitSign$ event) a block candidate that was $\Reject$ed by the same validator before (this is a kind of ``submitting to the will of majority'').
\nxsubpoint\emb{$\Vote$ and $\PreCommit$ messages are created deterministically}\label{sp:force.vote}
Note that each process can create at most one $\Vote$ and at most one $\PreCommit$ event in each attempt. Furthermore, these events are completely determined by the state $\sigma_{D_m}$ of the sender of catchain message~$m$ containing such an event. Therefore, the receiver can detect invalid $\Vote$ or $\PreCommit$ events and ignore them (thus mitigating byzantine behavior of other participants). On the other hand, a message $m$ that should contain a $\Vote$ or a $\PreCommit$ event according to the corresponding state $\sigma_{D_m}$ but does not contain one can be received. In this case, the current implementation automatically creates missing events and proceeds as if $m$ had contained them from the very beginning. However, such instances of byzantine behavior are either corrected or ignored (and a message is output into the error log), but the offending processes are not otherwise punished (because this would require very large misbehavior proofs for outside observers that do not have access to the internal state of the catchain).
\nxsubpoint\emb{Multiple $\Vote$s and $\PreCommit$s of the same process}\label{sp:vote.fork}
Note that a process usually ignores subsequent $\Vote$s and $\PreCommit$s generated by the same originating process inside the same attempt, so normally a process can vote for at most one block candidate. However, it may happen that a ``good'' process indirectly observes a fork created by a byzantine process, with $\Vote$s for different block candidates in different branches of this fork (this can happen if the ``good'' process learns about these two branches from two other ``good'' processes that did not see this fork before). In this case, both $\Vote$s (for different candidates) are taken into account (added into the merged state of the current process). A similar logic applies to $\PreCommit$s.
\nxsubpoint\emb{Approving or rejecting block candidates}
Notice that a block candidate cannot be $\Approve$d or $\Reject$ed before it has been $\Submit$ted (i.e., an $\Approve$ event that was not preceded by a corresponding $\Submit$ event will be ignored), and that a candidate cannot be approved before the minimum time of its submission (the round start time plus the priority-dependent delay $\Delta_i$) is reached, i.e., any ``good'' process will postpone the creation of its $\Approve$ until this time. Furthermore, one cannot $\Approve$ more than one candidate of the same producer in the same round (i.e., even if a process $\Submit$s several candidates, only one of them---presumably the first one---will be $\Approve$d by other ``good'' processes; as usual, this means that subsequent $\Approve$ events will be ignored by ``good'' processes on receipt).
\nxsubpoint\emb{Approving the null block candidate}
The implicit null block candidate is also explicitly approved (by creating an $\Approve$ event) by all (good) processes, once the delay $\Delta_\infty$ from the start of the round expires.
\nxsubpoint\emb{Choosing a block candidate for voting}\label{sp:vote.rules}
Each process chooses one of the available block candidates (including the implicit null candidate) and votes for this candidate (by creating a $\Vote$ event) by applying the following rules (in the order they are presented):
\begin{itemize}
\item If the current process created a $\PreCommit$ event for a candidate during one of the previous attempts, and no other candidate has collected votes from more than $2/3$ of all processes since (i.e., inside one of the subsequent attempts, including the current one so far; we say that the $\PreCommit$ event is still {\em active\/} in this case), then the current process votes for this candidate again.
\item If the current attempt is fast (i.e., one of the first $Y$ attempts of a round from the perspective of the current process), and a candidate has collected votes from more than $2/3$ of all processes during the current or one of the previous attempts, the current process votes for this candidate. In the case of a tie, the candidate from the latest of all such attempts is chosen.
\item If the current attempt is fast, and the previous rules do not apply, then the process votes for the candidate with the highest priority among all {\em eligible candidates}, i.e., candidates that have collected $\Approve$s (observable by the current process) from more than $2/3$ of all processes.
\item If the current attempt is slow, then the process votes only after it receives a valid $\VoteFor$ event in the same attempt. If the first rule is applicable, the process votes according to it (i.e., for the previously $\PreCommit$ed candidate). Otherwise it votes for the block candidate that is mentioned in the $\VoteFor$ event. If there are several such valid events (during the current attempt), the candidate with the smallest hash is selected (this may happen in rare situations related to different $\VoteFor$ events created in different branches of a fork, cf.~\ptref{sp:vote.fork}).
\end{itemize}
The ``null candidate'' is considered to have the least priority. It also requires an explicit $\Approve$ before being voted for (with the exception of the first two rules).
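The voting rules above can be summarized by the following sketch of a candidate-selection function; the state accessors are hypothetical names for the quantities defined in the preceding rules, not the actual validator-session code:
\begin{verbatim}
# Sketch of the candidate-selection rules above; the state accessors are
# hypothetical names for the quantities defined in the text.
def choose_vote(state, attempt):
    # Rule 1: an active PreCommit forces the process to vote for it again.
    if state.active_precommit is not None:
        return state.active_precommit
    if state.is_fast(attempt):
        # Rule 2: a candidate that collected > 2/3 of the votes earlier
        # (ties resolved in favour of the latest such attempt).
        c = state.latest_two_thirds_voted_candidate()
        if c is not None:
            return c
        # Rule 3: the highest-priority eligible (> 2/3 Approved) candidate.
        return state.highest_priority_eligible_candidate()
    # Rule 4 (slow attempt): vote only after a valid VoteFor is received;
    # if several are visible, take the candidate with the smallest hash.
    suggested = state.valid_vote_for_candidates(attempt)
    if not suggested:
        return None                  # do not vote yet
    return min(suggested, key=lambda c: c.hash)
\end{verbatim}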
\nxsubpoint\emb{Creating $\VoteFor$ events during slow attempts}
A $\VoteFor$ event is created at the beginning of a slow attempt by the {\em coordinator\/} --- the process with index $\wattempt\bmod N$ in the ordered list of all processes participating in the catchain (as usual, this means that a $\VoteFor$ created by another process will be ignored by all ``good'' processes). This $\VoteFor$ event refers to one of the block candidates (including the null candidate) that have collected $\Approve$s from more than $2/3$ of all processes, usually randomly chosen among all such candidates. Essentially, this is a suggestion to vote for this block candidate directed to all other processes that do not have an active $\PreCommit$.
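A corresponding sketch of the coordinator's behaviour during a slow attempt (again with hypothetical helper names):
\begin{verbatim}
# Coordinator of a slow attempt and its VoteFor suggestion (sketch).
import random

def coordinator(attempt: int, N: int) -> int:
    return attempt % N               # index in the ordered list of processes

def make_vote_for(me: int, attempt: int, N: int, eligible_candidates):
    if coordinator(attempt, N) != me or not eligible_candidates:
        return None
    # a random candidate among those Approved by more than 2/3 of all processes
    return ("VoteFor", attempt, random.choice(eligible_candidates))
\end{verbatim}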
\nxpoint\emb{Validity of BCP}
Now we present a sketch of the proof of validity of TON Block Consensus Protocol (BCP) described above in~\ptref{p:bcp.descr}, assuming that less than one third of all processes exhibit byzantine (arbitrarily malicious, possibly protocol-violating) behavior, as it is customary for Byzantine Fault Tolerant protocols. During this subsection, we consider only one round of BCP, subdivided into several attempts.
\nxsubpoint\emb{Fundamental assumption}\label{sp:fund.ass}
Let us emphasize once again that we assume that {\em less than one third of all processes are byzantine}. All other processes are assumed to be {\em good}, i.e., they follow the protocol.
\nxsubpoint\emb{Weighted BCP}
The reasoning in this subsection is valid for the {\em weighted variant of BCP} as well. In this variant, each process $i\in I$ is pre-assigned a positive weight $w_i>0$ (fixed in the genesis message of the catchain), and statements about ``more than $2/3$ of all processes'' and ``less than one third of all processes'' are understood as ``more than $2/3$ of all processes by weight'', i.e., ``a subset $J\subset I$ of processes with total weight $\sum_{j\in J}w_j>\frac{2}{3}\sum_{i\in I} w_i$'', and similarly for the second property. In particular, our ``fundamental assumption'' \ptref{sp:fund.ass} is to be understood in the sense that ``the total weight of all byzantine processes is less than one third of the total weight of all processes''.
\nxsubpoint\emb{Useful invariants}
We collect here some useful invariants obeyed by all BCP events during one round of BCP (inside a catchain). These invariants are enforced in two ways. Firstly, any ``good'' (non-byzantine) process will not create events violating these invariants. Secondly, even if a ``bad'' process creates an event violating these invariants, all ``good'' processes will detect this when a catchain message containing this event is delivered to BCP and ignore such events. Some possible issues related to forks (cf.~\ptref{sp:vote.fork}) remain even after these precautions; we indicate how these issues are resolved separately, and ignore them in this list. So:
\begin{itemize}
\item There is at most one $\Submit$ event by each process (inside one round of BCP).
\item There is at most one $\Approve$ or $\Reject$ event by each process related to one candidate (more precisely, even if there are multiple candidates created by the same designated block producer, only one of them can be $\Approve$d by another process).\footnote{In fact, $\Reject$s appear only in this restriction, and do not affect anything else. Therefore, any process can abstain from sending $\Reject$s without violating the protocol, and $\Reject$ events could have been removed from the protocol altogether. Instead, the current implementation of the protocol still generates $\Reject$s, but does not check anything on their receipt and does not remember them in the catchain state. Only a message is output into the error log, and the offending candidate is stored into a special directory for future study, because $\Reject$s usually indicate either the presence of a byzantine adversary, or a bug in the collator (block generation) or validator (block verification) software either on the node that suggested the block or on the node that created the $\Reject$ event.} This is achieved by requiring all ``good'' processes to ignore (i.e., not to create $\Approve$s or $\Reject$s for) all candidates suggested by the same producer but the very first one they have learned about.
\item There is at most one $\Vote$ and at most one $\PreCommit$ event by each process during each attempt.
\item There is at most one $\VoteFor$ event during each (slow) attempt.
\item There is at most one $\CommitSign$ event by each process.
\item During a slow attempt, each process votes either for its previously $\PreCommit$ted candidate, or for the candidate indicated in the $\VoteFor$ event of this attempt.
\end{itemize}
One might somewhat improve the above statements by adding the word ``valid'' where appropriate (e.g., there is at most one {\em valid\/} $\Submit$ event\dots).
\nxsubpoint\emb{More invariants}\label{sp:more.inv}
\begin{itemize}
\item There is at most one eligible candidate (i.e., candidate that has received $\Approve$s from more than $2/3$ of all processes) from each designated producer, and no eligible candidates from other producers.
\item There are at most $C+1$ eligible candidates in total (at most $C$ candidates from $C$ designated producers, plus the null candidate).
\item A candidate may be accepted only if it has collected more than $2/3$ $\PreCommit$s during the same attempt (more precisely, a candidate is accepted only if there are $\PreCommit$ events created by more than $2/3$ of all processes for this candidate and belonging to the same attempt).
\item A candidate may be $\Vote$d for, $\PreCommit$ted, or mentioned in a $\VoteFor$ only if it is an {\em eligible candidate}, meaning that it has previously collected $\Approve$s from more than $2/3$ of all validators (i.e., a valid $\Vote$ event may be created for a candidate only if $\Approve$ events for this candidate have been previously created by more than $2/3$ of all processes and registered in catchain messages observable from the message containing the $\Vote$ event, and similarly for $\PreCommit$ and $\VoteFor$ events).
\end{itemize}
\nxsubpoint\emb{At most one block candidate is accepted}\label{sp:acc.unique}
Now we claim that {\em at most one block candidate can be accepted (in a round of BCP)}. Indeed, a candidate can be accepted only if it collects $\PreCommit$s from more than $2/3$ of all processes inside the same attempt. Therefore, two different candidates cannot achieve this during the same attempt (otherwise more than one third of all validators must have created $\PreCommit$s for two different candidates inside an attempt, thus violating the above invariants; but we have assumed that less than one third of all validators exhibit byzantine behavior). Now suppose that two different candidates $c_1$ and $c_2$ have collected $\PreCommit$s from more than $2/3$ of all processes in two different attempts $a_1$ and $a_2$. We may assume that $a_1<a_2$. According to the first rule of \ptref{sp:vote.rules}, each process that has created a $\PreCommit$ for $c_1$ during attempt $a_1$ must continue voting for $c_1$ in all subsequent attempts $a'>a_1$, or at least cannot vote for any other candidate, unless another candidate $c'$ collects $\Vote$s of more than $2/3$ of all processes during a subsequent attempt (and this invariant is enforced even if some processes attempt not to create these new $\Vote$ events for $c_1$, cf.~\ptref{sp:force.vote}). Therefore, if $c_2\neq c_1$ has collected the necessary amount of $\PreCommit$s during attempt $a_2>a_1$, there is at least one attempt $a'$, $a_1<a'\leq a_2$, such that some $c'\neq c_1$ (not necessarily equal to $c_2$) has collected $\Vote$s of more than $2/3$ of all processes during attempt $a'$. Let us fix the smallest such $a'$, and the corresponding $c'\neq c_1$ that has collected many votes during attempt $a'$. More than $2/3$ of all validators have voted for $c'$ during attempt $a'$, and more than $2/3$ of all validators have $\PreCommit$ted for $c_1$ during attempt $a_1$, and by the minimality of $a'$ there was no attempt $a''$ with $a_1<a''<a'$, such that a candidate distinct from $c_1$ collected more than $2/3$ of all votes during attempt $a''$. Therefore, all validators that $\PreCommit$ted for $c_1$ could vote only for $c_1$ during attempt $a'$, and at the same time we supposed that $c'$ has collected votes from more than $2/3$ of all validators during the same attempt $a'$. This implies that more than $1/3$ of all validators have somehow voted both for $c_1$ and $c'$ during this attempt (or voted for $c'$ while they could have voted only for $c_1$), i.e., more than $1/3$ of all validators have exhibited byzantine behavior. This is impossible by our fundamental assumption~\ptref{sp:fund.ass}.
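The counting step used repeatedly in this argument can be made explicit. Writing $w(J)=\sum_{j\in J}w_j$ for the total weight of a subset $J\subset I$ and $W=w(I)$ (with all $w_j=1$ in the unweighted case), two subsets $A$, $B\subset I$ each carrying more than $2/3$ of the total weight satisfy
\begin{equation*}
w(A\cap B)\geq w(A)+w(B)-w(A\cup B)\geq w(A)+w(B)-W>\tfrac13 W,
\end{equation*}
so by the fundamental assumption~\ptref{sp:fund.ass} the intersection $A\cap B$ must contain at least one ``good'' process, which by construction never performs two conflicting actions such as voting for, or precommitting to, two different candidates inside one attempt.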
\nxsubpoint\emb{At most one block candidate may be $\PreCommit$ted during one attempt}\label{sp:all.precomm.same}
Note that all valid $\PreCommit$ events (if any) created inside the same attempt must refer to the same block candidate, by the same reasoning as in the first part of~\ptref{sp:acc.unique}: since a valid $\PreCommit$ event for a candidate $c$ may be created only after votes from more than $2/3$ of all processes are observed for this candidate inside the same attempt (and invalid $\PreCommit$s are ignored by all good processes), the existence of valid $\PreCommit$ events for different candidates $c_1$ and $c_2$ inside the same attempt would imply that more than one third of all processes have voted both for $c_1$ and $c_2$ inside this attempt, i.e., they have exhibited byzantine behavior. This is impossible in view of our fundamental assumption~\ptref{sp:fund.ass}.
\nxsubpoint\emb{A previous $\PreCommit$ is deactivated by the observation of a newer one}\label{sp:new.precomm.deact}
We claim that {\em whenever a process with an active $\PreCommit$ observes a valid $\PreCommit$ created by any process in a later attempt for a different candidate, its previously active $\PreCommit$ is deactivated}. Recall that we say that a process has an {\em active $\PreCommit$} if it has created a $\PreCommit$ for a certain candidate $c$ during a certain attempt $a$, did not create any $\PreCommit$ during any attempts $a'>a$, and did not observe votes of more than $2/3$ of all validators for any candidate $\neq c$ during any attempts $a'>a$. Any process has at most one active $\PreCommit$, and if it has one, it must vote only for the precommitted candidate.
Now we see that if a process with an active $\PreCommit$ for a candidate $c$ since attempt $a$ observes a valid $\PreCommit$ (usually created by another process) for a candidate $c'\neq c$ during some later attempt $a'>a$, then it must also observe all dependencies of the catchain message that contains the newer $\PreCommit$. These dependencies necessarily include valid $\Vote$s for the same candidate $c'$, created during the same attempt $a'$, by more than $2/3$ of all validators (otherwise the newer $\PreCommit$ would not be valid and would simply be ignored by the observing process). By definition, the observation of all these $\Vote$s deactivates the original $\PreCommit$.
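To make this bookkeeping concrete, here is a minimal sketch (again in Python, with purely illustrative names) of how a process could track its active $\PreCommit$ and apply the deactivation rule; observing a newer valid $\PreCommit$ amounts to feeding its $\Vote$ dependencies into this state.
\begin{verbatim}
class PreCommitState:
    """Tracks the (at most one) active PreCommit of a single process."""

    def __init__(self, total_processes):
        self.n = total_processes
        self.active = None     # (attempt, candidate) of our active PreCommit
        self.votes_seen = {}   # (attempt, candidate) -> set of observed voters

    def on_own_precommit(self, attempt, candidate):
        self.active = (attempt, candidate)

    def on_observed_vote(self, voter, attempt, candidate):
        key = (attempt, candidate)
        self.votes_seen.setdefault(key, set()).add(voter)
        if self.active is None:
            return
        a0, c0 = self.active
        # Deactivation rule: observing Votes of more than 2/3 of all processes
        # for a different candidate in a later attempt deactivates our PreCommit.
        if attempt > a0 and candidate != c0 \
                and 3 * len(self.votes_seen[key]) > 2 * self.n:
            self.active = None

    def vote_in_slow_attempt(self, suggested_candidate):
        # While a PreCommit is active, we may vote only for its candidate;
        # otherwise we follow the coordinator's VoteFor suggestion.
        return self.active[1] if self.active is not None else suggested_candidate
\end{verbatim}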
\nxsubpoint\emb{Assumptions for proving the convergence of the protocol}\label{sp:conv.ass}
Now we are going to prove that the protocol described above {\em converges\/} (i.e., terminates after accepting a block candidate) with probability one under some assumptions, which essentially tell us that there are enough ``good'' processes (i.e., processes that diligently follow the protocol and do not introduce arbitrary delays before sending their new messages), and that these good processes enjoy good network connectivity at least from time to time. More precisely, our assumptions are as follows:
\begin{itemize}
\item There is a subset $I^+\subset I$ consisting of ``good'' processes and containing more than $2/3$ of all processes.
\item All processes from $I^+$ have well-synchronized clocks (differing by at most $\tau$, where $\tau$ is a bound for network latency described below).
\item If there are infinitely many attempts, then infinitely many attempts are ``good'' with respect to network connectivity between processes from~$I^+$, meaning that all messages created by a process from $I^+$ during this attempt or earlier are delivered to any other process from $I^+$ within at most $\tau>0$ seconds after being created with probability at least $q>0$, where $\tau>0$ and $0<q<1$ are some fixed parameters, such that $5\tau<K$, where $K$ is the duration of one attempt.
\item Furthermore, if the protocol runs for infinitely many attempts, then any arithmetic progression of attempts contains infinitely many ``good'' attempts in the sense described above.
\item A process from $I^+$, when it is the coordinator of a slow attempt, creates its $\VoteFor$ only after some fixed or random delay from the start of that attempt, in such a way that this delay belongs to the interval $(\tau,K-3\tau)$ with probability at least $q'$, where $q'>0$ is a fixed parameter.
\item A process from $I^+$, when it is its turn to be the coordinator of a slow attempt, chooses a candidate for $\VoteFor$ uniformly at random among all eligible candidates (i.e., those candidates that have collected $\Approve$s from more than $2/3$ of all validators). Both aspects of this coordinator behavior are illustrated by the sketch after this list.
\end{itemize}
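The last two assumptions can be summarized by the following minimal sketch of the slow-attempt coordinator behavior (Python, illustrative names only); here the delay is drawn uniformly from $(\tau,K-3\tau)$, so the delay assumption holds with $q'=1$.
\begin{verbatim}
import random

def coordinator_votefor(eligible_candidates, tau, K):
    """Sketch of the assumed coordinator behavior in a slow attempt:
    wait a delay drawn from (tau, K - 3*tau), measured from the start of
    the attempt, then suggest (via VoteFor) a candidate chosen uniformly
    at random among the eligible ones, i.e., those that have collected
    Approves from more than 2/3 of all processes."""
    assert 5 * tau < K            # assumption on the attempt duration
    assert eligible_candidates    # e.g., the null candidate, once approved
    delay = random.uniform(tau, K - 3 * tau)
    candidate = random.choice(eligible_candidates)
    return delay, candidate
\end{verbatim}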
\nxsubpoint\emb{The protocol terminates under these assumptions}
Now we claim that {\em (each round of) the BCP protocol as described above terminates with probability one under the assumptions listed in~\ptref{sp:conv.ass}}. The proof proceeds as follows.
\begin{itemize}
\item Let us assume that the protocol does not converge. Then it continues running forever. We are going to ignore the first several attempts, and consider only attempts $a_0$, $a_0+1$, $a_0+2$, \dots\ starting from some $a_0$, to be chosen later.
\item Since all processes from $I^+$ continue participating in the protocol, they will create at least one message not much later than the start of the round (which may be perceived slightly differently by each process). For instance, they will create an $\Approve$ for the null candidate no later than $\Delta_\infty$ seconds from the start of the round. Therefore, they will consider all attempts slow at most $KY$ seconds afterwards. By choosing $a_0$ appropriately, we can assume that all attempts we consider are slow from the perspective of all processes from~$I^+$.
\item After a ``good'' attempt $a\geq a_0$ all processes from $I^+$ will see the $\Approve$s for the null candidate created by all other processes from~$I^+$, and will deem the null candidate eligible henceforth. Since there are infinitely many ``good'' attempts, this will happen sooner or later with probability one. Therefore, we can assume (increasing $a_0$ if necessary) that there is at least one eligible candidate from the perspective of all processes from $I^+$, namely, the null candidate.
\item Furthermore, there will be infinitely many attempts $a\geq a_0$ that are perceived slow by all processes from $I^+$, that have a coordinator from $I^+$, and that are ``good'' (with respect to the network connectivity) as defined in~\ptref{sp:conv.ass}. Let us call such attempts ``very good''.
\item Consider one ``very good'' slow attempt $a$. With probability $q'>0$, its coordinator (which belongs to $I^+$) will wait for $\tau'\in(\tau,K-3\tau)$ seconds before creating its $\VoteFor$ event. Consider the most recent $\PreCommit$ event created by any process from~$I^+$; let us suppose it was created during attempt $a'<a$ for some candidate $c'$. With probability $qq'>0$, the catchain message carrying this $\PreCommit$ will be already delivered to the coordinator at the time of generation of its $\VoteFor$ event. In that case, the catchain message carrying this $\VoteFor$ will depend on this $\PreCommit(c')$ event, and all ``good'' processes that observe this $\VoteFor$ will also observe its dependencies, including this $\PreCommit(c')$. We see that {\em with probability at least $qq'$, all processes from $I^+$ that receive the $\VoteFor$ event during a ``very good'' slow attempt receive also the most recent $\PreCommit$ (if any).}
\item Next, consider any process from $I^+$ that receives this $\VoteFor$, for a randomly chosen eligible candidate $c$, and suppose that there are already some $\PreCommit$s, and that the previous statement holds. Since there are at most $C+1$ eligible candidates (cf.~\ptref{sp:more.inv}), with probability at least $1/(C+1)>0$ we will have $c=c'$, where $c'$ is the most recently $\PreCommit$ted candidate (there is at most one such candidate by~\ptref{sp:all.precomm.same}). In this case, all processes from $I^+$ will vote for $c=c'$ during this attempt immediately after they receive this $\VoteFor$ (which will be delivered to any process $j\in I^+$ less than $K-2\tau$ seconds after the beginning of the attempt with probability $qq'$). Indeed, if a process $j$ from $I^+$ has no active $\PreCommit$, it will vote for the value indicated in the $\VoteFor$, which is $c$. If $j$ has an active $\PreCommit$ that is as recent as possible, i.e., also created during attempt $a'$, then it must be a $\PreCommit$ for the same value $c'=c$ (because we know about at least one valid $\PreCommit$ for $c'$ during attempt $a'$, and all other valid $\PreCommit$s during attempt $a'$ must be for the same $c'$ by~\ptref{sp:all.precomm.same}). Finally, if $j$ has an active $\PreCommit$ from an attempt $<a'$, then this $\PreCommit$ will become inactive once the $\VoteFor$ with all its dependencies (including the newer $\PreCommit(c')$) has been delivered to process~$j$ (cf.~\ptref{sp:new.precomm.deact}), and the process will again vote for the value $c$ indicated in the $\VoteFor$. Therefore, all processes from $I^+$ will vote for the same $c=c'$ during this attempt, less than $K-2\tau$ seconds after the beginning of the attempt (with some probability bounded away from zero).
\item If there are no $\PreCommit$s yet, then the above reasoning simplifies further: all processes from~$I^+$ that receive this $\VoteFor$ will immediately vote for the candidate $c$ suggested by this $\VoteFor$.
\item In both cases, all processes from $I^+$ will create a $\Vote$ for the same candidate $c$ less than $K-2\tau$ seconds from the beginning of the attempt, and this will happen with probability bounded away from zero.
\item Finally, all processes from $I^+$ will receive these $\Vote$s for $c$ from all processes from~$I^+$, again less than $(K-2\tau)+\tau=K-\tau$ seconds after the beginning of this attempt, i.e., still during the same attempt (even after taking into account the imperfect clock synchronization between processes from $I^+$). This means that they will all create a valid $\PreCommit$ for $c$, i.e., the protocol will accept $c$ during this attempt with probability bounded away from zero.
\item Since there are infinitely many ``very good'' attempts, and the probability of successful termination during each such attempt is $\geq p>0$ for some fixed value of $p$, the protocol will terminate successfully with probability one (see the short computation after this list).
\end{itemize}
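The last step can be made quantitative. Assuming, as the argument above implicitly does, that each ``very good'' attempt leads to acceptance with probability at least $p>0$ regardless of the outcomes of the previous attempts, the probability that no candidate is accepted during the first $n$ ``very good'' attempts is at most
\[
(1-p)^n \;\longrightarrow\; 0 \quad\text{as } n\to\infty,
\]
so the probability that the round never terminates is zero.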
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
%
% bibliography
%
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\clearpage
\markbothsame{\textsc{References}}
\begin{thebibliography}{2}
\bibitem{Birman}
{\sc K.~Birman}, {\sl Reliable Distributed Systems: Technologies, Web Services and Applications}, Springer, 2005.
\bibitem{PBFT}
{\sc M.~Castro, B.~Liskov}, {\sl Practical byzantine fault tolerance}, {\it Proceedings of the Third Symposium on Operating Systems Design and Implementation\/} (1999), p.~173--186, available at \url{http://pmg.csail.mit.edu/papers/osdi99.pdf}.
\bibitem{TON}
{\sc N.~Durov}, {\sl Telegram Open Network}, 2017.
\bibitem{TBC}
{\sc N.~Durov}, {\sl Telegram Open Network Blockchain}, 2018.
\bibitem{Byzantine}
{\sc L.~Lamport, R.~Shostak, M.~Pease}, {\sl The byzantine generals problem}, {\it ACM Transactions on Programming Languages and Systems}, {\bf 4/3} (1982), p.~382--401.
\bibitem{HoneyBadger}
{\sc A.~Miller, Yu~Xia, et al.}, {\sl The honey badger of BFT protocols}, Cryptology ePrint Archive, Report 2016/199, \url{https://eprint.iacr.org/2016/199.pdf}, 2016.
\bibitem{DistrSys}
{\sc M.~van Steen, A.~Tanenbaum}, {\sl Distributed Systems, 3rd ed.}, 2017.
\end{thebibliography}
\end{document}