-
Notifications
You must be signed in to change notification settings - Fork 1
/
Copy pathNAT_paper_ACM_short.tex
766 lines (656 loc) · 43.2 KB
/
NAT_paper_ACM_short.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
\documentclass{acm_proc_article-sp}
% http://www.acm.org/sigs/publications/proceedings-templates
% ------
% Fonts and typesetting settings
\usepackage[sc]{mathpazo}
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
% \linespread{1.05} % Palatino needs more space between lines
\usepackage{microtype}
% ------
% Page layout
% \usepackage[hmarginratio=1:1,top=32mm,columnsep=20pt]{geometry}
\usepackage[font=it]{caption}
\usepackage{paralist}
% \usepackage{multicol}
% ------
% Lettrines
% \usepackage{lettrine}
% ------
% Abstract
% \usepackage{abstract}
% \renewcommand{\abstractnamefont}{\normalfont\bfseries}
% \renewcommand{\abstracttextfont}{\normalfont\small\itshape}
% ------
% Clickable URLs (optional)
\usepackage{hyperref}
% ------
% BibTex for bibliography
\usepackage[square,numbers]{natbib}
%\usepackage[square,sort&compress,authoryear]{natbib}
% ------
% Math
% \usepackage{amsthm}
\usepackage{amsmath, mathtools}
\usepackage{fixltx2e}
\usepackage{wrapfig}
% \usepackage{tikz}
\usepackage{float}
\usepackage{multirow}
%\usepackage{appendices}
\usepackage{subcaption}
\usepackage{algorithm}% http://ctan.org/pkg/algorithm
\usepackage{algpseudocode}% http://ctan.org/pkg/algorithmicx
\usepackage[compatibility=false]{caption}% http://ctan.org/pkg/caption
\newtheorem{mydef}{Definition}
\newcommand{\ignore}[1]{}
% \setlength{\parskip}{0cm plus4mm minus3mm}
%%%%%%%%%%%%%%%%%%%%%%%%
\begin{document}
\title{Traversing symmetric NAT with predictable port allocation}
\numberofauthors{1}
\author{
% 1st. author
\alignauthor
Du\v{s}an Klinec, Vashek Maty\'{a}\v{s}\\
\affaddr{Faculty of Informatics, Masaryk University}\\
\affaddr{Brno, Czech Republic}\\
\email{\{xklinec,matyas\}@fi.muni.cz}
% % 2nd. author
% \alignauthor
% G.K.M. Tobin\titlenote{The secretary disavows
% any knowledge of this author's actions.}\\
% \affaddr{Institute for Clarity in Documentation}\\
% \affaddr{P.O. Box 1212}\\
% \affaddr{Dublin, Ohio 43017-6221}\\
% \email{webmaster@marysville-ohio.com}
% % 3rd. author
}
\maketitle
\begin{abstract}
\noindent Network Address Translators often cause trouble for VoIP and other P2P services since central servers are needed for communication.
The presence of such potentially malicious hosts in a communication path is not desired, mainly due to security consequences, poor link quality
and increased cost.
Several solutions exist for traversing NAT, but a symmetric one is still problematic. We propose algorithms using a single source port for
symmetric NAT traversal. Each with different properties and applicability.
% Our paper investigates the NAT traversal problem by stating the problem formally and modeling a network connected to
% NAT as a random process, in order to evaluate algorithms in different settings.
\end{abstract}
\section{Introduction}
Network Address Translators (NATs) have been increasing in popularity in recent years,
mainly as a means of slowing down IPv4 address space depletion. On one hand, NAT offers
the ability to share a public, globally routable IP address among many internal network hosts. On
the other hand, it breaks the basic end-to-end principle of the Internet and thus favors server-client
communication model for clients connected behind NAT.
In many applications, a peer-to-peer (P2P) connection is of crucial importance and the quality of a service depends on
its parameters heavily. Main motivation for this work is Voice-over-IP (VoIP) where direct connection
between communicating parties is necessary. If P2P connection is not possible to establish, relay
servers have to be used (TURN~\citep{rfc5766} protocol is used). The presence
of another node in the communication path can have negative effects on the communication channel quality
such as increased latency, jitter, lower bandwidth. The relay server is a potential single point of failure and
has to scale with increasing demands of the network, etc.
% Thus relay servers require non-negligible resources and become expensive infrastructure components for VoIP and other service
% providers where establishing a direct connection between parties is of high importance.
In this paper, we focused on a symmetric NAT with a predictable port allocation function.
The contributions of this paper are as follows:%
% \begin{compactitem}
% \item
a) we propose new effective algorithms for establishing a direct connection between hosts behind such NATs; %
% \item
b) we demonstrate the implementations of these algorithms; %, together with a simulation based on a theoretical approach from a queueing theory. These implementations we provide in public domain.%
% \item
c) we tested our algorithms using data from a real network; % (of our mobile phone operator).%
% \item
d) last but not least, we provide a formalization of the NAT traversal problem.
% \end{compactitem}
% We propose
% new algorithms for establishing direct connection between hosts behind such NATs that are effective and
% have low system requirements.
\section{Existing techniques}
Many different solutions for traversing non-symmetric NATs exist, e.g., NAT traversal in Internet Key Exchange~\citep{rfc3947} or
Interactive Connectivity Establishment~(ICE)~\citep{rfc5245}. %Terredo\ignore{tunneling}~\citep{rfc4380}.
Some devices with NAT functionality provide special interface enabling traversal, for example NAT Port Mapping Protocol~\citep{rfc6886}. However, in general,
it is not widely deployed and it cannot be relied upon in a general case.
Another solution is ICE\ignore{~\citep{rfc5245}}. It is used as a technique to traverse NAT in most applications nowadays
although it does not work with symmetric NATs. ICE mainly uses STUN~\citep{rfc5389} and TURN\ignore{~\citep{rfc5766}} for its operation.
% \section{Related work}
Interesting algorithms for traversing symmetric NAT were proposed by Y.~Takeda in \citep{takeda}, although does not cover the case with both clients using symmetric NAT
with a port sensitive allocation function. Takeda mentions that sending more packets does not increase a probability of establishing a new connection.
We show that it is actually possible. Wang~\emph{et~al.} \citep{Wang:2006:RSN:1156422.1156550} propose a NAT traversal algorithm for the scenario where two hosts are
behind symmetric NATs, based on changing of a source port, while we explore options with a constant source port.
RFC~5128 \citep{rfc5128} maps current state of the art in NAT traversal, but does not cover our case.
The research by Wei~\emph{et~al.} in \citep{wei} is devoted to symmetric NAT traversal, but their algorithm is using a different from ours.
Disadvantage of their approach comes with the Time To Live~(TTL) parameter modification of the IP~packet. This is usually considered
as a privileged operation in the operating system and super-user privileges are required for this modification.
In our work we study the situation where two parties, $A$ and $B$, try to establish a direct connection, both of them are behind
a symmetric NAT (possibly a cascade) with a port sensitive, predictable allocation function.
%Other scenarios with predictable allocation function are quite easy to solve and covered in previous research.
\section{Network Address Translation}
The notation used for the rest of our paper follows.
% \paragraph{Notation.}
% Symmetric NAT could be defined as follows.
\begin{mydef}
% \begin{align*}
$\mathbb{SC} = \mathbb{IP} \times \mathbb{PORT}$
% \end{align*}%
is a Socket. $\mathbb{IP}$ is the set of all possible IP addresses and $\mathbb{PORT}$
is the set of all possible ports. %, basically $\mathbb{PORT} = \{1,2,\dots,65535\}$.
% \begin{align*}
$\mathbb{SP} = \mathbb{IP} \times \mathbb{PORT} \times \mathbb{IP} \times \mathbb{PORT}$
% \end{align*}%
is a Socket Pair.
\end{mydef}
%
% \begin{mydef}
% \begin{align*}
% \mathbb{AT} \subseteq \mathbb{SP} \times \mathbb{PORT}
% \end{align*} is an allocation table of the NAT. Second component is external port allocated for particular socket pair.
% \end{mydef}
% Allocation table is a set of all socket pairs and corresponding allocated external port.
% \paragraph{NAT and communication.}
\subsection{NAT and communication}
NAT connects two networks, denoted as {\em internal} and {\em external}, connected to an internal and an external NAT interface, respectively.
Hosts have addresses $\text{IP\textsubscript{in}} \in \mathbb{IP}_{in} \subseteq \mathbb{IP}$
and $\text{IP\textsubscript{ex}} \in \mathbb{IP}_{ex} \subseteq \mathbb{IP}$ in the internal and the external network,
respectively, where $\mathbb{IP}_{in} \cap \mathbb{IP}_{ex} = \emptyset$.
Unless it is not stated otherwise the external NAT address\footnote{We assume NAT has only one external address for simplicity.} is denoted as
$\text{IP\textsubscript{nat}} \in \mathbb{IP}_{ex}$.
Packet received on the external interface has the socket pair {(IP\textsubscript{ex},PORT\textsubscript{ex},IP\textsubscript{nat},PORT\textsubscript{nat})}.
Packet received on the internal interface has the socket pair {(IP\textsubscript{in},PORT\textsubscript{in},IP\textsubscript{ex},PORT\textsubscript{ex})}.
% \paragraph{Filtering rule.} Let assume packet with socket pair {(IP\textsubscript{out},PORT\textsubscript{out},IP\textsubscript{ex},PORT\textsubscript{ex})}
% received on NAT external interface (we'll use this notation in further text). Packet is allowed to pass to internal network iff
% \begin{align*}
% & \exists \; \text{IP\textsubscript{in}} \in \mathbb{IP}, \; \text{PORT\textsubscript{in}} \in \mathbb{PORT}:\\
% & ((\text{IP\textsubscript{in}}, \text{PORT\textsubscript{in}}, \text{IP\textsubscript{out}}, \text{PORT\textsubscript{out}}), \text{PORT\textsubscript{ex}} ) \in \mathbb{AT}
% \end{align*}
%
% If packet is allowed to pass, it is then forwarded to host {(IP\textsubscript{in}, PORT\textsubscript{in})}. A new record
% to the allocation table is added when internal host sends packet to the external host.
%
% \paragraph{Allocation rule.} ToDo
\subsection{NAT classes}\label{sec:nat}
NAT is classified with respect to a \emph{filtering rule} and a \emph{allocation rule}, which state what happens after receiving
packet from an external and an internal network, respectively.
Common NAT categorization is according to~\citep{rfc3489}: a) Full Cone;
b) IP Restricted Cone; c) Port Restricted Cone; d) Symmetric.
% MOVED TO APPENDIX
% \paragraph{Full cone NAT.} ~\\
% Allocation table $\mathbb{AT} \subseteq \{ \mathbb{IP}_{in} \times \mathbb{PORT} \} \times \mathbb{PORT}$ holds associations between internal
% hosts socket and mapped external NAT port.
% % Rule is created once the connection from the internal host is initiated to an arbitrary external address.
% Filtering rule is:
% \begin{align*}
% & PASS((\text{IP\textsubscript{ex}}, \text{PORT\textsubscript{ex}}, \text{IP\textsubscript{nat}}, \text{PORT\textsubscript{nat}} )) \Leftrightarrow \\
% & \exists \; \text{IP\textsubscript{in}} \in \mathbb{IP}_{in}, \; \text{PORT\textsubscript{in}} \in \mathbb{PORT}:\\
% & ((\text{IP\textsubscript{in}}, \text{PORT\textsubscript{in}}), \text{PORT\textsubscript{nat}} ) \in \mathbb{AT}
% \end{align*}
%
% \paragraph{Address restricted cone NAT.} ~\\
% Allocation table $\mathbb{AT} \subseteq \{\mathbb{IP}_{in} \times \mathbb{PORT} \times \mathbb{IP}_{ex}\} \times \mathbb{PORT}$
% holds associations between internal hosts socket, external host IP address and mapped external NAT port.
% % Rule is created once the connection from the internal host is initiated to an arbitrary external address.
% Filtering rule is:
% \begin{align*}
% & PASS((\text{IP\textsubscript{ex}}, \text{PORT\textsubscript{ex}}, \text{IP\textsubscript{nat}}, \text{PORT\textsubscript{nat}} )) \Leftrightarrow \\
% & \exists \; \text{IP\textsubscript{in}} \in \mathbb{IP}, \; \text{PORT\textsubscript{in}} \in \mathbb{PORT}:\\
% & ((\text{IP\textsubscript{in}}, \text{PORT\textsubscript{in}}, \text{IP\textsubscript{ex}}), \text{PORT\textsubscript{nat}} ) \in \mathbb{AT}
% \end{align*}
% \paragraph{Port Restricted Cone NAT} ~\\
\par\smallskip
\noindent\textbf{Port Restricted Cone NAT.}
Allocation table $\mathbb{AT} \subseteq \{\mathbb{IP}_{in} \times \mathbb{PORT} \times \mathbb{IP}_{ex} \times \mathbb{PORT}\} \times \mathbb{PORT}$
holds associations between the internal host socket, external host socket and mapped external NAT port.
% Rule is created once the connection from the internal host is initiated to an arbitrary external address.
The filtering rule is:
\begin{align*}
& PASS((\text{IP\textsubscript{ex}}, \text{PORT\textsubscript{ex}}, \text{IP\textsubscript{nat}}, \text{PORT\textsubscript{nat}} )) \Leftrightarrow \\
& \exists \; \text{IP\textsubscript{in}} \in \mathbb{IP}, \; \text{PORT\textsubscript{in}} \in \mathbb{PORT}:\\
& ((\text{IP\textsubscript{in}}, \text{PORT\textsubscript{in}}, \text{IP\textsubscript{ex}}, \text{PORT\textsubscript{ex}}), \text{PORT\textsubscript{nat}} ) \in \mathbb{AT}
\end{align*}
% \paragraph{Symmetric NAT} ~\\
\par\smallskip
\noindent\textbf{Symmetric NAT.}
Allocation table $\mathbb{AT}$ and filtering rule are the same as in the Port Restricted Cone NAT. The difference lies in the way
the new entries in the allocation table are created. For the Cone NAT holds that multiple entries in the allocation table
can have the same external port assigned, while the Symmetric NAT creates new allocation entries under certain circumstances.
\emph{Allocation function (AF)} defines port allocation rules for new connections (not in $\mathbb{AT}$). Assume $\mathbb{AT}=\emptyset$, for simplicity.
AF can be either \emph{predictable} or \emph{random}. In our setting, a predictable AF is assumed, e.g., \emph{incremental}.
If the previously allocated port was $\text{PORT\textsubscript{nat}}$ then
the next port will be $\text{PORT\textsubscript{nat}}+\Delta$, where typically $\Delta=1$.
If the chosen port is already taken, AF iterates until a free one is found.
AF can have different \emph{sensitivity} determining the cases where new allocation is created or the previous one is used instead.
\emph{Address sensitive allocation function} creates a~new allocation if a destination address differs from the existing allocation
from the same source:
\begin{align*}
& \exists s=(a_1,a_2,a_3,a_4) \in \mathbb{SP}, p \in \mathbb{PORT}: \\
& (s, p) \in \mathbb{AT} \Rightarrow ((\forall s^{\prime}=(a_1^{\prime},a_2^{\prime},a_3^{\prime},a_4^{\prime}) \in \mathbb{SP}, \\
& a_1 \neq a_1^{\prime}, a_2 \neq a_2^{\prime}, a_3 \neq a_3^{\prime}, \forall p^{\prime} \in \mathbb{PORT}: \\
& ((s^{\prime}, p^{\prime}) \in \mathbb{AT})) \Rightarrow p \neq p^{\prime})
\end{align*}
\emph{Port sensitive allocation function} creates a new allocation if a destination address and a port differs from the existing
allocations from the same source:
\begin{align*}
& \exists s \in \mathbb{SP}, p \in \mathbb{PORT}: (s, p) \in \mathbb{AT} \Rightarrow \\
& \Rightarrow ((\forall s^{\prime} \in \mathbb{SP}, s^{\prime} \neq s , \forall p^{\prime} \in \mathbb{PORT}: \\
& ((s^{\prime}, p^{\prime}) \in \mathbb{AT})) \Rightarrow p \neq p^{\prime})
\end{align*}
%
% PASS((\text{IP\textsubscript{out}}, \text{PORT\textsubscript{out}}, \text{IP\textsubscript{ex}}, \text{PORT\textsubscript{ex}} )) \Leftrightarrow \\
% & \exists \; \text{IP\textsubscript{in}} \in \mathbb{IP}, \; \text{PORT\textsubscript{in}} \in \mathbb{PORT}:\\
% & ((\text{IP\textsubscript{in}}, \text{PORT\textsubscript{in}}, \text{IP\textsubscript{out}}, \text{PORT\textsubscript{out}}), \text{PORT\textsubscript{ex}} ) \in \mathbb{AT}
% \end{align*}
% \section{Related work II}
% \begin{compactitem}
% \item chineese\citep{Wang:2006:RSN:1156422.1156550}
% \item uPNP
% \item PCP~\citep{rfc6887}
% \end{compactitem}
\section{UDP hole punching with STUN}
All algorithms in our paper make use of the same principle, \emph{UDP hole punching}.
A public internet host cannot send packets to the private host behind NAT/firewall directly.
Communication has to be initiated from the inside\footnote{Not taking
pre-set port forwarding, DMZ, uPnP, etc. into consideration. }, what is problematic if both
communicating sides are private.
Assume two parties, A and B with addresses IP\textsuperscript{A} and IP\textsuperscript{B}, respectively.
A and B are behind a public NAT with addresses $\text{IP}^{\text{A}}_{\text{nat}}$ and $\text{IP}^{\text{B}}_{\text{nat}}$,
respectively. By default an internal host does not know its external address or NAT type, thus a technique like
STUN\footnote{Core idea: publicly available service running on 2 IP addresses and 2 ports
answering questions like: what is my IP address and port? Change IP address of response if Il get it, etc.}
is used to learn it. In case of incremental NAT it is also necessary to determine $\Delta$ and the next port
that is likely to be allocated for a new connection $\text{PORT}^{\text{A}}_{\text{nat}}$ for A and $\text{PORT}^{\text{B}}_{\text{nat}}$
for B. The period between determining this information, sending it to the other party and starting the algorithm is
denoted as an \emph{initialization phase} or a \emph{silent period}\ignore{\footnote{Since the algorithm itself does nothing.}} of the NAT traversal algorithm.
UDP hole punching works as follows:\\
\begin{compactitem}
\item [1.] $A: \; IP^A_{in}:PORT^A_{in} \longrightarrow IP^B_{nat}:PORT^B_{nat}$ \\
Locally a~new mapping is created ($PORT^A_{nat}$), packet is dropped on B's~NAT.
\item [2.] $B: \; IP^A_{nat}:PORT^A_{nat} \longleftarrow IP^B_{in}:PORT^B_{in}$ \\
Locally a~new mapping is created ($PORT^B_{nat}$), packet reaches A using allocation created in step~1.
\item [3.] $A: \; IP^A_{in}:PORT^A_{in} \longrightarrow IP^B_{nat}:PORT^B_{nat}$ \\
Packet reaches B using allocation created in in step~2.
\end{compactitem}
% By default a packet
% coming from the internet (\textit{IP\textsubscript{A},Port\textsubscript{A}}) to the internal
% network is dropped on the NAT unless there is a valid mapping in NAT allocation table.
\section{Theoretical network model}
In order to evaluate algorithms in multiple environments with different parameters (workload, number of clients, etc.)
we decided to model network behavior with a stochastic process. In particular, we chose a \emph{Poisson process}, $\{N(t), t\geq0\}$
that is used in queueing theory to model arrivals of requests to the server queue~\citep{Nelson:1995:PSP:207382}.
The core idea of NAT traversal is to predict the next allocated external port by NAT, thus from this perspective
it is important to model how the NAT internal state changes over time, i.e., how many external ports
were allocated in a given time interval. Thus the workload is an important parameter for predicting the allocated port.
We model the internal network connected to NAT with respect to newly created connections (a new port allocation) as a
\emph{time homogeneous Poisson process} $Po(\lambda)$. Value of $\lambda$ models
different workloads of the network. This way we can evaluate algorithms with respect to various parameters, e.g.,
probability of establishing a connection.
% Random variable $X_t \sim Po(\lambda t)$ says how many
% port allocations were made in time interval $[0,t]$
\begin{mydef}
$N(t)$ is a random variable of a number of new port allocations made in time
interval~$[0,t]$ where $N(t) \sim Po(\lambda t)$.
\begin{center}
Then $P[N(t)=n] = \frac{(\lambda t)^n}{n!} e^{-\lambda t}$
\end{center}
\end{mydef}
Moreover, we assume an incremental NAT with $\Delta=1$ (without loss of generality) on both sides. If we sample the NAT
state\footnote{STUN requests to determine current port number, it creates a new NAT allocation record.}
each $T=10$~milliseconds the NAT state can be modeled as a random process $\{C_i | C_i \in \mathbb{N}, C_i \geq i, i\geq0\}$ where:
\[
C_i = \begin{dcases*}
0 & if $i=0$ \\
C_{i-1} + 1 + X_i, \; X_i \sim Po(\lambda T) & otherwise
\end{dcases*}
\]
Rewritten as $C_i = i + \sum_{j=1}^{i}X_j$. Expected value $E[C_i] = i + E[\sum_{j=1}^{i}X_i] = i (1+\lambda T)$.
% \paragraph{Port pool exhaustion.}
\par\smallskip
\noindent\textbf{Port pool exhaustion.} NAT timeout\footnote{Allocation is deleted if no packet is detected within this time interval.} is assumed to be 3~minutes,
what is a typical setting in practice. This constrains $\lambda$ to the interval $[0, 0.36]$, intuitively from $0$ to $360$ new connections in $1$~sec on average.
If it is higher then NAT is rendered as unusable since the port pool is exhausted\footnote{$P[X > 65535] = 0.0019, X \sim Po(180000 \cdot 0.36)$.} quickly.
\section{Generic algorithm structure}
Algorithms that we propose have a common structure. In order to maximize the probability of establishing
a connection (i.e., success rate) it proceeds in discrete \emph{steps}. In each step it
tries to open a new hole in the local NAT and to predict the corresponding port allocated in the destination
NAT.
\begin{mydef}
The symmetric NAT traversal algorithm is a mapping
$\mathcal{A}: \mathcal{U} \times \mathcal{S} \times \mathcal{M} \rightarrow \mathcal{P}_{src} \times \mathcal{P}_{dst}$,
where
\begin{compactitem}
\item $\mathcal{U}=\{0,1\}$ is the set of communicating parties.
\item $\mathcal{S} = \{0, 1, \dots, n-1\}, \; n \in \mathbb{N}$ is a step of the algorithm.
\item $\mathcal{M}$ is the set of vectors of parameters of the environment (i.e., network model
parameters and additional information obtained in initial phase).
In our setting $\mathcal{M} = \{ (\lambda, t_s) $ $| \lambda~>~0, t\geq~0 \}$. Value $t_s$ denotes the length of the silent period. %For simplicity $\mathcal{M} = \{ \lambda | \lambda > 0 \}$.
\item $\mathcal{P}_{src} \subseteq \mathbb{N}$ and $\mathcal{P}_{dst} \subseteq \mathbb{N}$
are sets of source and destination ports respectively.
\end{compactitem}
\end{mydef}
If $\mathcal{M}$ is not important for the algorithm it can be ommitted in the notation.
%If the third parameter is not important for the algorithm it can be ignored in the notation.
Prior algorithm run it is needed to
determine the NAT state (for UDP hole punching), this is assumed to be done by a technique like STUN. For simplicity, we assume that port numbers
start from $0$. The length of the silent period is also known to the algorithm prior execution in order to estimate the current state of
the NAT, e.g., difference between the last measurement and just before actual algorithm execution.
% \paragraph{Notation.}
\par\smallskip
\noindent\textbf{Notation.}
For instance, line $A(0,1) = (2,3)$ means that party $0$ in step $1$ sent packet from source port $2$ to destination port $3$
of the party~$1$.
% \paragraph{Time.}
\par\smallskip
\noindent\textbf{Time.}
%The algorithm proceeds in discrete time steps.
Algorithm starts in time $t_0 = 0$~ms and each step $i \in \mathcal{S}$
happens in time $t_i = iT$ for some $T$. If not mentioned otherwise, $T = 10$~ms.
% \paragraph{Algorithm log \& successful run.}
\par\smallskip
\noindent\textbf{Algorithm log \& successful run.}
IP addresses of the hosts and NAT are known and fixed prior to the algorithm run, and are thus omitted from the notation for simplicity. For the following text we assume the maximum number of steps of the algorithm $s_{max}=1000$.
\begin{mydef}
\label{def:log}
Algorithm log $\mathcal{L}^A_{\mathcal{A}}$ is a set of tuples $(p_1, p_2, p_3) \in \mathcal{L}^A_{\mathcal{A}}$ produced
by the NAT traversal algorithm $\mathcal{A}$ execution on the host $A$, where: \\
%$\mathcal{A}(0, s, (\lambda, t_s)) = (p_1, p_2)$
%are generated by $\mathcal{A}$ for some parameters $\lambda, t_s$ in some step $s$. The $p_3$ is
%an allocated port by $NAT^A$ for request $(p_1, p_2)$ in a given step. Algorithm log for host $B$ is defined analogically.
%Furthermore from our setting the
%following formula holds:\footnote{From definition of a port sensitive symmetric NAT.} %$\mathcal{L^{A}} \subseteq PORT_{in}^{A} \times PORT_{nat}^{B} \times PORT_{nat}^{A}$,
\begin{compactitem}
\item $(p_1, p_2) = \mathcal{A}(0, s, (\lambda, t_s)) $ are generated by the algorithm $\mathcal{A}$ for
parameters $(\lambda, t_s)$ in some step $s$ of the algorithm $\mathcal{A}$;
\item $p_3$ is an allocated port by $NAT^A$ for request $(p_1, p_2)$ in a given step $s$. In our network model
$p_3=C_s=C_{s-1} + 1 + X, X~\sim~Po(\lambda T)$;
\item algorithm log for host $B$ is defined analogically;
\item furthermore from our setting the following formula holds:\footnote{From definition of a port sensitive symmetric NAT.}
\end{compactitem} %
%
% \begin{compactitem}
% \item $p_1$ is a source port.
% \item $p_2$ is a destination port.
% \item $p_3$ is an allocated port by local NAT for this request.
% \item following formula holds\footnote{from definition of a port sensitive symmetric NAT}
% \end{compactitem}
%
\begin{align*}
& (p_1, p_2, p_3) \in \mathcal{L}^A_{\mathcal{A}}: \forall (p_1^{\prime}, p_2^{\prime}, p_3^{\prime}) \in \mathcal{L}^{A}_{\mathcal{A}} :\\
& p_3 = p_3^{\prime} \Leftrightarrow \left( p_1=p^{\prime}_1 \wedge p_2=p^{\prime}_2 \right)
\end{align*}
\end{mydef}
\begin{mydef}
\label{def:match}
Algorithm $\mathcal{A}$ run is successful, i.e., established a direct connection between $A,B$ if~\footnote{Derived from UDP hole punching.}:
\begin{align*}
& MATCH(\mathcal{L}^{A}_{\mathcal{A}}, \mathcal{L}^{B}_{\mathcal{A}}) \Leftrightarrow \exists(p_1, p_2, p_3) \in \mathcal{L}^{A}_{\mathcal{A}}: \\
& \exists(p_1^{\prime}, p_2^{\prime}, p_3^{\prime}) \in \mathcal{L}^{B}_{\mathcal{A}}: p_2 = p_3^{\prime} \wedge p_2^{\prime} = p_3.
%
% \in \mathcal{L}^A_{\mathcal{A}}: \forall (p_1^{\prime}, p_2^{\prime}, p_3^{\prime}) \in \mathcal{L}^{A}_{\mathcal{A}} :\\
%& p_3 = p_3^{\prime} \left( \Leftrightarrow p_1=p^{\prime}_1 \wedge p_2=p^{\prime}_2 \right)
\end{align*}
\end{mydef}
The task is to find an algorithm $\mathcal{A}$ such that the probability that $MATCH(\mathcal{L}^{A}_{\mathcal{A}}, \mathcal{L}^{B}_{\mathcal{A}})$
holds is maximized for minimal number of steps needed.
\subsection{Algorithm I -- baby-step, giant-step}
We propose the algorithm with high success rate for small network loads, i.e.,
$\lambda \leq 0.03, T=10$~ms. When the error becomes larger, the
algorithm step is not fast enough to compensate errors and thus fails for a larger $\lambda$.
\begin{mydef}
Baby-step, giant-step
\[
u \in \mathcal{U}, s \in \mathcal{S}:
\mathcal{A}(u, s) = \begin{dcases*}
(0, s) & \text{if} $u=0$\\
(0, 2s) & \text{if} $u=1$
\end{dcases*}
\]
\end{mydef}
Intuitively, the source port is kept constant, one side of the protocol scans ports sequentially, another
side of the protocol scans ports with step 2.
\subsection{Algorithm II -- fixed destination port}
Algorithm with fixed destination port described in by \citep{Wang:2006:RSN:1156422.1156550}
changes the source port in order to create new holes in NAT in each step.
\begin{mydef}
Fixed destination port algorithm
\[
u \in \mathcal{U}, s \in \mathcal{S}: \;
\mathcal{A}(u, s) = (s, \Delta_u)
\] for some constants $\Delta_0, \Delta_1$ based on the current network workload determined in initial phase of the protocol.
\end{mydef}
\subsection{Algorithm III -- expected port value}
Some more effective algorithms can be designed if the random process the network follows is known (and their
parameters).
\begin{mydef}
Expected port value algorithm
\begin{align*}
& u \in \mathcal{U}, s \in \mathcal{S}, (\lambda, t_s) \in \mathcal{M}:\\
& \mathcal{A}(u, s, (\lambda, t_s)) = (0, \lambda t_s + s(1 + \lambda T))
\end{align*}
\end{mydef}
In step $i$ the algorithm tries the expected value for $C_i$. Since the $C_i$ has unimodal distribution,
the expected value is also the most probable value and the algorithm uses maximum likelihood approach.
% \newpage
\subsection{Algorithm IV -- Poisson sampling}
Algorithm with the best success rate in our simulation is based on sampling Poisson distribution.
The main idea is to let the algorithm simulate the Poisson process that the network of the
other party follows. Duplicates are removed since it does not lead to a new port allocation.
\begin{mydef}
Poisson sampling algorithm
% \captionof{algorithm}{Euclid’s algorithm}\label{algo1}
\begin{algorithmic}[1]
\algtext*{EndWhile}% Remove "end while" text
\algtext*{EndIf}% Remove "end if" text
\algtext*{EndFunction}% Remove "end if" text
\newcommand{\LineIf}[2]{ \State \algorithmicif\ {#1}\ \algorithmicthen\ {#2}}% \algorithmicend\ \algorithmicif }
\Function{init4}{$\lambda, T$} %\Comment{The g.c.d. of a and b}
\State $B\gets [], s \gets 0$
%\State $s\gets 0$
\While{$len(B)< s_{max}$}
\State $x\gets \text{P}(\lambda T (1+s C(\lambda T))$
\State $s\gets s+1$
\LineIf{$x \notin B$}{$B.append(x)$} %\If {$x \notin B$} \State $B.append(x)$ \EndIf
\EndWhile
\State \textbf{return} $B$
%\Return $B$
\EndFunction\\
\State $M\gets init4(\lambda,T)$\\
\State $u \in \mathcal{U}, s \in \mathcal{S}, (\lambda, t_s) \in \mathcal{M}:$
% \State $\mathcal{A}(u, s, (\lambda, t_s)) = (0, \lambda t_s + M[s])$
\Function{A}{$u, s, (\lambda, t_s)$} %\Comment{The g.c.d. of a and b}
%\LineIf{$s = 0$}{$M = init4(\lambda,T)$} %\If {$x \notin B$} \State $B.append(x)$ \EndIf
\State \textbf{return} $(0, \lambda t_s + M[s])$
\EndFunction
\end{algorithmic}
%\begin{align*}
%& u \in \mathcal{U}, s \in \mathcal{S}, (\lambda, t_s) \in \mathcal{M}:\\
%& \mathcal{A}(u, s, (\lambda, t_s)) = (0, \lambda t_s + \text{P}(\lambda T (1+s C(\lambda T)))
%\end{align*}
where $\text{P}(\mu) = X \sim Po(\mu)$, samples Poisson distribution with
parameter $\mu$, coefficient function $C(\lambda T)$ gives a constant
coefficient\footnote{This function was found empirically by maximizing the success rate in simulations.}.
Approximations of the $C(\lambda T)$ by an inverse logarithm and a quartic model:
\begin{align*}
C(x) = & ({0.163321 \cdot ln(64.2568 \cdot x)})^{-1}\\
C(x) = & (0.172876+1.28162\cdot x-1.41256\cdot x^2 \\
& +0.825093\cdot x^3-0.184726\cdot x^4)^{-1}
\end{align*}
\end{mydef}
%
\subsection{Silent period in algorithms} If the length of the silent period is too big then
the algorithm can face problems with a gap that occurred in NAT ports. Thus
Algorithms III and IV estimate the starting offset caused by the silent period as an
expected value $E[Po(\lambda \cdot t_1)] = \lambda\cdot t_1$.
% \vspace*{-8mm}
%\parskip=0pt
% \titlespacing{\subsection}{0pt}{*0}{*0}
\section{Evaluation}
\subsection{Comparison of Algorithms I and II}
Algorithm~I has the following benefits over~II:
\begin{compactitem}
\item Works with high success rate for $\lambda \in [0, 0.035] \sim 35$ new connections in 1~second on average.
\item Can be stopped as soon as the connection is established, overhead is proportional to~$\lambda$.
\item Single source port is used, only one listening thread is needed to implement
it\footnote{NAT is abused to multiplex multiple connections to one source port.} compared to Alg. II.
Due to this fact a practical implementation is simple and lightweight with respect to required system resources.
\item Is $\lambda$-invariant. Provided network load is low and no additional measurements are needed.
\end{compactitem}
% \paragraph{Simulation.}
\subsection{Simulation}
\begin{figure}%[H]
{\scalebox{0.48}{\includegraphics[trim=40 20 0 0]{alg_1.pdf}}}
\caption{Algorithm evaluation in a simulation, $T=10~ms$. Number of a simulation rounds for one parameter setting is 1000.
A, B, C correspond to the Algorithm I, II and III respectively.
D, E correspond to the Algorithm IV with optimized $C(\lambda T)$ by an exhaustive search in D case,
in E case $C(\lambda T)$ is approximated by a polynomial.}
\label{fig:alg_eval}
\end{figure}
\begin{figure}%[H]
{\scalebox{0.48}{\includegraphics[trim=40 20 0 0]{alg_steps.pdf}}}
\caption{Steps of the algorithms needed to success on average with respect to $\lambda$. Algorithms denoted in a same way as in figure \ref{fig:alg_eval}.}
\label{fig:alg_steps}
\end{figure}
% % \begin{wrapfigure}{r}{0.5\textwidth}
% % \begin{figure}
% \begin{wrapfigure}{r}{0.5\textwidth}%
% % \wrapfigure{r}{.5\linewidth}{
% \centering{
% \begin{subfigure}{0.5\textwidth}
% {\scalebox{0.43}{\includegraphics[trim=40 0 0 0]{alg_1.pdf}}}
% \caption{Algorithm evaluation in a simulation, $T=10~ms$. Number of a simulation rounds for one parameter setting is 1000.
% A, B, C correspond to the Algorithm I, II and III respectively.
% D, E correspond to the Algorithm IV with optimized $C(\lambda T)$ by an exhaustive search in case D,
% in case E is then$C(\lambda T)$ is approximated by a polynomial.}
% \label{fig:alg_eval}
% \end{subfigure}
% % \end{figure}
% % \end{wrapfigure}
% % \begin{wrapfigure}{r}{0.5\textwidth}
% \begin{subfigure}{0.5\textwidth} %\begin{figure}[H]
% {\scalebox{0.43}{\includegraphics[trim=40 0 0 0]{alg_steps.pdf}}}
% \caption{Steps of the algorithms needed to success on average with respect to $\lambda$.
% Algorithms denoted in a same way as in figure \ref{fig:alg_eval}.}
% \label{fig:alg_steps}
% \end{subfigure}}%\end{figure}
% \vspace{-30pt}
% % \end{figure}
% \end{wrapfigure}
Figure \ref{fig:alg_eval} shows algorithms success rate with respect to~$\lambda$.
Figure \ref{fig:alg_steps} depicts number of steps needed
to succeed on average. From figures it is visible that Alg. I is more suitable for networks with low
workload since has $100\%$ success rate for $\lambda \in [0, 0.035]$. Outside this interval,
Alg. I is not working and another solution has to be applied. The problem comes as new connections are
created faster than algorithm progresses.
Alg. II is set to have fixed $\Delta=900$, in order to maximize the success rate. The disadvantage of this
algorithm is that it waits on a particular destination port and if it is taken by another connection
the algorithm fails.
Alg. IV has a high success rate for
noisy networks, but is very sensitive for the $\lambda$ estimation -- this decreases practical usability
if $\lambda$ changes quickly over time (non-homogeneous Poisson process). Closed form formula for $C(\lambda T)$
optimizing success rate and analytic derivation of the algorithm is still an open question.
% \begin{itemize}
% \item ToDo: fitting model to the real data from MU network
% \item ToDo: evaluating algorithms on real data from MU network
% \end{itemize}
\subsection{Real data modeling}
Data collected in our university network by NetFlow probes were studied to test the correspondence to the Poisson process
with respect to newly created connections/allocations on NAT. NAT was simulated in our program using
NetFlow data to provide network traffic to the NAT. NetFlow data capture 1 hour of a network communication in two different
day hours (03 -- low peak, 13 -- high peak).
NAT state, e.i., newly opened connections in window $T=10$, $50$, $100$~ms,
was sampled multiple times in a row, Poisson and Negative Binomial distributions were fitted using Maximum Likelihood Estimation (MLE)
and goodness-of-fit test with Pearson $\chi^2$ test was performed with confidence level $\alpha=0.05$.
\begin{table*}[t]
\centering
% set, T, sample size, PoFit, NB Fit, sample mean, variance of sample mean, sample variance
\begin{tabular}{| l | r | r | r | r | r | r | r |} %
\hline
Set & T [ms] & Samples & Poisson rej. [\%] & NBinom rej. [\%] & $\overline{m}_{E[X]}$ & $V[\overline{m}_{E[X]}]$ & $\overline{m}_{V[X]}$ \\ \hline \hline
\multirow{7}{*}{03} & \multirow{3}{*}{10} & 100 & 2.977 & 1.667 & 0.5121 & 0.0228 & 0.5321 \\ \cline{3-8}
& & 500 & 5.637 & 0.920 & 0.4851 & 0.0086 & 0.5146 \\ \cline{3-8}
& & 1000 & 10.294 & 1.579 & 0.4831 & 0.0068 & 0.5142 \\ \cline{2-8}
& \multirow{3}{*}{50} & 100 & 6.543 & 2.276 & 2.3707 & 0.2373 & 2.9969 \\ \cline{3-8}
& & 500 & 25.180 & 7.194 & 2.3871 & 0.1053 & 3.1179 \\ \cline{3-8}
& & 1000 & 52.174 & 23.188 & 2.3920 & 0.0696 & 3.1576 \\ \cline{2-8}
& \multirow{2}{*}{100} & 100 & 8.914 & 3.911 & 4.7263 & 0.7031 & 7.1496 \\ \cline{3-8}
& & 500 & 57.746 & 9.859 & 4.7668 & 0.2876 & 7.4667 \\ \hline
\multirow{2}{*}{13} & \multirow{2}{*}{10} & 100 & 5.191 & 1.824 & 2.6438 & 0.2695 & 3.0020 \\ \cline{3-8}
& & 500 & 20.368 & 2.829 & 2.6579 & 0.1025 & 3.1495 \\ \hline
\end{tabular}
\caption{Results from NAT simulation on the real data captured from the university network.
The meaning of columns from the left is following: data set name, sampling time, number of samples collected,
\% of cases rejecting hypothesis about Poisson distribution,
\% of cases rejecting hypothesis about Negative Binomial distribution,
sample mean of sample mean of port number samples (pns),
sample variance of sample mean of pns,
sample mean of variance of pns.}
\label{tab:1}
\end{table*}
Regarding data from set 13, which represents the high peak activity during the day, we conclude that if a number of samples
is lower, e.g., 100, the process of creating a new connection on NAT can
be modeled as a time homogeneous Poisson process successfully (hypothesis rejected in $5.191\%$ of cases).
The time window in which the network was monitored in this case is $T \cdot 100 = 1000~ms$.
When the number of samples is higher, e.g., 500, the time window of the studied network traffic is longer and
in a real world scenario the probability that $\lambda$ is constant in time is lower with the increasing length of the time
window. Due to this the homogeneous model is not strong enough, data is
overdispersed and this causes failure to fit Poisson distribution (hypothesis rejected in $20.368\%$ of cases). As suggested
in literature, Negative Binomial model fits better for overdispersed\footnote{The data overdispersion may be caused by dependent events on the network, e.g., by time-triggered updates downloading.}
data (hypothesis rejected in $2.829\%$ of cases).
%The open question is how to make use of this model in the NAT traversal.
For 100 samples the match with the model is satisfactory. Provided the algorithm is able to run completely (including learning $\lambda$), the proposed algorithms should work like in a simulation.
Regarding data from set 03, which represents the low peak activity we can conclude that a lower network workload causes more stable mean of sampled ports (MLE for Poisson distribution), thus the Poisson distribution hypothesis is not rejected even in longer time intervals for an estimated $\lambda$.
\section{Discussion and implementation}
For sake of the paper, we assumed that $\lambda$ is the same on both sides. However, new challenges
are likely to arise if we allow for different parameters on both sides, and we have not studied this yet.
Another open question is to find a model that optimizes given task, i.e., establishing a direct
connection. We proposed a start step, to model the same process that occurs on the network, i.e.,
Poisson process, but this method requires setting a coefficient function $C(\lambda T)$ for which
we do not have exact analytic explanation. This method is also very sensitive to parameter fluctuations
and thus not very practical for real world networks.
In our setting, we assume $T=10~ms$ in order to minimize the speed of the evolution of a Poisson process.
Such a short sending interval may cause some packets to get dropped on the path to the destination. Yet the
important issue for us is whether they reach the local NAT in order to create a mapping in the allocation table. As proposed
in \citep{Wang:2006:RSN:1156422.1156550}, it is possible to run the algorithm with $T=10~ms$ to create a mapping
and then re-run the algorithm with higher $T$, e.g., $T=100~ms$ in order to deliver packets to the destination
using created mapping and holes created in the previous round.
Success rate of the algorithms strongly relies on accurate measurements of the network
properties the client resides in. These properties can be measured prior to the connection establishment
from the point the client connects to the network, as is usually done in ICE, and to gather
more information, then to build better network model and to carry on in with a better performance afterwards.
% %We ignored the cases where there are already some allocations on NAT from older connections
% %on the port interval the NAT traversal algorithm is using.
%
% Another problematic issue comes when this algorithm is followed by multiple host pairs on
% the same network in the same time. They would artificially increase the network workload
% causing it would not follow Poisson distribution rendering our algorithms not usable in
% this case.
We developed an application written in Python that simulates NAT and mentioned algorithms, computes
statistics for port distributions on NAT and simulates algorithms for different parameters. It
may be helpful for another researchers interested in this area so we published it in public domain
hosted on GitHub: \url{https://github.com/ph4r05/NATSimTools}.
To validate our results, we implemented Android application using our baby-step giant-step algorithm
to traverse symmetric carrier-grade NAT that our mobile phone operator was using. The implementation
works according to expectations.
\section{Conclusion}
The aim of our work was to study the problem of NAT traversal from a statistical point of view,
and to provide algorithms with a high success rate for networks with a low workload that would be easy to
implement and would require little system resources.
We propose several algorithms solving this problem, yet with some open questions remaining.
Our goal was to find the algorithms that maximize success rate using only one source port strategy, possibly using this
in further research, perhaps in combination with previous multiple source port strategies.
We modeled the network workload as a Poisson process and we verified our approach on real data by hypothesis testing.
Our testing application was published in public domain. We also validated our results by traversing symmetric
carrier-grade NAT of our mobile phone provider.
We believe that by removing unnecessary intermediate nodes from
the communication path we restrict the attack and failure surfaces and increase the communication reliability.
Communication paths get less complex, transparency increases and this all support a more natural way of the
interconnecting nodes on P2P basis.
\par\smallskip
\noindent\textbf{Acknowledgement.}
This work was supported by the Czech Science Foundation, project GAP202/11/0422.
Thanks to Computer Security Incident Response Team of Masaryk University (CSIRT-MU) for providing anonymized NetFlow datasets.
\vspace{1cm}
% \renewcommand{\bibfont}{\small}
% \bibliographystyle{IEEEtran}
\bibliographystyle{abbrv}
\bibliography{paper}
% \end{multicols}
\end{document}