Basics with rkt, the container engine by CoreOS
27th June 2016
Sergiusz Urbaniak
rkt Engineer, CoreOS
sur@coreos.com
@_surbaniak
* whoami
Software Engineer at CoreOS
rkt developer
Yes ... I do use the Linux Desktop :-)
* Overview
.image rkt8s.png _ 600
- Learn about rkt
- Use rkt
- Learn about Kubernetes
- Use Kubernetes with rkt
Requirements:
- Vagrant
- Virtualbox
* Setup
git clone https://github.com/coreos/rktnetes-workshop
cd vagrant
vagrant up
vagrant ssh
* rkt - What is it?
rkt (pronounced _rock-it_) is a CLI for running app containers on Linux. rkt is designed to be secure, composable, and standards-based.
- It runs on Linux only (currently).
- It runs on Intel `amd64`.
- Port to `arm64` is underway.
* rkt - a brief history
*December*2014* - _v0.1.0_
- Prototype
- Drive conversation (security, standards) and competition (healthy OSS) in container ecosystem
*February*2016* - _v1.0.0_
- Used in Production
- API stability guarantees
*June*2016* - _v1.9.0_
- Packaged in Debian, Fedora, Arch, NixOS
* rkt - Philosophy, Idioms
*UX:*"secure-by-default"*
- Verify image signatures by default
- Verify image integrity by default
*Architecture:*Unix*philosophy*
- Well-defined operations
- No central privileged long-running components
- Separate privileges for different operations, e.g. "fetch"
- Verbose at times
* rkt - Security
- User namespaces (Container `euid` `!=` `host_euid`)
- SELinux contexts (Isolate individual pods)
- VM containment (LKVM)
- TPM measurements (Tamper-proof audit log of what's running)
* rkt - Security (soon)
- Finer-grained Linux capabilities
- Seccomp defaults
- Better SELinux support
- Cgroup2 + cgroup namespaces support
- Qemu (hypervisor) support
- Unprivileged containers
* rkt - Composability
*External*
- Integrate with existing init system(s)
- Work well with higher level cluster orchestration tools (e.g. Kubernetes)
- Simple process model: rkt command _is_ the container
- Any context applied to rkt (cgroups, etc.) applies transitively to the pods inside rkt
- Optional gRPC API server
*Internal*
- Stage-based architecture
- Swappable execution engines running containers
* rkt vs. Docker
.image rkt-vs-docker-process-model.png 500 _
* rkt vs. Docker
.image rkt-vs-docker-fetch.png 500 _
* rkt - Speaking of images
.image image-standards.png
* rkt - Image support
- ACI - yes
- Docker - yes
Caveat: Docker images do not support signing, start with `--insecure-options=image`
- OCI - WIP
*Workshop*time*
1. Start nginx
rkt run --insecure-options=image docker://nginx
2. Start an interactive container
rkt run --insecure-options=image --interactive docker://progrium/busybox
quit by hitting `Ctrl-]` three times
* rkt vs. ?
.link https://github.com/coreos/rkt/blob/master/Documentation/rkt-vs-other-projects.md
* rkt stages
Execution with rkt is divided into several distinct stages.
*stage0*
The `rkt` binary itself.
- Fetch images
- Generate pod UUIDs, manifest
- Create filesystem for the pod
- Set up further stages
- Unpack stage1 into pod file system
- Unpack app images into stage2 directories
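On disk, the steps above produce a nested layout under rkt's data directory (a sketch; the paths below are illustrative of rkt's default `/var/lib/rkt`):

```
/var/lib/rkt/pods/run/$uuid/
├── pod                        # pod manifest
└── stage1/
    └── rootfs/                # unpacked stage1 image
        └── opt/stage2/
            └── <app>/
                └── rootfs/    # unpacked app image (stage2)
```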
* rkt stages - stage1
*stage1*:
An image that:
- Initializes container isolation
- Starts and initializes container supervision
- Reads pod and image manifests
- Starts the actual applications
This is swappable, comes in different "flavors":
- _fly_: does nothing really ... just a chroot environment
- _coreos_: uses Linux cgroups, namespaces, leverages systemd
- _kvm_: Uses LKVM supervisor to boot a real virtualized kernel
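A flavor can be picked per invocation (sketch; assumes the stock stage1 images published by CoreOS and rkt's `--stage1-name` flag):

```
# coreos is the default flavor; run the kvm flavor explicitly
sudo rkt run --stage1-name=coreos.com/rkt/stage1-kvm:1.9.0 \
    --insecure-options=image docker://nginx
```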
* rkt stages - coreos stage1
Provides container isolation
stage1-coreos.aci
host OS
└─ rkt
   └─ systemd-nspawn
      └─ systemd
         └─ chroot
            └─ user-app1
- *systemd-nspawn*: Implements the initialization of cgroups, namespaces
- *systemd*: Implements supervision of containers
* rkt stages - fly stage1
stage1-fly.aci
host OS
└─ rkt
   └─ chroot
      └─ user-app1
- Just a chroot'ed application
but
- Secure image retrieval including signature validation
- Lightweight, isolated package manager
* rkt stages - ??? stage1
Crazy ideas:
- _xhyve_: OS X virtualization allowing rkt to run natively on a Mac
- _runit_: Don't like systemd? Bring your own isolator/supervisor
PRs/Contributions welcome!
* rkt stages - fly stage2
.image you.png _ 400
* rkt - build your own image
*Workshop*time*
.link https://github.com/s-urbaniak/inspector
.link https://coreos.com/rkt/docs/latest/signing-and-verification-guide.html
- Build a small Go application
- Build an ACI image
- Sign the image using `gpg`
- Make the image discoverable via github pages
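As a stand-in for the workshop app, a minimal Go program like the following is enough to package into an ACI (a sketch only; the real `inspector` app linked above serves HTTP and reports more):

```go
package main

import (
	"fmt"
	"os"
)

// banner builds the one-line message the app prints on start,
// echoing the (container) hostname it runs under.
func banner(host string) string {
	return "inspector stand-in running on " + host
}

func main() {
	host, err := os.Hostname()
	if err != nil {
		host = "unknown"
	}
	fmt.Println(banner(host))
}
```

Build it statically (`CGO_ENABLED=0 go build`) so the binary runs inside an otherwise empty ACI rootfs.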
* Let's step back
.image os-procs.png _ 200
In a classical "OS" setup we have:
- A supervisor, aka "init daemon", aka PID1
- Not only one process, but many processes
- Processes work together, via localhost networking or IPC
- Communicate with outside world
* rkt - Pods
.image pod-apps.png _ 300
- Grouping of applications executing in a shared context (network, namespaces, volumes)
- Shared fate
- The _only_ execution primitive: single applications are modelled as singleton pods
* rkt - Sample Pod: micro-service talking to Redis
*Workshop*time*
.image redis-service.png _ 230
sudo rkt run --insecure-options=image docker://redis \
s-urbaniak.github.io/rktnetes-workshop/redis-service:0.0.1
rkt list && rkt cat-manifest deadbeef
curl <pod-ip>:8080
.link https://github.com/s-urbaniak/redis-service
* Pods - Patterns, patterns everywhere
Container/App Design Patterns
- Kubernetes enables new design patterns
- Similar to OO patterns
- Key difference: technologically agnostic
.link http://blog.kubernetes.io/2016/06/container-design-patterns.html
.link https://www.usenix.org/system/files/conference/hotcloud16/hotcloud16_burns.pdf
* Pods - Sidecar pattern
.image pattern-sidecar.png _ 400
- Auxiliary app
- Extend, enhance main app
Pros:
- Separate packaging units
- Each app contained in a separate failure boundary
- Potentially different technologies/languages
* Pods - Ambassador pattern
.image pattern-ambassador.png _ 400
- Proxy communication
- Separation of concerns
- Main app has simplified view
* Pods - Adapter pattern
.image pattern-adapter.png _ 400
- Use an interface of an existing app as another interface
- Very useful for legacy apps, translating protocols
* Pods - Leader election pattern
.image pattern-leader.png _ 400
- Separate the leader logic from the election logic
- Swappable algorithms/technologies/environments
Ready-to-use generic leader elector:
.link http://blog.kubernetes.io/2016/01/simple-leader-election-with-Kubernetes.html
* Pods - Work queue pattern
.image pattern-work-queue.png _ 400
- Separate app logic from queue enqueueing/dequeueing
* Pods - Scatter gather pattern
.image pattern-scatter-gather.png _ 400
- Main app sends a simple request
- Auxiliary app implements complex scatter/gather logic
- Fan-Out/Fan-In requests separate from main app
* rkt - Networking
The CNI (Container Network Interface)
.image pod-net.png _ 300
- Abstraction layer for network configuration
- Single API to multiple, extensible networks
- Narrow, simple API
- Plugins for third-party implementations
* rkt - Networking - Host Mode
.image host-mode.png _ 300
rkt run --net=host ...
- Inherit the network namespace of the process that is invoking rkt.
- Pod apps are able to access everything associated with the host’s network interfaces.
*Workshop*time*
1. Start nginx using `--net=host`
* rkt - Networking - Default Mode (CNI ptp)
.image ptp.png _ 300
rkt run --net ...
rkt run --net=default ...
.link https://github.com/containernetworking/cni/blob/master/Documentation/ptp.md
- Creates a virtual ethernet pair
- One placed in the pod
- Other one placed on the host
* rkt - Networking - CNI bridge
.image bridge.png _ 300
.link https://github.com/containernetworking/cni/blob/master/Documentation/bridge.md
- Creates a virtual ethernet pair
- One placed in the pod
- Other one placed on the host
- Host veth plugged into a Linux bridge
* rkt - Networking - CNI macvlan
.image macvlan.png _ 300
.link https://github.com/containernetworking/cni/blob/master/Documentation/macvlan.md
- Functions like a switch
- Pods get different MAC addresses
- Pods share the same physical device
* rkt - Networking - CNI ipvlan
.image ipvlan.png _ 300
.link https://github.com/containernetworking/cni/blob/master/Documentation/ipvlan.md
- Functions like a switch
- Pods share the same MAC address
- Pods get different IPs
- Pods share the same physical device
* rkt - Networking
*Workshop*time*
Configure a ptp network, start two `busybox` pods pinging each other.
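rkt picks up CNI network definitions from `/etc/rkt/net.d/`. A minimal ptp network for the exercise might look like this (the network name `workshop` and the subnet are arbitrary choices):

```json
{
    "name": "workshop",
    "type": "ptp",
    "ipam": {
        "type": "host-local",
        "subnet": "10.2.0.0/24",
        "routes": [ { "dst": "0.0.0.0/0" } ]
    }
}
```

Save it as e.g. `/etc/rkt/net.d/10-workshop.conf` and start the pods with `--net=workshop`.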
* rkt - Networking - SDN (software defined networking)
.image pod-net-canal.png 300 _
- Communicate with pods across different _hosts_
- Each pod across all hosts gets its own IP
- Virtual overlay network
* Kubernetes - Overview
.image flower.png
.link https://github.com/kubernetes/kubernetes
- Open Source project initiated by Google
- Cluster-level container orchestration
Handles:
- Scheduling/Upgrades
- Failure recovery
- Scaling
* k8s - Components - API Server
- Validates, configures, and persists all Kubernetes API objects
- Provides REST-based operations via JSON
- Uses etcd as its database
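In the workshop VM you can poke the REST API directly (assuming the API server exposes the insecure local port 8080; adjust if your setup differs):

```
curl http://127.0.0.1:8080/api/v1/nodes
curl http://127.0.0.1:8080/api/v1/namespaces/default/pods
```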
* k8s - Components - Controller Manager
.image control-loop.png _ 300
- Embeds the core control loops
- Is _stateless_
- Is decoupled from etcd via the API server
Just a daemon talking to the API server
* k8s - Components - Scheduler
Schedules pods on nodes
- Policy-rich
- Topology-aware
- Workload-specific
- Considers resource requirements, QoS, HW/SW/policy constraints, ...
Again ... just a daemon talking to the API server
* k8s - Components - Kubelet
- Primary node agent
- Starts/Stops/Supervises pods on its node
Sigh ... just a daemon talking to the API server and to *rkt* !!!
Instructs rkt to:
- Fetch pod images
- Start and stop pods
- Exec into pods
- Tail pod logs
PS: We are working very hard to make *rkt* a first class runtime in Kubernetes.
* k8s - Components - Kube Proxy
Primary network proxy on each node.
- Configures `iptables` to reflect services
- Forwards TCP/UDP streams across a set of backends
* k8s - The big picture
.image k8s-overview.png 350 _
1. Watches created pods, assigns them to nodes
2. Runs controllers (node ctl, replication ctl, endpoint ctl, service account ctl, ...)
3. Watches for pods assigned to its node, runs *rkt*!
4. Manipulates network rules (iptables) for services on a node, does connection forwarding.
* k8s - Let's do it!
*Workshop*time*
1. Start Kubernetes
2. Create a service
3. Launch nginx
4. Launch busybox
5. Exec into busybox
6. Launch the Kubernetes dashboard
7. Open it from the host machine
See the next slides for templates.
* rktnetes - nginx Service
*Workshop*time*
kind: Service
apiVersion: v1
metadata:
  labels:
    app: nginx
  name: nginx
spec:
  type: NodePort
  ports:
  - port: 80
    targetPort: 80
  selector:
    app: nginx
kubectl create -f nginx-svc.yaml
* rktnetes - nginx Replication Controller
kind: ReplicationController
apiVersion: v1
metadata:
  labels:
    app: nginx
  name: nginx
spec:
  replicas: 1
  selector:
    app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
          protocol: TCP
kubectl create -f nginx-rc.yaml
* rktnetes - busybox pod
*Workshop*time*
apiVersion: v1
kind: Pod
metadata:
  name: busybox
spec:
  containers:
  - name: busybox
    image: progrium/busybox
    args:
    - sleep
    - "1000000"
kubectl create -f busybox.yaml
Exec into the pod:
kubectl exec -ti busybox /bin/sh
* rktnetes - Dashboard
Deploy the Kubernetes Dashboard
*Workshop*time*
kubectl create -f https://rawgit.com/kubernetes/dashboard/master/src/deploy/kubernetes-dashboard.yaml
ip address show dev eth1
kubectl get po --namespace=kube-system kubernetes-dashboard -o yaml
1. Get the assigned NodePort
2. Open http://<vm-ip>:NodePort on the host
* rktnetes - Ingress
Deploy an Ingress Controller based on traefik
*Workshop*time*
kubectl create -f /vagrant/traefik.yaml
Add the following entry to `/etc/hosts`, replacing `<vm-ip>` with the VM IP
<vm-ip> traefik.rktnetes
.link http://traefik.rktnetes http://traefik.rktnetes
* rktnetes - Ingress
.image traefik-ingress.png _ 1000
* rktnetes - Expose Dashboard via Ingress
*Workshop*time*
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: kubernetes-dashboard-ingress
  namespace: kube-system
spec:
  rules:
  - host: dashboard.rktnetes
    http:
      paths:
      - path: /
        backend:
          serviceName: kubernetes-dashboard
          servicePort: 80
Add the following entry to `/etc/hosts`, replacing `<vm-ip>` with the VM IP
<vm-ip> dashboard.rktnetes
.link http://dashboard.rktnetes http://dashboard.rktnetes
* rktnetes - Expose Dashboard via Ingress
.image dashboard-ingress.png _ 1000