
Commit 3a28768

Update website docs

1 parent 53c09f4 commit 3a28768

File tree: 1 file changed (+89, -140 lines)

docs/README.md

Lines changed: 89 additions & 140 deletions
````diff
@@ -2,6 +2,7 @@
 
 [![build](https://travis-ci.org/stefanprodan/flagger.svg?branch=master)](https://travis-ci.org/stefanprodan/flagger)
 [![report](https://goreportcard.com/badge/github.com/stefanprodan/flagger)](https://goreportcard.com/report/github.com/stefanprodan/flagger)
+[![codecov](https://codecov.io/gh/stefanprodan/flagger/branch/master/graph/badge.svg)](https://codecov.io/gh/stefanprodan/flagger)
 [![license](https://img.shields.io/github/license/stefanprodan/flagger.svg)](https://github.com/stefanprodan/flagger/blob/master/LICENSE)
 [![release](https://img.shields.io/github/release/stefanprodan/flagger/all.svg)](https://github.com/stefanprodan/flagger/releases)
 
````
````diff
@@ -19,7 +20,7 @@ Deploy Flagger in the `istio-system` namespace using Helm:
 
 ```bash
 # add the Helm repository
-helm repo add flagger https://stefanprodan.github.io/flagger
+helm repo add flagger https://flagger.app
 
 # install or upgrade
 helm upgrade -i flagger flagger/flagger \
````
````diff
@@ -32,10 +33,11 @@ Flagger is compatible with Kubernetes >1.10.0 and Istio >1.0.0.
 
 ### Usage
 
-Flagger requires two Kubernetes [deployments](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/):
-one for the version you want to upgrade called _primary_ and one for the _canary_.
-Each deployment must have a corresponding ClusterIP [service](https://kubernetes.io/docs/concepts/services-networking/service/)
-that exposes a port named http or https. These services are used as destinations in a Istio [virtual service](https://istio.io/docs/reference/config/istio.networking.v1alpha3/#VirtualService).
+Flagger takes a Kubernetes deployment and creates a series of objects
+(Kubernetes [deployments](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/),
+ClusterIP [services](https://kubernetes.io/docs/concepts/services-networking/service/) and
+Istio [virtual services](https://istio.io/docs/reference/config/istio.networking.v1alpha3/#VirtualService))
+to drive the canary analysis and promotion.
 
 ![flagger-overview](https://raw.githubusercontent.com/stefanprodan/flagger/master/docs/diagrams/flagger-overview.png)
 
````
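The rewritten paragraph above says Flagger derives every primary/canary object from a single target deployment. As a rough illustration of that naming convention, here is a toy sketch (a hypothetical helper mirroring the object list shown later in this commit, not Flagger's actual Go code):

```python
def generated_objects(name: str) -> dict:
    """Sketch of the object names Flagger derives from a target deployment.

    Mirrors the podinfo -> podinfo-primary / podinfo-canary convention
    shown in the docs; illustrative only, not Flagger's implementation.
    """
    return {
        "deployment": f"{name}-primary",
        "hpa": f"{name}-primary",
        "services": [name, f"{name}-canary", f"{name}-primary"],
        "virtualservice": name,
    }

print(generated_objects("podinfo")["services"])
```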
````diff
@@ -44,102 +46,69 @@ Gated canary promotion stages:
 * scan for canary deployments
 * check Istio virtual service routes are mapped to primary and canary ClusterIP services
 * check primary and canary deployments status
-    * halt rollout if a rolling update is underway
-    * halt rollout if pods are unhealthy
+    * halt advancement if a rolling update is underway
+    * halt advancement if pods are unhealthy
 * increase canary traffic weight percentage from 0% to 5% (step weight)
 * check canary HTTP request success rate and latency
-    * halt rollout if any metric is under the specified threshold
+    * halt advancement if any metric is under the specified threshold
     * increment the failed checks counter
 * check if the number of failed checks reached the threshold
     * route all traffic to primary
     * scale to zero the canary deployment and mark it as failed
     * wait for the canary deployment to be updated (revision bump) and start over
 * increase canary traffic weight by 5% (step weight) till it reaches 50% (max weight)
-    * halt rollout while canary request success rate is under the threshold
-    * halt rollout while canary request duration P99 is over the threshold
-    * halt rollout if the primary or canary deployment becomes unhealthy
-    * halt rollout while canary deployment is being scaled up/down by HPA
+    * halt advancement while canary request success rate is under the threshold
+    * halt advancement while canary request duration P99 is over the threshold
+    * halt advancement if the primary or canary deployment becomes unhealthy
+    * halt advancement while canary deployment is being scaled up/down by HPA
 * promote canary to primary
     * copy canary deployment spec template over primary
 * wait for primary rolling update to finish
-    * halt rollout if pods are unhealthy
+    * halt advancement if pods are unhealthy
 * route all traffic to primary
 * scale to zero the canary deployment
 * mark rollout as finished
 * wait for the canary deployment to be updated (revision bump) and start over
 
 You can change the canary analysis _max weight_ and the _step weight_ percentage in the Flagger's custom resource.
 
-Assuming the primary deployment is named _podinfo_ and the canary one _podinfo-canary_, Flagger will require
-a virtual service configured with weight-based routing:
+For a deployment named _podinfo_, a canary promotion can be defined using Flagger's custom resource:
 
 ```yaml
-apiVersion: networking.istio.io/v1alpha3
-kind: VirtualService
-metadata:
-  name: podinfo
-spec:
-  hosts:
-  - podinfo
-  http:
-  - route:
-    - destination:
-        host: podinfo
-        port:
-          number: 9898
-      weight: 100
-    - destination:
-        host: podinfo-canary
-        port:
-          number: 9898
-      weight: 0
-```
-
-Primary and canary services should expose a port named http:
-
-```yaml
-apiVersion: v1
-kind: Service
-metadata:
-  name: podinfo-canary
-spec:
-  type: ClusterIP
-  selector:
-    app: podinfo-canary
-  ports:
-  - name: http
-    port: 9898
-    targetPort: 9898
-```
-
-Based on the two deployments, services and virtual service, a canary promotion can be defined using Flagger's custom resource:
-
-```yaml
-apiVersion: flagger.app/v1beta1
+apiVersion: flagger.app/v1alpha1
 kind: Canary
 metadata:
   name: podinfo
   namespace: test
 spec:
-  targetKind: Deployment
-  virtualService:
+  # deployment reference
+  targetRef:
+    apiVersion: apps/v1
+    kind: Deployment
     name: podinfo
-  primary:
+  # hpa reference (optional)
+  autoscalerRef:
+    apiVersion: autoscaling/v2beta1
+    kind: HorizontalPodAutoscaler
     name: podinfo
-    host: podinfo
-  canary:
-    name: podinfo-canary
-    host: podinfo-canary
+  service:
+    # container port
+    port: 9898
+    # Istio gateways (optional)
+    gateways:
+    - public-gateway.istio-system.svc.cluster.local
+    # Istio virtual service host names (optional)
+    hosts:
+    - app.istio.weavedx.com
   canaryAnalysis:
-    # max number of failed checks
-    # before rolling back the canary
-    threshold: 10
+    # max number of failed metric checks before rollback
+    threshold: 5
     # max traffic percentage routed to canary
     # percentage (0-100)
     maxWeight: 50
     # canary increment step
     # percentage (0-100)
-    stepWeight: 5
+    stepWeight: 10
     metrics:
     - name: istio_requests_total
       # minimum req success rate (non 5xx responses)
````
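The gated promotion stages listed in this hunk amount to a small state machine: advance the traffic weight step by step, halt on a failed metric check, and roll back once the failed-checks threshold is reached. A minimal sketch of that loop (`check_metrics` is a hypothetical callback standing in for the Prometheus queries; this is not Flagger's Go implementation):

```python
def run_canary(check_metrics, step_weight=10, max_weight=50, threshold=5):
    """Toy version of Flagger's gated promotion loop.

    check_metrics(weight) returns True while success rate and latency
    are within their thresholds. Illustrative only.
    """
    weight, failed = 0, 0
    while weight < max_weight:
        if not check_metrics(weight):
            failed += 1                 # halt advancement, count the failed check
            if failed >= threshold:
                return "rolled back"    # route all traffic back to primary
            continue
        weight += step_weight           # advance canary traffic weight
    return "promoted"                   # copy canary spec over primary
```

With the defaults above (stepWeight 10, maxWeight 50), a healthy canary is checked at weights 0 through 40 before promotion.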
````diff
@@ -150,14 +119,14 @@ spec:
       # maximum req duration P99
       # milliseconds
       threshold: 500
-      interval: 1m
+      interval: 30s
 ```
 
 The canary analysis is using the following promql queries:
 
 _HTTP requests success rate percentage_
 
-```promql
+```sql
 sum(
     rate(
         istio_requests_total{
````
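The success-rate query divides the rate of non-5xx requests by the rate of all requests. The same arithmetic on plain counters, as a self-contained stand-in for the promql vectors (illustrative only):

```python
def success_rate(requests_by_code: dict) -> float:
    """Success rate percentage as in the promql query:
    non-5xx requests / total requests * 100.

    requests_by_code maps an HTTP status code to a request count.
    """
    total = sum(requests_by_code.values())
    ok = sum(n for code, n in requests_by_code.items()
             if not str(code).startswith("5"))
    return ok / total * 100
```

For example, 950 successful responses out of 1000 gives 95%, which would halt advancement against the 99% threshold in the Canary resource above.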
````diff
@@ -182,7 +151,7 @@ sum(
 
 _HTTP requests milliseconds duration P99_
 
-```promql
+```sql
 histogram_quantile(0.99,
     sum(
         irate(
````
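promql's `histogram_quantile` estimates a quantile by linear interpolation inside cumulative `le` buckets. A minimal re-implementation of that idea (illustrative, not the Prometheus source):

```python
def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative histogram buckets.

    buckets is a list of (upper_bound, cumulative_count) pairs sorted
    by bound, like Prometheus le buckets. Linear interpolation inside
    the bucket containing the target rank; illustrative only.
    """
    total = buckets[-1][1]
    rank = q * total
    lower, prev_count = 0.0, 0
    for bound, count in buckets:
        if count >= rank:
            # interpolate between the bucket's lower and upper bound
            return lower + (bound - lower) * (rank - prev_count) / (count - prev_count)
        lower, prev_count = bound, count
    return buckets[-1][0]
```

With buckets `[(0.1, 50), (0.5, 90), (1.0, 100)]` (seconds), the P99 falls 90% of the way into the last bucket, i.e. 0.95s, which is over the 500ms threshold in the example resource.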
````diff
@@ -198,8 +167,6 @@ histogram_quantile(0.99,
 
 ### Automated canary analysis, promotions and rollbacks
 
-![flagger-canary](https://raw.githubusercontent.com/stefanprodan/flagger/master/docs/diagrams/flagger-canary-hpa.png)
-
 Create a test namespace with Istio sidecar injection enabled:
 
 ```bash
````
````diff
@@ -208,66 +175,72 @@ export REPO=https://raw.githubusercontent.com/stefanprodan/flagger/master
 kubectl apply -f ${REPO}/artifacts/namespaces/test.yaml
 ```
 
-Create the primary deployment, service and hpa:
+Create a deployment and a horizontal pod autoscaler:
 
 ```bash
-kubectl apply -f ${REPO}/artifacts/workloads/primary-deployment.yaml
-kubectl apply -f ${REPO}/artifacts/workloads/primary-service.yaml
-kubectl apply -f ${REPO}/artifacts/workloads/primary-hpa.yaml
+kubectl apply -f ${REPO}/artifacts/canaries/deployment.yaml
+kubectl apply -f ${REPO}/artifacts/canaries/hpa.yaml
 ```
 
-Create the canary deployment, service and hpa:
+Create a canary promotion custom resource (replace the Istio gateway and the internet domain with your own):
 
 ```bash
-kubectl apply -f ${REPO}/artifacts/workloads/canary-deployment.yaml
-kubectl apply -f ${REPO}/artifacts/workloads/canary-service.yaml
-kubectl apply -f ${REPO}/artifacts/workloads/canary-hpa.yaml
+kubectl apply -f ${REPO}/artifacts/canaries/canary.yaml
 ```
 
-Create a virtual service (replace the Istio gateway and the internet domain with your own):
+After a couple of seconds Flagger will create the canary objects:
 
 ```bash
-kubectl apply -f ${REPO}/artifacts/workloads/virtual-service.yaml
+# applied
+deployment.apps/podinfo
+horizontalpodautoscaler.autoscaling/podinfo
+canary.flagger.app/podinfo
+# generated
+deployment.apps/podinfo-primary
+horizontalpodautoscaler.autoscaling/podinfo-primary
+service/podinfo
+service/podinfo-canary
+service/podinfo-primary
+virtualservice.networking.istio.io/podinfo
 ```
 
-Create a canary promotion custom resource:
+![flagger-canary-steps](https://raw.githubusercontent.com/stefanprodan/flagger/master/docs/diagrams/flagger-canary-steps.png)
+
+Trigger a canary deployment by updating the container image:
 
 ```bash
-kubectl apply -f ${REPO}/artifacts/rollouts/podinfo.yaml
+kubectl -n test set image deployment/podinfo \
+podinfod=quay.io/stefanprodan/podinfo:1.2.1
 ```
 
-Canary promotion output:
+Flagger detects that the deployment revision changed and starts a new rollout:
 
 ```
 kubectl -n test describe canary/podinfo
 
 Status:
-  Canary Revision:  16271121
-  Failed Checks:    6
+  Canary Revision:  19871136
+  Failed Checks:    0
   State:            finished
 Events:
   Type     Reason  Age   From     Message
   ----     ------  ----  ----     -------
-  Normal   Synced  3m    flagger  Starting canary deployment for podinfo.test
+  Normal   Synced  3m    flagger  New revision detected podinfo.test
+  Normal   Synced  3m    flagger  Scaling up podinfo.test
+  Warning  Synced  3m    flagger  Waiting for podinfo.test rollout to finish: 0 of 1 updated replicas are available
   Normal   Synced  3m    flagger  Advance podinfo.test canary weight 5
   Normal   Synced  3m    flagger  Advance podinfo.test canary weight 10
   Normal   Synced  3m    flagger  Advance podinfo.test canary weight 15
-  Warning  Synced  3m    flagger  Halt podinfo.test advancement request duration 2.525s > 500ms
-  Warning  Synced  3m    flagger  Halt podinfo.test advancement request duration 1.567s > 500ms
-  Warning  Synced  3m    flagger  Halt podinfo.test advancement request duration 823ms > 500ms
   Normal   Synced  2m    flagger  Advance podinfo.test canary weight 20
   Normal   Synced  2m    flagger  Advance podinfo.test canary weight 25
   Normal   Synced  1m    flagger  Advance podinfo.test canary weight 30
-  Warning  Synced  1m    flagger  Halt podinfo.test advancement success rate 82.33% < 99%
-  Warning  Synced  1m    flagger  Halt podinfo.test advancement success rate 87.22% < 99%
-  Warning  Synced  1m    flagger  Halt podinfo.test advancement success rate 94.74% < 99%
   Normal   Synced  1m    flagger  Advance podinfo.test canary weight 35
   Normal   Synced  55s   flagger  Advance podinfo.test canary weight 40
   Normal   Synced  45s   flagger  Advance podinfo.test canary weight 45
   Normal   Synced  35s   flagger  Advance podinfo.test canary weight 50
-  Normal   Synced  25s   flagger  Copying podinfo-canary.test template spec to podinfo.test
-  Warning  Synced  15s   flagger  Waiting for podinfo.test rollout to finish: 1 of 2 updated replicas are available
-  Normal   Synced  5s    flagger  Promotion completed! Scaling down podinfo-canary.test
+  Normal   Synced  25s   flagger  Copying podinfo.test template spec to podinfo-primary.test
+  Warning  Synced  15s   flagger  Waiting for podinfo-primary.test rollout to finish: 1 of 2 updated replicas are available
+  Normal   Synced  5s    flagger  Promotion completed! Scaling down podinfo.test
 ```
 
 During the canary analysis you can generate HTTP 500 errors and high latency to test if Flagger pauses the rollout.
````
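The rollout above is triggered when Flagger sees that the deployment's pod template changed. One common way to detect such a change is to hash the template spec and compare digests; the sketch below is a hypothetical illustration of that idea, not Flagger's actual algorithm:

```python
import hashlib
import json

def template_revision(pod_template: dict) -> str:
    """Stable digest of a pod template spec. Any change to the image,
    env vars, etc. produces a new revision. Hypothetical sketch only."""
    payload = json.dumps(pod_template, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()[:8]

def revision_changed(old: dict, new: dict) -> bool:
    return template_revision(old) != template_revision(new)
```

Under this scheme, the `kubectl set image` call above would yield a new digest and start a fresh canary analysis.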
````diff
@@ -313,45 +286,8 @@ Events:
   Normal   Synced  2m    flagger  Halt podinfo.test advancement success rate 55.06% < 99%
   Normal   Synced  2m    flagger  Halt podinfo.test advancement success rate 47.00% < 99%
   Normal   Synced  2m    flagger  (combined from similar events): Halt podinfo.test advancement success rate 38.08% < 99%
-  Warning  Synced  1m    flagger  Rolling back podinfo-canary.test failed checks threshold reached 10
-  Warning  Synced  1m    flagger  Canary failed! Scaling down podinfo-canary.test
-```
-
-Trigger a new canary deployment by updating the canary image:
-
-```bash
-kubectl -n test set image deployment/podinfo-canary \
-podinfod=quay.io/stefanprodan/podinfo:1.2.1
-```
-
-Steer detects that the canary revision changed and starts a new rollout:
-
-```
-kubectl -n test describe canary/podinfo
-
-Status:
-  Canary Revision:  19871136
-  Failed Checks:    0
-  State:            finished
-Events:
-  Type     Reason  Age   From     Message
-  ----     ------  ----  ----     -------
-  Normal   Synced  3m    flagger  New revision detected podinfo-canary.test old 17211012 new 17246876
-  Normal   Synced  3m    flagger  Scaling up podinfo.test
-  Warning  Synced  3m    flagger  Waiting for podinfo.test rollout to finish: 0 of 1 updated replicas are available
-  Normal   Synced  3m    flagger  Advance podinfo.test canary weight 5
-  Normal   Synced  3m    flagger  Advance podinfo.test canary weight 10
-  Normal   Synced  3m    flagger  Advance podinfo.test canary weight 15
-  Normal   Synced  2m    flagger  Advance podinfo.test canary weight 20
-  Normal   Synced  2m    flagger  Advance podinfo.test canary weight 25
-  Normal   Synced  1m    flagger  Advance podinfo.test canary weight 30
-  Normal   Synced  1m    flagger  Advance podinfo.test canary weight 35
-  Normal   Synced  55s   flagger  Advance podinfo.test canary weight 40
-  Normal   Synced  45s   flagger  Advance podinfo.test canary weight 45
-  Normal   Synced  35s   flagger  Advance podinfo.test canary weight 50
-  Normal   Synced  25s   flagger  Copying podinfo-canary.test template spec to podinfo.test
-  Warning  Synced  15s   flagger  Waiting for podinfo.test rollout to finish: 1 of 2 updated replicas are available
-  Normal   Synced  5s    flagger  Promotion completed! Scaling down podinfo-canary.test
+  Warning  Synced  1m    flagger  Rolling back podinfo.test failed checks threshold reached 10
+  Warning  Synced  1m    flagger  Canary failed! Scaling down podinfo.test
 ```
 
 ### Monitoring
````
````diff
@@ -388,9 +324,22 @@ Advance podinfo.test canary weight 40
 Halt podinfo.test advancement request duration 1.515s > 500ms
 Advance podinfo.test canary weight 45
 Advance podinfo.test canary weight 50
-Copying podinfo-canary.test template spec to podinfo-primary.test
-Scaling down podinfo-canary.test
-Promotion completed! podinfo-canary.test revision 81289
+Copying podinfo.test template spec to podinfo-primary.test
+Halt podinfo-primary.test advancement waiting for rollout to finish: 1 old replicas are pending termination
+Scaling down podinfo.test
+Promotion completed! podinfo.test
+```
+
+Flagger exposes Prometheus metrics that can be used to determine the canary analysis status and the destination weight values:
+
+```bash
+# Canary status
+# 0 - running, 1 - successful, 2 - failed
+flagger_canary_status{name="podinfo" namespace="test"} 1
+
+# Canary traffic weight
+flagger_canary_weight{workload="podinfo-primary" namespace="test"} 95
+flagger_canary_weight{workload="podinfo" namespace="test"} 5
 ```
 
 ### Roadmap
````
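The weight gauges added in this hunk should always sum to 100 across primary and canary. A small sketch that pulls the `flagger_canary_weight` samples out of a metrics scrape and checks that invariant (illustrative only; note the sample lines above separate labels with spaces, while the standard Prometheus exposition format uses commas, so the regex tolerates both):

```python
import re

def parse_weights(metrics_text: str) -> dict:
    """Extract flagger_canary_weight gauges from a metrics scrape.

    Returns {workload: weight}. A real exposition-format parser is
    stricter; this sketch only matches the documented gauge lines.
    """
    weights = {}
    pattern = re.compile(
        r'flagger_canary_weight\{workload="([^"]+)"[^}]*\}\s+(\d+)')
    for match in pattern.finditer(metrics_text):
        weights[match.group(1)] = int(match.group(2))
    return weights
```

During an active analysis, `sum(parse_weights(scrape).values())` staying at 100 confirms traffic is only being shifted, never dropped.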
