Add rfc for k8s multi cluster deployment (#5069)
Signed-off-by: Yoshiki Fujikane <ffjlabo@gmail.com>
- Start Date: 2024-07-25
- Target Version: 0.49.0

# Summary

This RFC proposes a new feature for Kubernetes applications to deploy resources to multiple clusters.

# Motivation

# Use cases

- Case 1: Applying the same manifests to multiple clusters for a redundant configuration
- Case 2: Applying manifests with some patches to multiple clusters for a redundant configuration
- Case 3: Blue/green deployment across clusters

# Detailed design

## Overview

We propose a feature that applies manifests to multiple clusters from a single application.

![image](assets/0014-pipeline-image.png)

## How it works

### Register an application with multiple platform providers

When registering an application, we can choose multiple platform providers.
First, we add one platform provider.
If we want to deploy to multiple clusters, we can optionally add more platform providers.
Only the platform providers specified here can be configured as multi-targets.

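For reference, the platform providers themselves are declared in the piped configuration. A minimal sketch with two Kubernetes providers might look like the following; the provider names and kubeconfig paths are illustrative:

```
apiVersion: pipecd.dev/v1beta1
kind: Piped
spec:
  platformProviders:
    - name: cluster-hoge
      type: KUBERNETES
      config:
        kubeConfigPath: /etc/piped/kubeconfig-hoge # illustrative path
    - name: cluster-fuga
      type: KUBERNETES
      config:
        kubeConfigPath: /etc/piped/kubeconfig-fuga
```
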
![image](assets/0014-choose-multiple-providers.png)

We can also verify the configured platform providers on the piped list page.

![image](assets/0014-piped-list.png)

### QuickSync

Piped asynchronously applies the resources to each environment based on the platform provider and resourceDir specified by the user.

For example, consider deploying a microservice called `microservice-a` to the clusters `cluster-hoge` and `cluster-fuga`.
First, we prepare one application with one `app.pipecd.yaml` and manifests like the following.
Set `multiTarget` under `spec.quickSync` in `app.pipecd.yaml`, specifying the directory containing the manifests to deploy and the platform provider to deploy them to.
This deploys to `cluster-hoge` and `cluster-fuga` at the same time when QuickSync is executed.

```
microservice-a
└── prd
    ├── app.pipecd.yaml
    ├── base
    │   ├── deployment.yaml
    │   ├── kustomization.yaml
    │   └── service.yaml
    ├── cluster-hoge
    │   └── kustomization.yaml
    ├── cluster-fuga
    │   └── kustomization.yaml
    └── kustomization.yaml
```

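Each cluster directory can be a small kustomize overlay on top of `base`. For example, `cluster-hoge/kustomization.yaml` might look like this; the patch file name is illustrative:

```
resources:
  - ../base
patches:
  - path: replica-count.yaml # illustrative per-cluster patch
```
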
```app.pipecd.yaml
apiVersion: pipecd.dev/v1beta1
kind: KubernetesApp
spec:
  name: multi-cluster-app
  labels:
    env: prd
  quickSync:
    multiTarget:
      - provider:
          name: cluster-hoge # platform provider name
        resourceDir: ./cluster-hoge # the resource dir
      - provider:
          name: cluster-fuga
        resourceDir: ./cluster-fuga
```

**Rollback**

Similarly, when rolling back, the environments are rolled back at the same time based on the information specified in `multiTarget`.
If at least one of the rollback processes succeeds, we consider the rollback successful.
This ensures that the rollback is executed for the other environments even if one of the deployment environments is inaccessible.

```
apiVersion: pipecd.dev/v1beta1
kind: KubernetesApp
spec:
  name: multi-cluster-app
  labels:
    env: prd
  quickSync:
    multiTarget:
      - provider:
          name: cluster-hoge # platform provider name
        resourceDir: ./cluster-hoge # the resource dir
      - provider:
          name: cluster-fuga
        resourceDir: ./cluster-fuga
```

### PipelineSync

Piped asynchronously applies the manifests to each environment based on the platform provider and resourceDir specified by the user for each stage.

For example, consider deploying a microservice called `microservice-a` to the clusters `cluster-hoge` and `cluster-fuga`.
First, we prepare one application with one `app.pipecd.yaml` and manifests like the following.
Set `multiTarget` under `spec.quickSync` in `app.pipecd.yaml`, specifying the directory containing the manifests to deploy and the platform provider to deploy them to.
Also, set `multiTarget` in each stage config.
This allows the application to be applied to multiple environments at the same time when one stage is executed.

```
microservice-a
└── prd
    ├── app.pipecd.yaml
    ├── base
    │   ├── deployment.yaml
    │   ├── kustomization.yaml
    │   └── service.yaml
    ├── cluster-hoge
    │   └── kustomization.yaml
    ├── cluster-fuga
    │   └── kustomization.yaml
    └── kustomization.yaml
```

```
apiVersion: pipecd.dev/v1beta1
kind: KubernetesApp
spec:
  name: multi-cluster-app
  labels:
    env: example
    team: product
  quickSync:
    prune: true
    multiTarget:
      - provider:
          name: cluster-hoge
        resourceDir: ./cluster-hoge
      - provider:
          name: cluster-fuga
        resourceDir: ./cluster-fuga
  pipeline:
    stages:
      - name: K8S_CANARY_ROLLOUT
        with:
          replicas: 10%
          multiTarget:
            - provider:
                name: cluster-hoge
              resourceDir: ./cluster-hoge
            - provider:
                name: cluster-fuga
              resourceDir: ./cluster-fuga
      ...
```

**Rollback**

When rolling back, the environments are rolled back at the same time based on the information specified in `spec.quickSync.multiTarget`.
If at least one of the rollback processes succeeds, we consider the rollback successful.
This ensures that the rollback is executed for the other environments even if one of the deployment environments is inaccessible.

#### Stages to be supported

We introduce the feature into the stages that make changes to resources on the cluster.

- K8S_PRIMARY_ROLLOUT
- K8S_CANARY_ROLLOUT
- K8S_CANARY_CLEAN
- K8S_BASELINE_ROLLOUT
- K8S_BASELINE_CLEAN
- K8S_TRAFFIC_ROUTING

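Because each of these stages carries its own `multiTarget`, a pipeline could, for example, canary on a single cluster first and only roll out the primary to both clusters after approval. A hypothetical sketch of such a pipeline section:

```
pipeline:
  stages:
    - name: K8S_CANARY_ROLLOUT
      with:
        replicas: 10%
        multiTarget:
          - provider:
              name: cluster-hoge # canary on one cluster only
            resourceDir: ./cluster-hoge
    - name: WAIT_APPROVAL
    - name: K8S_PRIMARY_ROLLOUT
      with:
        multiTarget:
          - provider:
              name: cluster-hoge
            resourceDir: ./cluster-hoge
          - provider:
              name: cluster-fuga
            resourceDir: ./cluster-fuga
```
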
### How to check the stage progress for each platform provider in a deployment

Users can check the stage logs for each platform provider.
In the future, we will consider visualizing the state of each platform provider's deployment environment.

![image](assets/0014-stage-log.png)

### Livestate View & Drift Detection

Currently, a livestate store exists for each platform provider.
Both the Livestate View and drift detection use the values obtained from the livestate store based on the application ID.
Also, a 1:1 relationship between application and platform provider is assumed.

So we propose an improvement that aggregates the state from every platform provider using the application ID.
This achieves a 1:N relationship between application and platform providers.

**Livestate View**

Shows the livestate of all platform providers the application deploys to.

**Drift Detection**

Performs drift detection based on the livestate of all platform providers the application deploys to.

### [Option] Improve kubeconfig setup on piped

Currently, we need to prepare the kubeconfig file manually.
It would be nice to prepare it automatically.

This could be realized by using cloud vendor features, for example Workload Identity on GKE or IRSA on EKS.
That is, piped would obtain the kubeconfig at startup by using them.

# Alternatives

## Idea: Execute Stages in parallel within a pipeline

![image](assets/0014-pipeline-paralell-stage.png)

### UX

- When registering an application
  - Prepare manifests for each cluster and one app.pipecd.yaml, then register the application on the UI.
  - Dir structure

```
- /prd
  - app.pipecd.yaml
  - /base
  - /cluster-hoge
  - /cluster-fuga
```

- When deploying
  - Sync all clusters corresponding to prd.

- When rolling back
  - Roll back all clusters to the previous state.

### Pros & Cons

**Pros**

- Only one app setting is required.
- You can operate WaitApproval for all clusters in one place.
- Flexible stage pipeline.

**Cons**

- Realizing "parallel execution of stages" complicates the scheduler mechanism.

## Idea: Deploy to multiple Platform Providers internally

![image](assets/0014-pipeline-already-implemented.png)

This has already been implemented as a PoC:

- https://github.com/pipe-cd/pipecd/pull/3790
- https://github.com/pipe-cd/pipecd/pull/3854

### UX

- When registering an application
  - Prepare manifests for each cluster and one app.pipecd.yaml, then register the application on the UI.
  - Dir structure

```
- /prd
  - app.pipecd.yaml
  - /base
  - /cluster-hoge
  - /cluster-fuga
```

- When deploying
  - Sync all clusters corresponding to prd.

- When rolling back
  - Roll back all clusters to the previous state.

### Pros & Cons

**Pros**

- Only one app setting is required.
- You can operate WaitApproval for all clusters in one place.

**Cons**

- Cannot support cases where you want to change the number of replicas for only some clusters.

## Idea: Create a stage to sync apps

![image](assets/0014-pipeline-sync-app-stage-01.png)

![image](assets/0014-pipeline-sync-app-stage-02.png)

### UX

- When registering an application
  - Prepare one app.pipecd.yaml as a root application with a sync-app stage.
  - Prepare manifests and an app.pipecd.yaml for each cluster, then register them on the UI.
  - Dir structure

```
- /prd
  - app.pipecd.yaml
  - /base
  - /cluster-hoge
    - app.pipecd.yaml
  - /cluster-fuga
    - app.pipecd.yaml
```

- When deploying
  - Sync all clusters corresponding to prd when triggering the root app.
  - If you want to sync only some clusters, sync them as individual applications.

- When rolling back
  - Roll back all clusters to the previous state.
  - You can select the following behavior in the stage settings:
    - Roll back if any app fails
    - Roll back if all apps fail
  - If the deployments of the applications triggered by the sync-app stage have succeeded, start a rollback to the previous commit.
  - If the deployments of the applications triggered by the sync-app stage are in progress, cancel them.

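A root application config for this idea might look as follows. Note that the stage name, its options, and the child application names are all hypothetical, since the sync-app stage does not exist yet:

```
apiVersion: pipecd.dev/v1beta1
kind: KubernetesApp
spec:
  name: microservice-a-root
  pipeline:
    stages:
      - name: SYNC_APPLICATION # hypothetical stage name
        with:
          applications:
            - microservice-a-cluster-hoge # hypothetical child apps
            - microservice-a-cluster-fuga
          rollback: ANY_FAILURE # or ALL_FAILURE, per the behaviors above
```
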
### Pros & Cons

**Pros**

- It is possible to sync all clusters or only some of them.
- Deployment pipelines can be configured for each environment.

**Cons**

- It takes time to set up the app configs.
- A mechanism to trigger application rollback is needed.
- You need to approve WaitApproval for each app.
- Deployment Chain already exists as a similar feature.