approval controller, metric collector controllers #6
base: main
Changes from all commits
7764719
fc87b21
f3ea6d5
3c8db43
e85a7a0
23ac827
018cacb
b8448b3
097e14a
164a25d
cac604c
69a8ed4
76de5f3
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,78 @@ | ||
| # Makefile for ApprovalRequest Controller | ||
|
|
||
| # Image settings | ||
| IMAGE_NAME ?= approval-request-controller | ||
| IMAGE_TAG ?= latest | ||
| REGISTRY ?= | ||
|
|
||
| # Build settings | ||
| GOOS ?= $(shell go env GOOS) | ||
| GOARCH ?= $(shell go env GOARCH) | ||
|
|
||
| # Tools | ||
| CONTROLLER_GEN_VERSION ?= v0.16.0 | ||
| CONTROLLER_GEN = go run sigs.k8s.io/controller-tools/cmd/controller-gen@$(CONTROLLER_GEN_VERSION) | ||
|
|
||
| .PHONY: help | ||
| help: ## Display this help | ||
| @awk 'BEGIN {FS = ":.*##"; printf "\nUsage:\n make \033[36m<target>\033[0m\n"} /^[a-zA-Z_-]+:.*?##/ { printf " \033[36m%-15s\033[0m %s\n", $$1, $$2 } /^##@/ { printf "\n\033[1m%s\033[0m\n", substr($$0, 5) } ' $(MAKEFILE_LIST) | ||
|
|
||
| ##@ Code Generation | ||
|
|
||
| .PHONY: manifests | ||
| manifests: ## Generate CRD manifests | ||
| $(CONTROLLER_GEN) crd paths="./apis/..." output:crd:artifacts:config=config/crd/bases | ||
|
|
||
| .PHONY: generate | ||
| generate: ## Generate DeepCopy code | ||
| $(CONTROLLER_GEN) object:headerFile="hack/boilerplate.go.txt" paths="./apis/..." | ||
|
|
||
| ##@ Build | ||
|
|
||
| .PHONY: docker-build | ||
| docker-build: ## Build docker image | ||
| docker buildx build \ | ||
| --file docker/approval-request-controller.Dockerfile \ | ||
| --output=type=docker \ | ||
| --platform=linux/$(GOARCH) \ | ||
| --build-arg GOARCH=$(GOARCH) \ | ||
| --tag $(IMAGE_NAME):$(IMAGE_TAG) \ | ||
| --build-context kubefleet=.. \ | ||
| .. | ||
|
|
||
| .PHONY: docker-push | ||
| docker-push: ## Push docker image | ||
| docker push $(REGISTRY)$(IMAGE_NAME):$(IMAGE_TAG) | ||
|
|
||
| ##@ Development | ||
|
|
||
| .PHONY: run | ||
| run: ## Run controller locally | ||
| cd .. && go run ./approval-request-controller/cmd/approvalrequestcontroller/main.go | ||
|
|
||
| ##@ Deployment | ||
|
|
||
| .PHONY: install | ||
| install: ## Install helm chart | ||
| helm install approval-request-controller ./charts/approval-request-controller \ | ||
| --namespace fleet-system \ | ||
| --create-namespace \ | ||
| --set image.repository=$(IMAGE_NAME) \ | ||
| --set image.tag=$(IMAGE_TAG) | ||
|
|
||
| .PHONY: upgrade | ||
| upgrade: ## Upgrade helm chart | ||
| helm upgrade approval-request-controller ./charts/approval-request-controller \ | ||
| --namespace fleet-system \ | ||
| --set image.repository=$(IMAGE_NAME) \ | ||
| --set image.tag=$(IMAGE_TAG) | ||
|
|
||
| .PHONY: uninstall | ||
| uninstall: ## Uninstall helm chart | ||
| helm uninstall approval-request-controller --namespace fleet-system | ||
|
|
||
| ##@ Kind | ||
|
|
||
| .PHONY: kind-load | ||
| kind-load: docker-build ## Build and load image into kind cluster | ||
| kind load docker-image $(IMAGE_NAME):$(IMAGE_TAG) --name hub |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,121 @@ | ||
| # ApprovalRequest Controller | ||
|
|
||
| The ApprovalRequest Controller is a standalone controller that runs on the **hub cluster** to automate approval decisions for staged updates based on workload health metrics. | ||
|
|
||
| ## Overview | ||
|
|
||
| This controller is designed to be a standalone component that can run independently from the main kubefleet repository. It: | ||
| - Uses kubefleet v0.1.2 as an external dependency | ||
| - Includes its own APIs for MetricCollectorReport and WorkloadTracker | ||
| - Watches `ApprovalRequest` and `ClusterApprovalRequest` resources (from kubefleet) | ||
| - Creates `MetricCollector` resources on member clusters via ClusterResourcePlacement | ||
| - Monitors workload health via `MetricCollectorReport` objects | ||
| - Automatically approves requests when all tracked workloads are healthy | ||
| - Runs every 15 seconds to check health status | ||
|
|
||
| ## Architecture | ||
|
|
||
| The controller is designed to run on the hub cluster and: | ||
| 1. Deploys MetricCollector instances to member clusters using CRP | ||
| 2. Collects health metrics from MetricCollectorReports | ||
| 3. Compares metrics against WorkloadTracker specifications | ||
| 4. Approves ApprovalRequests when all workloads are healthy | ||
|
|
||
| ## Installation | ||
|
|
||
| ### Prerequisites | ||
|
|
||
| The following CRDs must be installed on the hub cluster (installed by kubefleet hub-agent): | ||
| - `approvalrequests.placement.kubernetes-fleet.io` | ||
| - `clusterapprovalrequests.placement.kubernetes-fleet.io` | ||
| - `clusterresourceplacements.placement.kubernetes-fleet.io` | ||
| - `clusterresourceoverrides.placement.kubernetes-fleet.io` | ||
| - `clusterstagedupdateruns.placement.kubernetes-fleet.io` | ||
| - `stagedupdateruns.placement.kubernetes-fleet.io` | ||
|
|
||
| The following CRDs are installed by this chart: | ||
| - `metriccollectors.metric.kubernetes-fleet.io` | ||
| - `metriccollectorreports.metric.kubernetes-fleet.io` | ||
| - `workloadtrackers.metric.kubernetes-fleet.io` | ||
|
|
||
| ### Install via Helm | ||
|
|
||
| ```bash | ||
| # Build the image | ||
| make docker-build IMAGE_NAME=approval-request-controller IMAGE_TAG=latest | ||
|
|
||
| # Load into kind (if using kind) | ||
| kind load docker-image approval-request-controller:latest --name hub | ||
|
|
||
| # Install the chart | ||
| helm install approval-request-controller ./charts/approval-request-controller \ | ||
| --namespace fleet-system \ | ||
| --create-namespace | ||
| ``` | ||
|
|
||
| ## Configuration | ||
|
|
||
| The controller watches for: | ||
| - `ApprovalRequest` (namespaced) | ||
| - `ClusterApprovalRequest` (cluster-scoped) | ||
|
|
||
| Both resources from kubefleet are monitored, and the controller creates `MetricCollector` resources on appropriate member clusters based on the staged update configuration. | ||
|
|
||
| ### Health Check Interval | ||
|
|
||
| The controller checks workload health every **15 seconds**. This interval is configurable via the `reconcileInterval` parameter in the Helm chart. | ||
|
|
||
| ## API Reference | ||
|
|
||
| ### WorkloadTracker | ||
|
|
||
| `WorkloadTracker` is a cluster-scoped custom resource that defines which workloads the approval controller should monitor for health metrics before auto-approving staged rollouts. | ||
|
|
||
| #### Example: Single Workload | ||
|
|
||
| ```yaml | ||
| apiVersion: metric.kubernetes-fleet.io/v1beta1 | ||
| kind: WorkloadTracker | ||
| metadata: | ||
| name: sample-workload-tracker | ||
| workloads: | ||
| - name: sample-metric-app | ||
| namespace: test-ns | ||
| ``` | ||
|
|
||
| #### Example: Multiple Workloads | ||
|
|
||
| ```yaml | ||
| apiVersion: metric.kubernetes-fleet.io/v1beta1 | ||
| kind: WorkloadTracker | ||
| metadata: | ||
| name: multi-workload-tracker | ||
| workloads: | ||
| - name: frontend | ||
| namespace: production | ||
| - name: backend-api | ||
| namespace: production | ||
| - name: worker-service | ||
| namespace: production | ||
| ``` | ||
|
|
||
| #### Usage Notes | ||
|
|
||
| - **Cluster-scoped:** WorkloadTracker is a cluster-scoped resource, not namespaced | ||
| - **Optional:** If no WorkloadTracker exists, the controller will skip health checks and won't auto-approve | ||
| - **Single instance:** The controller expects one WorkloadTracker per cluster and uses the first one found | ||
| - **Health criteria:** All workloads listed must report healthy (metric value = 1.0) before approval | ||
| - **Prometheus metrics:** Each workload should expose `workload_health` metrics that the MetricCollector can query | ||
|
|
||
| For a complete example, see: [`./examples/workloadtracker/workloadtracker.yaml`](./examples/workloadtracker/workloadtracker.yaml) | ||
|
|
||
| ## Additional Resources | ||
|
|
||
| - **Main Tutorial:** See [`../README.md`](../README.md) for a complete end-to-end tutorial on setting up automated staged rollouts with approval automation | ||
| - **Metric Collector:** See [`../metric-collector/README.md`](../metric-collector/README.md) for details on the metric collection component that runs on member clusters | ||
| - **KubeFleet Documentation:** [Azure/fleet](https://github.com/Azure/fleet) - Multi-cluster orchestration platform | ||
| - **Example Configurations:** | ||
| - [`./examples/workloadtracker/`](./examples/workloadtracker/) - WorkloadTracker resource examples | ||
| - [`./examples/stagedupdaterun/`](./examples/stagedupdaterun/) - Staged update configuration examples | ||
| - [`./examples/prometheus/`](./examples/prometheus/) - Prometheus deployment and configuration for metric collection | ||
| ``` |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,20 @@ | ||
| /* | ||
| Copyright 2025 The KubeFleet Authors. | ||
|
|
||
| Licensed under the Apache License, Version 2.0 (the "License"); | ||
| you may not use this file except in compliance with the License. | ||
| You may obtain a copy of the License at | ||
|
|
||
| http://www.apache.org/licenses/LICENSE-2.0 | ||
|
|
||
| Unless required by applicable law or agreed to in writing, software | ||
| distributed under the License is distributed on an "AS IS" BASIS, | ||
| WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| See the License for the specific language governing permissions and | ||
| limitations under the License. | ||
| */ | ||
|
|
||
| // Package v1alpha1 contains API Schema definitions for the metric v1alpha1 API group | ||
| // +kubebuilder:object:generate=true | ||
| // +groupName=metric.kubernetes-fleet.io | ||
| package v1alpha1 |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,35 @@ | ||
| /* | ||
| Copyright 2025 The KubeFleet Authors. | ||
|
|
||
| Licensed under the Apache License, Version 2.0 (the "License"); | ||
| you may not use this file except in compliance with the License. | ||
| You may obtain a copy of the License at | ||
|
|
||
| http://www.apache.org/licenses/LICENSE-2.0 | ||
|
|
||
| Unless required by applicable law or agreed to in writing, software | ||
| distributed under the License is distributed on an "AS IS" BASIS, | ||
| WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| See the License for the specific language governing permissions and | ||
| limitations under the License. | ||
| */ | ||
|
|
||
| // +kubebuilder:object:generate=true | ||
| // +groupName=metric.kubernetes-fleet.io | ||
| package v1alpha1 | ||
|
|
||
| import ( | ||
| "k8s.io/apimachinery/pkg/runtime/schema" | ||
| "sigs.k8s.io/controller-runtime/pkg/scheme" | ||
| ) | ||
|
|
||
| var ( | ||
| // GroupVersion is group version used to register these objects | ||
| GroupVersion = schema.GroupVersion{Group: "metric.kubernetes-fleet.io", Version: "v1alpha1"} | ||
|
|
||
| // SchemeBuilder is used to add go types to the GroupVersionKind scheme | ||
| SchemeBuilder = &scheme.Builder{GroupVersion: GroupVersion} | ||
|
|
||
| // AddToScheme adds the types in this group-version to the given scheme. | ||
| AddToScheme = SchemeBuilder.AddToScheme | ||
| ) | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,104 @@ | ||
| /* | ||
| Copyright 2025 The KubeFleet Authors. | ||
|
|
||
| Licensed under the Apache License, Version 2.0 (the "License"); | ||
| you may not use this file except in compliance with the License. | ||
| You may obtain a copy of the License at | ||
|
|
||
| http://www.apache.org/licenses/LICENSE-2.0 | ||
|
|
||
| Unless required by applicable law or agreed to in writing, software | ||
| distributed under the License is distributed on an "AS IS" BASIS, | ||
| WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| See the License for the specific language governing permissions and | ||
| limitations under the License. | ||
| */ | ||
|
|
||
| package v1alpha1 | ||
|
|
||
| import ( | ||
| metav1 "k8s.io/apimachinery/pkg/apis/meta/v1" | ||
| ) | ||
|
|
||
| // +genclient | ||
| // +kubebuilder:object:root=true | ||
| // +kubebuilder:subresource:status | ||
| // +kubebuilder:resource:scope="Namespaced",shortName=mcr,categories={fleet,fleet-metrics} | ||
| // +kubebuilder:storageversion | ||
| // +kubebuilder:printcolumn:JSONPath=`.status.workloadsMonitored`,name="Workloads",type=integer | ||
| // +kubebuilder:printcolumn:JSONPath=`.status.lastCollectionTime`,name="Last-Collection",type=date | ||
| // +kubebuilder:printcolumn:JSONPath=`.metadata.creationTimestamp`,name="Age",type=date | ||
|
|
||
| // MetricCollectorReport is created by the approval-request-controller on the hub cluster | ||
| // in the fleet-member-{clusterName} namespace. The metric-collector on the member cluster | ||
| // watches these reports and updates their status with collected metrics. | ||
| // | ||
| // Controller workflow: | ||
| // 1. Approval-controller creates MetricCollectorReport with spec on hub | ||
| // 2. Metric-collector watches MetricCollectorReport on hub (in fleet-member-{clusterName} namespace) | ||
| // 3. Metric-collector queries Prometheus on member cluster | ||
| // 4. Metric-collector updates MetricCollectorReport status on hub with collected metrics | ||
| // | ||
| // Namespace: fleet-member-{clusterName} | ||
| // Name: Matches the UpdateRun name | ||
| type MetricCollectorReport struct { | ||
|
Why doesn't this CR have a spec and status? I feel that Conditions should be part of the Status and WorkloadsMonitored should be part of the spec.
Author
MetricCollectorReport is just an information source in the current implementation, hence there is no desired state (spec) and no corresponding status.
|
||
| metav1.TypeMeta `json:",inline"` | ||
| metav1.ObjectMeta `json:"metadata,omitempty"` | ||
|
|
||
| Spec MetricCollectorReportSpec `json:"spec,omitempty"` | ||
| Status MetricCollectorReportStatus `json:"status,omitempty"` | ||
| } | ||
|
|
||
| // MetricCollectorReportSpec defines the configuration for metric collection. | ||
| type MetricCollectorReportSpec struct { | ||
| // PrometheusURL is the URL of the Prometheus server on the member cluster | ||
| // Example: "http://prometheus.fleet-system.svc.cluster.local:9090" | ||
| PrometheusURL string `json:"prometheusUrl"` | ||
| } | ||
|
|
||
| // MetricCollectorReportStatus contains the collected metrics from the member cluster. | ||
| type MetricCollectorReportStatus struct { | ||
| // Conditions represent the latest available observations of the report's state. | ||
| // +optional | ||
| Conditions []metav1.Condition `json:"conditions,omitempty"` | ||
|
|
||
| // WorkloadsMonitored is the count of workloads being monitored. | ||
| // +optional | ||
| WorkloadsMonitored int32 `json:"workloadsMonitored,omitempty"` | ||
|
|
||
| // LastCollectionTime is when metrics were last collected on the member cluster. | ||
| // +optional | ||
| LastCollectionTime *metav1.Time `json:"lastCollectionTime,omitempty"` | ||
|
|
||
| // CollectedMetrics contains the most recent metrics from each workload. | ||
| // +optional | ||
| CollectedMetrics []WorkloadMetrics `json:"collectedMetrics,omitempty"` | ||
| } | ||
|
|
||
| // WorkloadMetrics represents metrics collected from a single workload pod. | ||
| type WorkloadMetrics struct { | ||
| // Namespace of the workload. | ||
| // +required | ||
| Namespace string `json:"namespace"` | ||
|
|
||
| // WorkloadName from the workload_health metric label. | ||
| // +required | ||
| WorkloadName string `json:"workloadName"` | ||
|
|
||
| // Health indicates if the workload is healthy (true=healthy, false=unhealthy). | ||
| // +required | ||
| Health bool `json:"health"` | ||
| } | ||
|
|
||
| // +kubebuilder:object:root=true | ||
|
|
||
| // MetricCollectorReportList contains a list of MetricCollectorReport. | ||
| type MetricCollectorReportList struct { | ||
| metav1.TypeMeta `json:",inline"` | ||
| metav1.ListMeta `json:"metadata,omitempty"` | ||
| Items []MetricCollectorReport `json:"items"` | ||
| } | ||
|
|
||
| func init() { | ||
| SchemeBuilder.Register(&MetricCollectorReport{}, &MetricCollectorReportList{}) | ||
| } | ||
Hi Arvind! Just a nit: I fear that the name (`metric.kubernetes-fleet.io`) might be a bit confusing.