Gate scheduling on freshly created user nodes: add a temporary taint at Node creation, run a per‑node initialization DaemonSet, then remove the taint so regular workloads can schedule.
Warm up images / caches / extensions (or run validation) before user workloads land, without permanently reserving resources or modifying every workload spec.
- Mutating webhook (deploy/deployment.yaml) invokes webhook.MutateNode on Node CREATE and patches in the taint startup.k8s.io/initializing=wait:NoSchedule (skips AKS system pool nodes labeled kubernetes.azure.com/mode=system).
- Only DaemonSets (and any system components you explicitly patch) that tolerate the taint start.
- The init DaemonSet Pod (deploy/startup-daemonset.yaml) labeled startup.k8s.io/component=init performs warm‑up.
- Controller startup.Controller watches Nodes & Pods. Readiness logic: startup.startupPodReady (annotation shortcut or all containers Ready + PodReady).
- When complete it removes the taint via startup.removeStartupTaint and writes the completion annotation.
- Workload Pods (no toleration) can now schedule.
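
The first step of the flow above can be sketched as follows. This is a minimal, standard-library-only illustration of building a JSON Patch that adds the startup taint; it is not the repository's webhook.MutateNode. The names buildTaintPatch, taint and patchOp are hypothetical, and the "already tainted" short-circuit is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Taint values come from the README; everything else here is illustrative.
const (
	taintKey    = "startup.k8s.io/initializing"
	taintValue  = "wait"
	taintEffect = "NoSchedule"
)

// taint mirrors the Node taint fields that matter for this sketch.
type taint struct {
	Key    string `json:"key"`
	Value  string `json:"value,omitempty"`
	Effect string `json:"effect"`
}

// patchOp is a single RFC 6902 JSON Patch operation.
type patchOp struct {
	Op    string      `json:"op"`
	Path  string      `json:"path"`
	Value interface{} `json:"value"`
}

// buildTaintPatch returns the JSON Patch that adds the startup taint, or nil
// when the node should be left alone (AKS system pool, or already tainted).
func buildTaintPatch(labels map[string]string, existing []taint) ([]byte, error) {
	if labels["kubernetes.azure.com/mode"] == "system" {
		return nil, nil
	}
	for _, t := range existing {
		if t.Key == taintKey {
			return nil, nil // assumption: do not add the taint twice
		}
	}
	nt := taint{Key: taintKey, Value: taintValue, Effect: taintEffect}
	var ops []patchOp
	if len(existing) == 0 {
		// /spec/taints may be absent on a new Node, so create the array.
		ops = []patchOp{{Op: "add", Path: "/spec/taints", Value: []taint{nt}}}
	} else {
		// "-" appends to an array that already exists.
		ops = []patchOp{{Op: "add", Path: "/spec/taints/-", Value: nt}}
	}
	return json.Marshal(ops)
}

func main() {
	patch, err := buildTaintPatch(map[string]string{}, nil)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(patch))
}
```

Distinguishing the "no taints yet" case matters because a JSON Patch `add` to `/spec/taints/-` fails when the array does not exist.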
Symbol | Purpose |
---|---|
webhook.MutateNode | JSONPatch Node CREATE to add taint |
startup.Controller | Informer-driven reconciler |
startup.HasStartupTaint | Helper to detect taint presence |
startup.startupPodReady | Determines if init Pod finished |
startup.removeStartupTaint | Removes taint + annotates Node |
Constants: startup.TaintKey, startup.TaintValue, startup.StartPodLabelKey, startup.StartPodLabelValue, startup.StartPodReadyAnnotation, startup.NodeStartupCompletedAnnotation | Contract for taint/labels/annotations |
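
The readiness rule described for startup.startupPodReady (annotation shortcut, or all containers Ready plus the PodReady condition) might look roughly like the sketch below. It uses simplified stand-in types rather than the real Kubernetes API structs, the names pod, containerStatus and podReady are hypothetical, and treating a Pod with no reported container statuses as not ready is an assumption.

```go
package main

import "fmt"

// Annotation key from the README's contract table.
const readyAnnotation = "startup.k8s.io/ready"

// containerStatus and pod are simplified stand-ins for the Pod status
// fields the check needs; they are not the real client-go types.
type containerStatus struct {
	Name  string
	Ready bool
}

type pod struct {
	Annotations map[string]string
	PodReady    bool // true when the PodReady condition is "True"
	Containers  []containerStatus
}

// podReady reports whether the init Pod counts as finished.
func podReady(p pod) bool {
	// Shortcut: the init Pod declared itself done via annotation.
	if p.Annotations[readyAnnotation] == "true" {
		return true
	}
	// Assumption: no container statuses yet means "not ready".
	if !p.PodReady || len(p.Containers) == 0 {
		return false
	}
	for _, c := range p.Containers {
		if !c.Ready {
			return false
		}
	}
	return true
}

func main() {
	warming := pod{PodReady: false, Containers: []containerStatus{{Name: "init", Ready: false}}}
	done := pod{PodReady: true, Containers: []containerStatus{{Name: "init", Ready: true}}}
	fmt.Println(podReady(warming), podReady(done))
}
```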
Item | Value / Format | Set By |
---|---|---|
Startup taint | startup.k8s.io/initializing=wait:NoSchedule | Webhook |
Init Pod label | startup.k8s.io/component=init | DaemonSet template |
(Optional) Early ready annotation | startup.k8s.io/ready=true | Init Pod logic |
Node completion timestamp | startup.k8s.io/completedAt=<unixEpoch> | Controller |
Var | Effect |
---|---|
STARTUP_WEBHOOK=1 | (Currently always started) serve webhook HTTPS |
STARTUP_BACKFILL=1 | startup.backfillTaint retro-taints idle untainted nodes (no user pods) |
(Strict/hold options like annotation‑only or min hold time are not yet in code unless you extend it.)
main.go
pkg/
  webhook/   (mutation handler)
  startup/   (controller, constants, helpers, tests)
deploy/      (Kubernetes manifests & cert helper)
Dockerfile
Makefile
go build ./...
go test ./... -cover
Or via Make:
make build
make test
make docker-build DOCKER_TAG=v1.6
make docker-push DOCKER_TAG=v1.6
- Generate TLS certs & patch CA bundle (updates Secret + MutatingWebhookConfiguration):
  cd deploy
  ./generate_webhook_certs.sh \
    --namespace kube-system \
    --service node-startup-webhook \
    --secret node-startup-webhook-tls \
    --webhook node-startup-taint
- Edit deploy/deployment.yaml:
  - Set image yourrepo/nodetaintshandler:<tag>
  - (Optional) Adjust failurePolicy (currently Fail for strict gating)
  - Add env STARTUP_BACKFILL=1 if you want missed nodes tainted (only when idle)
- Apply controller + webhook:
  kubectl apply -f deploy/deployment.yaml
- Apply init DaemonSet:
  kubectl apply -f deploy/startup-daemonset.yaml
  Customize its script to perform real warm‑up. If you modify the logic, add a readiness probe or set the annotation when done.
- (Optional) Patch only necessary system DaemonSets to tolerate the taint (avoid broad patch unless needed):
  kubectl -n kube-system patch ds kube-proxy --type=json \
    -p='[{"op":"add","path":"/spec/template/spec/tolerations/-","value":{"key":"startup.k8s.io/initializing","operator":"Equal","value":"wait","effect":"NoSchedule"}}]'
- Verify:
  kubectl get nodes -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints | grep initializing
  kubectl logs -n kube-system deploy/nodetaintshandler | grep "Adding startup taint"
- Observe taint removal:
  kubectl logs -n kube-system deploy/nodetaintshandler | grep "Removed startup taint"
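
The optional toleration patch in the steps above works because the scheduler only places a Pod on a tainted node when one of its tolerations matches the taint. The sketch below is a simplified rendering of that matching rule as commonly documented for Kubernetes (operator Equal compares key and value, Exists ignores the value, an empty effect matches any effect). The types are stand-ins, not the real API structs, and the function name tolerates is hypothetical.

```go
package main

import "fmt"

// Simplified stand-ins for the Kubernetes Toleration and Taint types.
type toleration struct{ Key, Operator, Value, Effect string }
type taint struct{ Key, Value, Effect string }

// tolerates reports whether tol matches t.
func tolerates(tol toleration, t taint) bool {
	// An empty effect on the toleration matches every taint effect.
	if tol.Effect != "" && tol.Effect != t.Effect {
		return false
	}
	// An empty key (only meaningful with Exists) matches every key.
	if tol.Key != "" && tol.Key != t.Key {
		return false
	}
	switch tol.Operator {
	case "", "Equal": // Equal is the default operator
		return tol.Value == t.Value
	case "Exists":
		return true
	default:
		return false
	}
}

func main() {
	startup := taint{Key: "startup.k8s.io/initializing", Value: "wait", Effect: "NoSchedule"}
	// Same toleration as in the kubectl patch command above.
	patched := toleration{Key: "startup.k8s.io/initializing", Operator: "Equal", Value: "wait", Effect: "NoSchedule"}
	fmt.Println(tolerates(patched, startup))
	fmt.Println(tolerates(toleration{Key: "other", Operator: "Exists"}, startup))
}
```

A workload Pod with no tolerations matches nothing, which is why it stays Pending on a node that still carries the startup taint.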
Deploy a workload (e.g. deploy/nginx-deploy.yaml) to trigger scaling:
kubectl apply -f deploy/nginx-deploy.yaml
Watch new nodes get tainted and held until the init Pod readiness condition.
- Webhook patch cases: pkg/webhook/node_webhook_test.go
- Controller readiness & removal paths: pkg/startup/controller_test.go
- Event handler helper: pkg/startup/handler_helpers_test.go
Run:
go test ./... -cover
Symptom | Likely Cause | Remedy |
---|---|---|
Workloads schedule before init Pod | Node missed mutation (webhook unavailable) or taint removed too quickly | Ensure the webhook Pod is Ready before scaling; keep failurePolicy: Fail; add readiness gating in the init Pod |
Taint never removed | Init Pod never reaches Ready condition / annotation | Add readinessProbe or set annotation; inspect Pod status |
Backfill skipped node | Node already has user Pods | Manually decide if retro-taint is safe |
Webhook 404 / probe failing | TLS secret not mounted yet | Secret projection delay – startup code already waits; check logs |
Check failed webhook calls:
kubectl get events -A --sort-by=.lastTimestamp | grep -i webhook
Ideas:
- Require explicit annotation only (remove implicit PodReady path).
- Minimum taint hold time.
- Two‑phase taints (preinit -> warming).
- Validating webhook to block Pod admission if taint present without toleration.
- Metrics & structured logging (Prometheus / JSON).
- RBAC hardening (split read vs patch).
kubectl delete -f deploy/startup-daemonset.yaml
kubectl delete -f deploy/deployment.yaml
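- No guarantee that the init DaemonSet Pod is the very first Pod on the node (it races with other DaemonSets that tolerate the taint).
- Without a readinessProbe or the annotation, the init Pod may report “Ready” immediately (for example, when its container only sleeps), so the taint can be removed before any warm‑up has run.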
- Backfill only covers idle nodes (avoids disrupting active ones).
MIT License