feat: Apply MetalLB configuration to remote cluster #783

Merged: 2 commits into main from dlipovetsky/metallb-config, Jul 4, 2024

dlipovetsky

What problem does this PR solve?:

To apply MetalLB configuration, the MetalLB CRD webhooks must be ready. Readiness is not guaranteed at the moment the MetalLB Helm chart is reported as successfully deployed.

Because the MetalLB configuration is applied in a non-blocking lifecycle hook, the topology controller retries the hook as soon as it returns a failure, and the hook cannot ask for a delay before the next attempt. Because the topology controller backs off its requests exponentially after repeated failures, the hook requests are often delayed for 3+ minutes before one succeeds.

To mitigate this exponential backoff, the hook retries internally for up to 20 seconds, which is within the 30 seconds recommended by the Cluster API.

Related log excerpt
    I0628 19:39:31.051929       1 handler.go:132] "Deploying ServiceLoadBalancer provider MetalLB" cluster="default/dlipovetsky"
    I0628 19:39:31.051954       1 handler.go:82] "Applying MetalLB installation" cluster="default/dlipovetsky"
    I0628 19:39:31.100706       1 cm.go:82] "Fetching HelmChart info for \"metallb\" from configmap default/default-helm-addons-config" cluster="default/dlipovetsky"
    E0628 19:39:41.031497       1 handler.go:140] "failed to deploy ServiceLoadBalancer provider MetalLB" err="context canceled: last apply error: failed to apply MetalLB configuration IPAddressPool metallb-system/metallb: server-side apply failed: Patch \"https://172.18.0.2:6443/apis/metallb.io/v1beta1/namespaces/metallb-system/ipaddresspools/metallb?fieldManager=cluster-api-runtime-extensions-nutanix&fieldValidation=Strict&timeout=10s\": context canceled" cluster="default/dlipovetsky"
    ...
    E0628 19:43:49.496027       1 handler.go:140] "failed to deploy ServiceLoadBalancer provider MetalLB" err="context canceled: last apply error: failed to apply MetalLB configuration IPAddressPool metallb-system/metallb: server-side apply failed: Internal error occurred: failed calling webhook \"ipaddresspoolvalidationwebhook.metallb.io\": failed to call webhook: Post \"https://metallb-webhook-service.metallb-system.svc:443/validate-metallb-io-v1beta1-ipaddresspool?timeout=10s\": dial tcp 10.130.84.48:443: connect: connection refused" cluster="default/dlipovetsky"
    I0628 19:43:51.785907       1 handler.go:132] "Deploying ServiceLoadBalancer provider MetalLB" cluster="default/dlipovetsky"
    I0628 19:43:51.785923       1 handler.go:82] "Applying MetalLB installation" cluster="default/dlipovetsky"
    I0628 19:43:51.790521       1 cm.go:82] "Fetching HelmChart info for \"metallb\" from configmap default/default-helm-addons-config" cluster="default/dlipovetsky"

Which issue(s) this PR fixes:
Fixes #

How Has This Been Tested?:

Special notes for your reviewer:

@dkoshkin dkoshkin left a comment

Thanks for the stacked PRs, makes it very easy to review.

Wait up to 30s. In any case, we expect the context to be cancelled at 30s
@dlipovetsky dlipovetsky enabled auto-merge (squash) July 3, 2024 22:07
@dlipovetsky dlipovetsky merged commit b160a03 into main Jul 4, 2024
24 checks passed
@dlipovetsky dlipovetsky deleted the dlipovetsky/metallb-config branch July 4, 2024 09:36
@github-actions github-actions bot mentioned this pull request Jul 3, 2024
dkoshkin pushed a commit that referenced this pull request Jul 5, 2024
🤖 I have created a release *beep* *boop*
---


## 0.12.0 (2024-07-05)

<!-- Release notes generated using configuration in .github/release.yaml
at main -->

## What's Changed
### Exciting New Features 🎉
* feat: Add waiter for object by @dlipovetsky in #777
* feat: Define ServiceLoadBalancer Configuration API by @dlipovetsky in #778
* feat: Use HelmAddon as default addon strategy by @jimmidyson in #771
* feat: Apply MetalLB configuration to remote cluster by @dlipovetsky in #783
* feat: Update addon versions by @jimmidyson in #785
### Fixes 🔧
* fix: Copy ClusterClasses and Templates without their owner references by @dlipovetsky in #776
* fix: Namespacesync controller should reconcile an updated namespace by @dlipovetsky in #775
* fix: use minimal image when deploying nfd chart by @faiq in #774
### Other Changes
* build: Update release metadata.yaml by @jimmidyson in #768
* ci: Run Nutanix provider e2e tests on self-hosted runner by @jimmidyson in #755
* build: Fix devbox run errors due to piped commands by @jimmidyson in #773
* ci: Fix ct check by @jimmidyson in #779
* build: Use go 1.22.5 toolchain to fix CVE by @jimmidyson in #780
* test(e2e): Use mesosphere fork v1.7.3-d2iq.1 for CAPI providers by @jimmidyson in #781
* ci: Move govulncheck to nightly and push to main triggers by @jimmidyson in #782
* ci: Disable nix cache on self-hosted runners by @jimmidyson in #786


**Full Changelog**: v0.11.2...v0.12.0

---
This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>