feat: Add waiter for object #777

Merged
merged 2 commits into from
Jul 3, 2024

dlipovetsky
Contributor

What problem does this PR solve?:
Implements a wait for a check to pass against a typed object.

We'll use this in some lifecycle handlers, e.g., in a future change for deploying ServiceLoadBalancer configuration to the remote cluster.

(This is a copy of #762, which I had to close after #755 added required checks that cannot run on PRs from public forks.)

Which issue(s) this PR fixes:
Fixes #

How Has This Been Tested?:

Special notes for your reviewer:

Implements a wait for a check to pass against a typed object.
- Retry get only if object is not found, fail immediately otherwise.
@jimmidyson jimmidyson enabled auto-merge (squash) July 3, 2024 16:20
@jimmidyson jimmidyson merged commit b499985 into main Jul 3, 2024
24 of 27 checks passed
@jimmidyson jimmidyson deleted the dlipovetsky/object-waiter branch July 3, 2024 16:32
@github-actions github-actions bot mentioned this pull request Jul 3, 2024
dlipovetsky added a commit that referenced this pull request Jul 4, 2024
**What problem does this PR solve?**:
- Generates MetalLB configuration from ServiceLoadBalancer Configuration
API (added in #778)
- Waits for MetalLB helm chart to be successfully deployed (using waiter
added in #777)
- Applies MetalLB configuration to remote cluster, also using a wait.

To apply MetalLB configuration, the MetalLB CRD webhooks must be ready.
They are not guaranteed to be ready by the time the MetalLB helm chart
is successfully deployed.

Because the MetalLB configuration is applied in a non-blocking lifecycle
hook, the hook cannot ask the topology controller to retry after a
delay; the controller retries the hook as soon as it returns. Because
the topology controller also backs off its requests exponentially,
repeated failures often delay the hook requests by 3+ minutes before
one succeeds.

To mitigate this exponential backoff, the hook retries internally for up
to 20 seconds, within the 30 seconds recommended by the Cluster API.

<details>
<summary>Related log excerpt</summary>

```
    I0628 19:39:31.051929       1 handler.go:132] "Deploying ServiceLoadBalancer provider MetalLB" cluster="default/dlipovetsky"
    I0628 19:39:31.051954       1 handler.go:82] "Applying MetalLB installation" cluster="default/dlipovetsky"
    I0628 19:39:31.100706       1 cm.go:82] "Fetching HelmChart info for \"metallb\" from configmap default/default-helm-addons-config" cluster="default/dlipovetsky"
    E0628 19:39:41.031497       1 handler.go:140] "failed to deploy ServiceLoadBalancer provider MetalLB" err="context canceled: last apply error: failed to apply MetalLB configuration IPAddressPool metallb-system/metallb: server-side apply failed: Patch \"https://172.18.0.2:6443/apis/metallb.io/v1beta1/namespaces/metallb-system/ipaddresspools/metallb?fieldManager=cluster-api-runtime-extensions-nutanix&fieldValidation=Strict&timeout=10s\": context canceled" cluster="default/dlipovetsky"
    ...
    E0628 19:43:49.496027       1 handler.go:140] "failed to deploy ServiceLoadBalancer provider MetalLB" err="context canceled: last apply error: failed to apply MetalLB configuration IPAddressPool metallb-system/metallb: server-side apply failed: Internal error occurred: failed calling webhook \"ipaddresspoolvalidationwebhook.metallb.io\": failed to call webhook: Post \"https://metallb-webhook-service.metallb-system.svc:443/validate-metallb-io-v1beta1-ipaddresspool?timeout=10s\": dial tcp 10.130.84.48:443: connect: connection refused" cluster="default/dlipovetsky"
    I0628 19:43:51.785907       1 handler.go:132] "Deploying ServiceLoadBalancer provider MetalLB" cluster="default/dlipovetsky"
    I0628 19:43:51.785923       1 handler.go:82] "Applying MetalLB installation" cluster="default/dlipovetsky"
    I0628 19:43:51.790521       1 cm.go:82] "Fetching HelmChart info for \"metallb\" from configmap default/default-helm-addons-config" cluster="default/dlipovetsky"
```

</details>

**Which issue(s) this PR fixes**:
Fixes #

**How Has This Been Tested?**:

**Special notes for your reviewer**:
dkoshkin pushed a commit that referenced this pull request Jul 5, 2024
🤖 I have created a release *beep* *boop*
---


## 0.12.0 (2024-07-05)

<!-- Release notes generated using configuration in .github/release.yaml
at main -->

## What's Changed
### Exciting New Features 🎉
* feat: Add waiter for object by @dlipovetsky in
#777
* feat: Define ServiceLoadBalancer Configuration API by @dlipovetsky in
#778
* feat: Use HelmAddon as default addon strategy by @jimmidyson in
#771
* feat: Apply MetalLB configuration to remote cluster by @dlipovetsky in
#783
* feat: Update addon versions by @jimmidyson in
#785
### Fixes 🔧
* fix: Copy ClusterClasses and Templates without their owner references
by @dlipovetsky in
#776
* fix: Namespacesync controller should reconcile an updated namespace by
@dlipovetsky in
#775
* fix: use minimal image when deploying nfd chart by @faiq in
#774
### Other Changes
* build: Update release metadata.yaml by @jimmidyson in
#768
* ci: Run Nutanix provider e2e tests on self-hosted runner by
@jimmidyson in
#755
* build: Fix devbox run errors due to piped commands by @jimmidyson in
#773
* ci: Fix ct check by @jimmidyson in
#779
* build: Use go 1.22.5 toolchain to fix CVE by @jimmidyson in
#780
* test(e2e): Use mesosphere fork v1.7.3-d2iq.1 for CAPI providers by
@jimmidyson in
#781
* ci: Move govulncheck to nightly and push to main triggers by
@jimmidyson in
#782
* ci: Disable nix cache on self-hosted runners by @jimmidyson in
#786


**Full Changelog**:
v0.11.2...v0.12.0

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>