
Conversation

@kattz-kawa
Contributor

@kattz-kawa kattz-kawa commented Nov 18, 2025

Why we need this PR

  • After the new placement rule introduced in Run manager with 2 replicas #180, the manager pods can only be scheduled on two specific nodes. Under these conditions, the default rollout strategy can deadlock during node recovery.
  • Certificate rotation managed by OLM can trigger Deployment rollouts by changing the olmcahash annotation. With the current strategy, such a rollout can deadlock if one of the nodes is down, so we need a more controlled update strategy to keep upgrades reliable and safe.

Changes made

  • Added a RollingUpdate strategy with maxSurge: 0 and maxUnavailable: 1 for safer deployment updates (see the sketch after this list).

  • Before (default strategy): when a rollout starts, the Deployment keeps the old pods running and tries to bring up a new pod before removing an old one. On a cluster with only 2 eligible nodes there is no room to schedule that extra pod, so the rollout gets stuck.

  • After: pods are updated one at a time. The controller deletes one old pod first, then creates its replacement, so the rollout can proceed safely even when only 2 nodes are available.
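
For reference, a minimal sketch of how the new strategy block sits in the manager Deployment spec; the surrounding fields and indentation are illustrative, not a verbatim copy of config/manager/manager.yaml or the CSV bundle:

```yaml
# Illustrative excerpt only: the replica count comes from the PR context
# (2 replicas), the strategy values are the ones added by this PR.
spec:
  replicas: 2
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 0        # never schedule an extra pod beyond the desired count
      maxUnavailable: 1  # one old pod may be deleted first, freeing its node for the new pod
```

With maxSurge: 0 the controller never needs a third schedulable slot, and maxUnavailable: 1 lets it free a node before creating the replacement, which is what avoids the deadlock on a 2-node placement.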

Which issue(s) this PR fixes

RHWA-366

Summary by CodeRabbit

  • Chores
    • Updated the deployment strategy for the controller-manager components to RollingUpdate with explicit limits on pod surge and allowed unavailability, giving more controlled and stable updates while maintaining service availability throughout the upgrade process.

…vailable=1) to mitigate rollout deadlocks during automatic certificate rotation by OLM

Signed-off-by: Katsuya Kawakami <kkawakami-je@nec.com>
@openshift-ci
Contributor

openshift-ci bot commented Nov 18, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: kattz-kawa
Once this PR has been reviewed and has the lgtm label, please assign razo7 for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci
Contributor

openshift-ci bot commented Nov 18, 2025

Hi @kattz-kawa. Thanks for your PR.

I'm waiting for a github.com member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@coderabbitai

coderabbitai bot commented Nov 18, 2025

📝 Walkthrough

Walkthrough

Two YAML manifest files are updated to change the controller-manager deployment strategy from the default to an explicit RollingUpdate with maxSurge: 0 and maxUnavailable: 1, controlling pod surge and availability during rollouts.

Changes

Deployment Strategy Configuration
  • Files: bundle/manifests/self-node-remediation.clusterserviceversion.yaml, config/manager/manager.yaml
  • Summary: Adds a deployment strategy block with type RollingUpdate, maxSurge: 0, and maxUnavailable: 1 to control rollout behavior during updates.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

  • Simple, repetitive configuration changes applied consistently across two files
  • No behavioral logic, error handling, or conditional changes
  • Straightforward verification of strategy parameters

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
  • Description Check: ✅ Passed (check skipped because CodeRabbit's high-level summary is enabled)
  • Title Check: ✅ Passed (the title 'Change Deployment UpdateStrategy' clearly and concisely summarizes the main change of modifying the Deployment update strategy configuration)
  • Docstring Coverage: ✅ Passed (no functions found in the changed files, so docstring coverage was not evaluated)

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d80706e and a7f0d7a.

📒 Files selected for processing (2)
  • bundle/manifests/self-node-remediation.clusterserviceversion.yaml (1 hunks)
  • config/manager/manager.yaml (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build
🔇 Additional comments (2)
config/manager/manager.yaml (1)

20-24: LGTM! Strategy configuration correctly addresses the 2-node cluster rollout deadlock.

The RollingUpdate strategy with maxSurge: 0 and maxUnavailable: 1 ensures that only one pod is replaced at a time, allowing new pods to be scheduled despite the topology constraints (lines 30–36) that restrict manager pods to specific nodes. This aligns with the operator's requirement to prevent rollout deadlocks during node recovery or certificate rotation.

Please confirm that both this file (config/manager/manager.yaml) and the OLM bundle manifest are kept in sync by a build/generation step, so future updates to one are automatically reflected in the other.

bundle/manifests/self-node-remediation.clusterserviceversion.yaml (1)

276-280: LGTM! Strategy configuration matches the intent defined in the source deployment manifest.

The nested rollingUpdate object and RollingUpdate type are correctly configured. The field ordering differs slightly from config/manager/manager.yaml (fields reversed), but this is inconsequential for YAML/Kubernetes semantics.

Verify that a build or generation tool automatically updates this bundle manifest when config/manager/manager.yaml changes, to maintain consistency and prevent manual sync errors in future updates.
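
For orientation, a rough sketch of where the same block sits inside the CSV's install spec; the deployment name and surrounding fields are assumed for illustration and are not copied from the file:

```yaml
# Hypothetical excerpt of the ClusterServiceVersion: only the path down to the
# strategy block is shown; the deployment name below is an assumption.
spec:
  install:
    strategy: deployment
    spec:
      deployments:
        - name: self-node-remediation-controller-manager   # assumed name
          spec:
            strategy:
              rollingUpdate:      # field order is reversed relative to manager.yaml,
                maxSurge: 0       # which is harmless for YAML/Kubernetes semantics
                maxUnavailable: 1
              type: RollingUpdate
```

If the bundle is regenerated from config/manager/manager.yaml by the project's tooling, the two copies should stay in sync automatically.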



Comment @coderabbitai help to get the list of available commands and usage tips.
