Contributor

@pooknull pooknull commented Sep 29, 2025

https://perconadev.atlassian.net/browse/K8SPS-498

DESCRIPTION

Problem:
While the cluster is initializing, the cluster's resourceVersion is constantly updated.

Cause:
This happens because status conditions are continuously updated with an incorrect lastTransitionTime. The reconcileCRStatus method is called from multiple places in the reconcile loop, and each call updates the status conditions with an outdated lastTransitionTime taken from the cluster object passed to the method.

Solution:
Always fetch the latest cluster in the reconcileCRStatus method.

Note for developers: I removed LastTransitionTime from the condition objects because it is already managed by the meta.SetStatusCondition function.

CHECKLIST

Jira

  • Is the Jira ticket created and referenced properly?
  • Does the Jira ticket have the proper statuses for documentation (Needs Doc) and QA (Needs QA)?
  • Does the Jira ticket link to the proper milestone (Fix Version field)?

Tests

  • Is an E2E test/test case added for the new feature/change?
  • Are unit tests added where appropriate?

Config/Logging/Testability

  • Are all needed new/changed options added to default YAML files?
  • Are all needed new/changed options added to the Helm Chart?
  • Did we add proper logging messages for operator actions?
  • Did we ensure compatibility with the previous version or cluster upgrade process?
  • Does the change support oldest and newest supported PS version?
  • Does the change support oldest and newest supported Kubernetes version?

@pull-request-size pull-request-size bot added the size/S 10-29 lines label Sep 29, 2025
Comment on lines +37 to +39
if err := r.Get(ctx, client.ObjectKeyFromObject(cr), cr); err != nil {
return errors.Wrap(err, "get cluster")
}
Contributor

This will overwrite anything we assign to status fields in the Reconcile function, right?

Contributor Author

Yes, but it shouldn't cause any issues since we overwrite each field of the status in the reconcileCRStatus method

Contributor

I am not sure that's okay; the users test failure might be caused by this.
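
The trade-off discussed in this thread can be sketched with a small standard-library-only model. Everything here is hypothetical: the Cluster and Status types, the fakeStore standing in for the controller-runtime client, and the reconcileStatus helper are illustrative assumptions, not the operator's code. The sketch shows that a refetch discards in-memory status edits, and that this is harmless only if every status field is recomputed afterwards.

```go
package main

import "fmt"

// Status and Cluster are hypothetical, pared-down stand-ins for the CR types.
type Status struct {
	State string
	Ready int
}

type Cluster struct {
	Name            string
	ResourceVersion int
	Status          Status
}

// fakeStore mimics the one property of a client Get call that matters here:
// it overwrites the passed-in object with the stored copy.
type fakeStore struct {
	objects map[string]Cluster
}

func (s *fakeStore) Get(name string, into *Cluster) error {
	stored, ok := s.objects[name]
	if !ok {
		return fmt.Errorf("cluster %q not found", name)
	}
	*into = stored
	return nil
}

// reconcileStatus refetches the cluster, then recomputes every status field
// from observed state, so nothing depends on what the caller set in memory.
func reconcileStatus(s *fakeStore, cr *Cluster, observedReady, size int) error {
	if err := s.Get(cr.Name, cr); err != nil {
		return fmt.Errorf("get cluster: %w", err)
	}
	cr.Status.Ready = observedReady
	if observedReady == size {
		cr.Status.State = "ready"
	} else {
		cr.Status.State = "initializing"
	}
	return nil
}

func main() {
	store := &fakeStore{objects: map[string]Cluster{
		"cluster1": {Name: "cluster1", ResourceVersion: 7, Status: Status{State: "initializing"}},
	}}
	// A stale copy carrying an in-memory status edit made earlier in the loop.
	cr := &Cluster{Name: "cluster1", ResourceVersion: 5, Status: Status{State: "error"}}

	if err := reconcileStatus(store, cr, 3, 3); err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(cr.ResourceVersion, cr.Status.State, cr.Status.Ready)
}
```

If any status field were set elsewhere and not recomputed inside the helper, the refetch would silently drop it, which is the risk the reviewer raises.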

@hors hors added this to the v1.0.0 milestone Sep 30, 2025
@pull-request-size pull-request-size bot added size/M 30-99 lines and removed size/S 10-29 lines labels Sep 30, 2025
@pooknull pooknull marked this pull request as ready for review September 30, 2025 12:45
gkech previously approved these changes Sep 30, 2025
Contributor

@gkech gkech left a comment

Seems fine. I also tested it locally and I don't see the init spam in k get ps -w.

@Copilot Copilot AI review requested due to automatic review settings October 3, 2025 15:06
@Copilot Copilot AI left a comment

Pull Request Overview

This PR addresses an issue where cluster resourceVersion was constantly being updated during initialization due to incorrect LastTransitionTime handling in status conditions. The fix ensures that the latest cluster state is fetched before updating status conditions and removes manual LastTransitionTime management since it's handled automatically by meta.SetStatusCondition.

Key changes:

  • Added cluster fetching in reconcileCRStatus to ensure latest state is used
  • Removed manual LastTransitionTime assignments from condition objects
  • Added status write operation after MySQL version reconciliation

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| pkg/controller/ps/status.go | Fetches latest cluster state and removes manual LastTransitionTime |
| pkg/controller/ps/controller.go | Removes manual LastTransitionTime from bootstrap status |
| pkg/controller/psrestore/controller.go | Removes manual LastTransitionTime and reorganizes imports |
| pkg/controller/ps/version.go | Adds status write operation after version update |
| pkg/controller/ps/status_test.go | Adds test case for empty cluster scenario |

"sync"
"time"

"github.com/pkg/errors"

Copilot AI Oct 3, 2025

[nitpick] The import reorganization moves a commonly used package to the top, but standard Go practice is to group standard library imports first, then third-party imports, then local imports. Consider keeping the original import grouping structure.

@JNKPercona
Collaborator

| Test Name | Result | Time |
| --- | --- | --- |
| version-service-8-4 | passed | 00:12:36 |
| async-ignore-annotations-8-4 | passed | 00:06:23 |
| async-global-metadata-8-4 | failure | 00:07:35 |
| auto-config-8-4 | passed | 00:24:21 |
| config-8-4 | passed | 00:16:29 |
| config-router-8-0 | passed | 00:07:17 |
| config-router-8-4 | passed | 00:07:27 |
| demand-backup-minio-8-0 | passed | 00:19:48 |
| demand-backup-minio-8-4 | passed | 00:20:06 |
| demand-backup-cloud-8-4 | passed | 00:22:32 |
| async-data-at-rest-encryption-8-0 | passed | 00:17:36 |
| async-data-at-rest-encryption-8-4 | passed | 00:14:48 |
| gr-global-metadata-8-4 | passed | 00:18:17 |
| gr-data-at-rest-encryption-8-0 | passed | 00:15:00 |
| gr-data-at-rest-encryption-8-4 | passed | 00:15:26 |
| gr-demand-backup-minio-8-4 | passed | 00:12:46 |
| gr-demand-backup-cloud-8-4 | passed | 00:22:00 |
| gr-demand-backup-haproxy-8-4 | passed | 00:10:40 |
| gr-finalizer-8-4 | passed | 00:05:33 |
| gr-haproxy-8-0 | passed | 00:04:36 |
| gr-haproxy-8-4 | passed | 00:04:17 |
| gr-ignore-annotations-8-4 | passed | 00:05:10 |
| gr-init-deploy-8-0 | passed | 00:09:36 |
| gr-init-deploy-8-4 | passed | 00:08:40 |
| gr-one-pod-8-4 | passed | 00:06:16 |
| gr-recreate-8-4 | passed | 00:17:17 |
| gr-scaling-8-4 | passed | 00:08:34 |
| gr-scheduled-backup-8-4 | passed | 00:18:18 |
| gr-security-context-8-4 | passed | 00:09:33 |
| gr-self-healing-8-4 | passed | 00:22:27 |
| gr-tls-cert-manager-8-4 | passed | 00:09:10 |
| gr-users-8-4 | passed | 00:05:36 |
| haproxy-8-0 | passed | 00:08:53 |
| haproxy-8-4 | passed | 00:07:52 |
| init-deploy-8-0 | passed | 00:05:47 |
| init-deploy-8-4 | passed | 00:05:40 |
| limits-8-4 | passed | 00:06:32 |
| monitoring-8-4 | passed | 00:13:44 |
| one-pod-8-0 | passed | 00:07:10 |
| one-pod-8-4 | passed | 00:06:09 |
| operator-self-healing-8-4 | passed | 00:12:01 |
| pvc-resize-8-4 | passed | 00:08:16 |
| recreate-8-4 | passed | 00:12:32 |
| scaling-8-4 | passed | 00:11:10 |
| scheduled-backup-8-0 | passed | 00:17:28 |
| scheduled-backup-8-4 | passed | 00:16:34 |
| service-per-pod-8-4 | passed | 00:06:30 |
| sidecars-8-4 | passed | 00:04:37 |
| smart-update-8-4 | passed | 00:09:32 |
| storage-8-4 | passed | 00:04:03 |
| telemetry-8-4 | passed | 00:06:25 |
| tls-cert-manager-8-4 | passed | 00:10:26 |
| users-8-0 | passed | 00:08:21 |
| users-8-4 | passed | 00:07:47 |

| Summary | Value |
| --- | --- |
| Tests Run | 54/54 |
| Job Duration | 01:56:20 |
| Total Test Time | 10:04:01 |

commit: d2aa69b
image: perconalab/percona-server-mysql-operator:PR-1102-d2aa69ba
