
Conversation

@britaniar britaniar (Collaborator) commented Dec 10, 2025

Description of your changes

I have:

  • Added a check to ensure that no new cluster starts updating and that any currently updating clusters finish within a stage before marking the stage and update run as stopped (see the sketch after this list).

  • Updated the stage condition status to stopping or stopped.

  • Updated the integration tests and added unit tests.

  • Ran make reviewable to ensure this PR is ready for review.
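For illustration only, here is a minimal sketch of what such a stop check could look like; the condition type strings ("Started", "Succeeded"), the clusterStatus stand-in type, and the countStillUpdating helper are all assumptions made for this example, not code from the PR:

package main

import (
	"k8s.io/apimachinery/pkg/api/meta"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// clusterStatus is a simplified stand-in for the per-cluster status kept in a stage.
type clusterStatus struct {
	Name       string
	Conditions []metav1.Condition
}

// countStillUpdating walks the clusters of the stage being stopped: clusters that
// never started are skipped (nothing new may start while stopping), while clusters
// that started but have not yet finished keep the stage in the stopping state.
func countStillUpdating(clusters []clusterStatus) int {
	stillUpdating := 0
	for i := range clusters {
		started := meta.FindStatusCondition(clusters[i].Conditions, "Started")
		finished := meta.FindStatusCondition(clusters[i].Conditions, "Succeeded")
		if started == nil || started.Status != metav1.ConditionTrue {
			continue // not started yet: do not start it while the run is stopping
		}
		if finished == nil {
			stillUpdating++ // already in flight: let it finish before marking stopped
		}
	}
	return stillUpdating
}

func main() {
	// Zero in-flight clusters means the stage can be marked stopped;
	// otherwise the stage (and the update run) stays in the stopping state.
	_ = countStillUpdating(nil)
}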

How has this code been tested

  • Integration Test
  • Unit Test

Special notes for your reviewer

While waiting for clusters to finish updating, the stage and the update run will have a Progressing condition with status Unknown and reason Stopping.
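For reference, a hedged sketch of what that condition could look like on the stage or update run status; the condition type and reason strings are assumptions based on the description above, not copied from the PR:

package main

import (
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func main() {
	// While in-flight clusters drain, Progressing is Unknown with reason Stopping.
	stopping := metav1.Condition{
		Type:               "Progressing",
		Status:             metav1.ConditionUnknown,
		Reason:             "Stopping",
		Message:            "waiting for currently updating clusters to finish before stopping",
		ObservedGeneration: 1,
	}
	fmt.Printf("%+v\n", stopping)
}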

@codecov codecov bot commented Dec 10, 2025

Codecov Report

❌ Patch coverage is 84.78261% with 21 lines in your changes missing coverage. Please review.

Files with missing lines                   Patch %   Lines
pkg/controllers/updaterun/stop.go          83.89%    13 Missing and 6 partials ⚠️
pkg/controllers/updaterun/controller.go    88.23%    1 Missing and 1 partial ⚠️


@britaniar britaniar marked this pull request as ready for review December 11, 2025 17:23
	return runtime.Result{}, nil
}

func (r *Reconciler) handleIncompleteUpdateRun(ctx context.Context, updateRun placementv1beta1.UpdateRunObj, waitTime time.Duration, err error, state placementv1beta1.State, runObjRef klog.ObjectRef) (runtime.Result, error) {
@Arvindthiru Arvindthiru (Collaborator) commented Dec 11, 2025

Let's rename it to handleIncompleteUpdateRun. I misunderstood the flow; handleIncompleteUpdateRun makes more sense.

if finishedClusterCount == 0 {
	markStageUpdatingStarted(updatingStageStatus, updateRun.GetGeneration())
}
markStageUpdatingStarted(updatingStageStatus, updateRun.GetGeneration())
Collaborator

Why are we always setting markStageUpdatingStarted now?

@britaniar britaniar (Collaborator, Author) commented Dec 12, 2025

If it was previously false, it will be updated to true (progressing); if it is already true, the call won't change anything. This makes sure the stage has been marked as progressing whenever we find any updating cluster.
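As a side note, a small sketch of why calling the marker unconditionally is harmless, assuming it ultimately delegates to apimachinery's meta.SetStatusCondition (an assumption about the helper, not something confirmed from this diff):

package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/api/meta"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func main() {
	var conds []metav1.Condition
	progressing := metav1.Condition{
		Type:               "Progressing",
		Status:             metav1.ConditionTrue,
		Reason:             "Started",
		ObservedGeneration: 2,
	}

	// First call adds the condition and stamps LastTransitionTime.
	meta.SetStatusCondition(&conds, progressing)
	first := meta.FindStatusCondition(conds, "Progressing").LastTransitionTime

	// Second call sees the same status, so LastTransitionTime is left untouched;
	// re-marking an already progressing stage is effectively a no-op.
	meta.SetStatusCondition(&conds, progressing)
	second := meta.FindStatusCondition(conds, "Progressing").LastTransitionTime

	fmt.Println("transition time unchanged:", first.Equal(&second))
}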

@britaniar britaniar (Collaborator, Author)

I talked about this with Wantong, and this is what she suggested. If we keep the previous check, then when we restart with clusters that have already succeeded, the finished cluster count is not 0, but we still need to mark the stage as started since it is resuming (progressing).

Collaborator

Do we set the Started condition to false for a stage?

@britaniar britaniar (Collaborator, Author)

While we are trying to stop ("stopping"), the Progressing condition is Unknown, and when it is done stopping ("stopped") the Progressing condition is False for both the stage and the update run. Only the stage that was in the middle of being updated when stopped gets these conditions.
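To make the final flip concrete, here is a hypothetical marker for the stopped state; the name, reason, and message are illustrative and assume the PR's markers sit on top of meta.SetStatusCondition:

package main

import (
	"k8s.io/apimachinery/pkg/api/meta"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// markProgressingStopped replaces the earlier Progressing=Unknown/Stopping condition
// once the last in-flight cluster has finished; per the comment above, only the stage
// that was mid-update (plus the update run itself) would carry these conditions.
func markProgressingStopped(conds *[]metav1.Condition, generation int64) {
	meta.SetStatusCondition(conds, metav1.Condition{
		Type:               "Progressing",
		Status:             metav1.ConditionFalse,
		Reason:             "Stopped",
		Message:            "all in-flight clusters finished; the update run is stopped",
		ObservedGeneration: generation,
	})
}

func main() {
	var stageConds []metav1.Condition
	markProgressingStopped(&stageConds, 3)
	_ = stageConds
}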

@Arvindthiru Arvindthiru (Collaborator) commented Dec 12, 2025

Let's rename markStageUpdatingStarted to markStageUpdatingProgressStarted. I got confused by the name and thought the stage has a Started condition, but in this case we set the Progressing condition to true with the Started reason.

	})
}

func checkIfErrorStagedUpdateAborted(err error, updateRun placementv1beta1.UpdateRunObj, updatingStageStatus *placementv1beta1.StageUpdatingStatus) {
Collaborator

nit: we should move this to a common util file since it's used across the execution and stop states.

for i := 0; i < len(updatingStageStatus.Clusters); i++ {
	clusterStatus := &updatingStageStatus.Clusters[i]
	clusterStartedCond := meta.FindStatusCondition(clusterStatus.Conditions, string(placementv1beta1.ClusterUpdatingConditionStarted))
	if clusterStartedCond == nil || condition.IsConditionStatusFalse(clusterStartedCond, updateRun.GetGeneration()) {
@Arvindthiru Arvindthiru (Collaborator) commented Dec 12, 2025

		klog.ErrorS(unexpectedErr, "The binding should be deleting before we mark a cluster deleting", "clusterStatus", curCluster, "updateRun", updateRunRef)
		return false, fmt.Errorf("%w: %s", errStagedUpdatedAborted, unexpectedErr.Error())
	}
	return false, nil
@Arvindthiru Arvindthiru (Collaborator) commented Dec 12, 2025

Why do we return here? At this point we only know that this particular binding is deleting; why not check the other toBeDeletedBindings as well?
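For illustration, a hedged sketch of what scanning every to-be-deleted binding before returning could look like; the binding stand-in type, the Deleting field, and the helper name are assumptions made for this example, not the repository's actual types:

package main

import "fmt"

// binding is a simplified stand-in for a to-be-deleted resource binding.
type binding struct {
	Name     string
	Deleting bool
}

// allBindingsDeleting checks every to-be-deleted binding instead of returning after
// the first one, so a single binding that is not yet deleting is still reported.
func allBindingsDeleting(toBeDeleted []binding) (bool, error) {
	for i := range toBeDeleted {
		if !toBeDeleted[i].Deleting {
			return false, fmt.Errorf("binding %s should be deleting before the cluster is marked deleting", toBeDeleted[i].Name)
		}
	}
	return true, nil
}

func main() {
	ok, err := allBindingsDeleting([]binding{{Name: "b1", Deleting: true}, {Name: "b2", Deleting: false}})
	fmt.Println(ok, err)
}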

