
fix: update status with up-to-date agent image #24

Merged
4 commits merged into harvester:main on Feb 26, 2024

Conversation

@starbops (Member) commented Feb 21, 2024

IMPORTANT: Please do not create a Pull Request without creating an issue first.

Problem:

The controller does not upgrade agent Pods after the controller is upgraded to a newer version.

Solution:

Make sure the ippool-register control loop always updates the image field of AgentPodRef for each IPPool object being reconciled, so that the ippool-agent-monitor control loop can remove obsolete agent Pods based on that information.
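In other words, the register loop stamps the expected agent image into the IPPool status, and the monitor loop compares the running Pod against that record. Below is a minimal, self-contained sketch of that interaction using simplified stand-in types; the real handlers use the project's networkv1 API types and Pod client, and the names/images in main are purely hypothetical.

package main

import "fmt"

// Simplified stand-ins for the real API types; the actual controller uses
// networkv1.IPPool / networkv1.IPPoolStatus and corev1.Pod.
type AgentPodRef struct {
	Namespace string
	Name      string
	UID       string
	Image     string
}

type IPPoolStatus struct {
	AgentPodRef *AgentPodRef
}

type AgentPod struct {
	Namespace string
	Name      string
	UID       string
	Image     string
}

// register mirrors the change in DeployAgent: every time an IPPool is
// reconciled, record the controller's current agent image in AgentPodRef.
func register(status *IPPoolStatus, currentAgentImage string) {
	if status.AgentPodRef != nil {
		status.AgentPodRef.Image = currentAgentImage
	}
}

// monitor mirrors MonitorAgent: if the running Pod's UID or image no longer
// matches AgentPodRef, the Pod is obsolete and gets deleted so a new one is
// created with the up-to-date image.
func monitor(status IPPoolStatus, pod AgentPod, deletePod func(namespace, name string) error) error {
	ref := status.AgentPodRef
	if ref == nil {
		return nil
	}
	if pod.UID != ref.UID || pod.Image != ref.Image {
		return deletePod(pod.Namespace, pod.Name)
	}
	return nil
}

func main() {
	// Hypothetical names and images, purely for illustration.
	status := IPPoolStatus{AgentPodRef: &AgentPodRef{Namespace: "default", Name: "test-net-agent", UID: "1234", Image: "example/agent:old"}}
	register(&status, "example/agent:new")

	stale := AgentPod{Namespace: "default", Name: "test-net-agent", UID: "1234", Image: "example/agent:old"}
	if err := monitor(status, stale, func(namespace, name string) error {
		fmt.Printf("deleting obsolete agent pod %s/%s\n", namespace, name)
		return nil
	}); err != nil {
		fmt.Println(err)
	}
}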

Related Issue:

harvester/harvester#5188

Test plan:

For developers who want to verify the fix:

  1. Check out the branch and build the artifacts
    export REPO=<your-handle>
    export PUSH=true
    make
    
  2. Prepare a Harvester cluster (a single node is fine) running v1.3.0-rc3
  3. Install the harvester-vm-dhcp-controller experimental add-on with the default value content (should be an empty string) on the cluster
    kubectl apply -f https://raw.githubusercontent.com/harvester/experimental-addons/main/harvester-vm-dhcp-controller/harvester-vm-dhcp-controller.yaml
    
  4. Enable the add-on
  5. Make sure you have a valid cluster network, network config, and VM network defined already
  6. Create an IPPool object for the VM network, for example:
    $ cat <<EOF | kubectl apply -f -
    apiVersion: network.harvesterhci.io/v1alpha1
    kind: IPPool
    metadata:
      name: test-net
      namespace: default
    spec:
      ipv4Config:
        serverIP: 192.168.0.2
        cidr: 192.168.0.0/24
        pool:
          start: 192.168.0.101
          end: 192.168.0.200
      networkName: default/test-net
    EOF
    
  7. Wait for the IPPool object to become ready (AgentReady==True)
  8. Edit the harvester-vm-dhcp-controller Addon object to upgrade to the version you built in step 1, for example:
      ...
      valueContent: |
        image:
          repository: starbops/harvester-vm-dhcp-controller
          tag: fix-5188-head
        agent:
          image:
            repository: starbops/harvester-vm-dhcp-agent
            tag: fix-5188-head
        webhook:
          image:
            repository: starbops/harvester-vm-dhcp-webhook
            tag: fix-5188-head
    
  9. Observe that the agent Pod for the IPPool object is upgraded after the controller is upgraded (by checking its image repository and tag)

For QAs, there is no need to custom-build the artifacts; just verify that the fix works using the main-head tag (since the PR will have been merged by then). QAs can start from step 2.

Signed-off-by: Zespre Chang <zespre.chang@suse.com>
@@ -281,6 +281,8 @@ func (h *Handler) DeployAgent(ipPool *networkv1.IPPool, status networkv1.IPPoolS
}

if ipPool.Status.AgentPodRef != nil {
	status.AgentPodRef.Image = h.agentImage.String()
@w13915984028 (Member) commented Feb 22, 2024

Do we allow the agent image to be kept (not replaced), e.g. via some kind of annotation, for easier debugging?

It seems not; please consider adding that. Also, it would be better to add a check of the deletion timestamp before calling h.podClient.Delete, as the previous deletion may not have finished yet.

func (h *Handler) MonitorAgent(ipPool *networkv1.IPPool, status networkv1.IPPoolStatus) (networkv1.IPPoolStatus, error) {
...

	if agentPod.GetUID() != ipPool.Status.AgentPodRef.UID || agentPod.Spec.Containers[0].Image != ipPool.Status.AgentPodRef.Image {
		return status, h.podClient.Delete(agentPod.Namespace, agentPod.Name, &metav1.DeleteOptions{})
	}
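A sketch of the suggested guard, placed just before the comparison above; it relies only on the standard DeletionTimestamp field from the Pod's metadata and is not part of this PR.

	// Suggested guard (sketch): if the Pod is already terminating from an
	// earlier reconcile, skip issuing another delete and try again later.
	if agentPod.DeletionTimestamp != nil {
		return status, nil
	}

	if agentPod.GetUID() != ipPool.Status.AgentPodRef.UID || agentPod.Spec.Containers[0].Image != ipPool.Status.AgentPodRef.Image {
		return status, h.podClient.Delete(agentPod.Namespace, agentPod.Name, &metav1.DeleteOptions{})
	}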

@starbops (Member, Author) replied

Currently there is none. We could support a per-IPPool annotation to individually hold back the agent upgrade.

Thanks for the input!

@Yu-Jack (Contributor) left a comment

Tested it, and it works well. I'm curious about one thing: do we need to add an ownerReference to the agent Pod?

One nit: from the name MonitorAgent, I didn't expect it to delete Pods; I thought it just collected something like logs or metrics. How about adding a comment along the lines of: MonitorAgent checks whether the agent Pod's image matches the IPPool object, and deletes the agent Pod if it doesn't.
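For example, a doc comment along those lines could look like this (the wording is illustrative, not what was ultimately committed):

// MonitorAgent checks whether the running agent Pod still matches the
// AgentPodRef recorded in the IPPool status (same UID and same image). If it
// does not, the obsolete agent Pod is deleted so that a new one running the
// expected image can replace it.
func (h *Handler) MonitorAgent(ipPool *networkv1.IPPool, status networkv1.IPPoolStatus) (networkv1.IPPoolStatus, error) {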

@starbops (Member, Author) commented Feb 23, 2024

@Yu-Jack Thanks for your time. IIRC, the built-in OwnerReference does not support referencing objects in other namespaces. So we cannot simply add the reference and let Kubernetes handle the cascade deletion.

ref: https://kubernetes.io/docs/concepts/overview/working-with-objects/owners-dependents/#owner-references-in-object-specifications
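For context, setting such a reference would look roughly like the sketch below, using the standard metav1.NewControllerRef helper; it only works when the Pod and its owner share a namespace (or the owner is cluster-scoped), which is exactly the limitation discussed here.

	// Sketch only (not applicable here, since the agent Pod and the IPPool may
	// live in different namespaces). Requires the k8s.io/apimachinery imports
	// metav1 "k8s.io/apimachinery/pkg/apis/meta/v1" and
	// "k8s.io/apimachinery/pkg/runtime/schema".
	ownerRef := metav1.NewControllerRef(ipPool, schema.GroupVersionKind{
		Group:   "network.harvesterhci.io",
		Version: "v1alpha1",
		Kind:    "IPPool",
	})
	agentPod.OwnerReferences = append(agentPod.OwnerReferences, *ownerRef)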

For MonitorAgent, I agree that the name is inaccurate and misleading. If you have ideas about it, I'm happy to change it. Anyway, I will add some comments for that control loop.

@Yu-Jack (Contributor) commented Feb 23, 2024

I have an idea: we could merge the MonitorAgent logic into DeployAgent and let DeployAgent handle all the logic of creating and deleting the Pod. What do you think? It's not really urgent, though; we could open another issue to refactor it if it turns out to be doable.

@starbops (Member, Author) commented

I did consider the idea you proposed, but later broke the logic into two control loops, i.e., DeployAgent and MonitorAgent, because I didn't want to do synchronous retries in one loop (it would block the thread). MonitorAgent's duty is simple: loop until the agent Pod is ready. It has a dedicated condition, AgentReady, that represents the status of that control loop. DeployAgent, on the other hand, as its name suggests, is supposed to "create" agent Pods but not wait for them to become ready. The Registered condition implies the presence of the associated agent Pod, not its readiness.

This is more about design concepts for Kubernetes controllers: I'm trying to avoid a state-machine-like design and to approach the idea of orthogonality, if that makes sense.
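As a rough illustration of that split, the two loops could report progress through separate conditions rather than one state machine; the snippet below uses generic metav1 conditions and hypothetical helper names, and the project may use its own condition helpers instead.

package conditions

import (
	"k8s.io/apimachinery/pkg/api/meta"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// markRegistered is what DeployAgent conceptually reports: the agent Pod has
// been created and recorded, regardless of whether it is ready yet.
func markRegistered(conds *[]metav1.Condition) {
	meta.SetStatusCondition(conds, metav1.Condition{
		Type:   "Registered",
		Status: metav1.ConditionTrue,
		Reason: "AgentPodCreated",
	})
}

// markAgentReady is what MonitorAgent conceptually reports: the agent Pod is
// running the expected image and passing its readiness checks.
func markAgentReady(conds *[]metav1.Condition) {
	meta.SetStatusCondition(conds, metav1.Condition{
		Type:   "AgentReady",
		Status: metav1.ConditionTrue,
		Reason: "AgentPodReady",
	})
}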

Reference: What the heck are Conditions in Kubernetes controllers? | maelvls dev blog

@Yu-Jack (Contributor) commented Feb 23, 2024

It makes more sense, thanks!

Signed-off-by: Zespre Chang <zespre.chang@suse.com>
Signed-off-by: Zespre Chang <zespre.chang@suse.com>
@w13915984028 (Member) left a comment

LGTM, thanks.

Signed-off-by: Zespre Chang <zespre.chang@suse.com>
@Yu-Jack (Contributor) left a comment

LGTM, thanks!

@starbops merged commit e401ece into harvester:main on Feb 26, 2024
5 checks passed