Make changes to updater to add the unboosting logic #8618
base: experimental-cpu-boost
[APPROVALNOTIFIER] This PR is NOT APPROVED.
This pull-request has been approved by: kamarabbas99
Hi @kamarabbas99. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test.
/cc laoj2
/ok-to-test
6286cc8 to 79460ad
Overall looks good! Just a few minor nits from me. I’ll do one more pass on this as well :)
}
err := inPlaceLimiter.InPlaceUpdate(pod, vpa, u.eventRecorder)
if err != nil {
	klog.V(0).InfoS("Unboosting failed", "error", err, "pod", klog.KObj(pod))
What happens if the unboosting operation fails every time? Don't we want to fall back to eviction at some point?
Don't we risk having an eviction loop in that case?
We do. I need to check whether we discussed this in the AEP, but without an eviction fallback there could be a case where the boosted workload stays boosted indefinitely. Maybe it's OK to leave it that way, but we need to document it properly.
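One way to frame the trade-off raised in this thread is a bounded retry: count consecutive in-place unboost failures per pod and signal a fallback only once a threshold is reached. The following is a minimal sketch under stated assumptions, not code from this PR. The `unboostTracker` type, the threshold, and the pod key format are all hypothetical, and what the fallback does (eviction or something else) is left to the caller.

```go
package main

import (
	"errors"
	"fmt"
)

// errResizeRejected is a placeholder error used only by this example.
var errResizeRejected = errors.New("in-place resize rejected")

// unboostTracker counts consecutive failed in-place unboost attempts per pod.
// Hypothetical helper for illustration; it is not part of the PR.
type unboostTracker struct {
	maxAttempts int
	failures    map[string]int
}

func newUnboostTracker(maxAttempts int) *unboostTracker {
	return &unboostTracker{maxAttempts: maxAttempts, failures: make(map[string]int)}
}

// recordResult updates the failure count for podKey and reports whether the
// caller should stop retrying in place and use its fallback path instead.
func (t *unboostTracker) recordResult(podKey string, err error) bool {
	if err == nil {
		// A successful unboost clears any earlier failures.
		delete(t.failures, podKey)
		return false
	}
	t.failures[podKey]++
	if t.failures[podKey] >= t.maxAttempts {
		// Reset the counter so the fallback is not signalled on every loop.
		delete(t.failures, podKey)
		return true
	}
	return false
}

func main() {
	tracker := newUnboostTracker(3)
	for attempt := 1; attempt <= 3; attempt++ {
		fallback := tracker.recordResult("default/web-0", errResizeRejected)
		fmt.Printf("attempt %d: fallback=%v\n", attempt, fallback)
	}
}
```

Resetting the counter after the threshold is a deliberate choice here: it bounds how often the fallback fires for the same pod, which is relevant to the eviction-loop concern above.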
79460ad to 302be43
So far this looks good. I have two small comments.
func PodReady(pod *core.Pod) bool {
	for _, cond := range pod.Status.Conditions {
		if cond.Type == core.PodReady && cond.Status == core.ConditionTrue {
			return true
		}
	}
	return false
}
Nit: could this be named IsPodReady? k/k has a few IsPodReady functions, and the small bit of consistency may be nice.
Doesn't it make sense to use the upstream function then?
It doesn't seem to be public. I couldn't get it to work, but maybe I was doing something wrong.
Erm, my bad! It's available to use. I agree, let's use it.
if !PodReady(pod) {
	return false
}
This function checks both readiness and the startup time duration. Does it make sense to remove the readiness check? I like it as a guard, but it seems surprising.
Or maybe rename the function to make it clear that it also requires the pod to be ready?
cc @omerap12
If we remove this check, we have to rename this function for future use.
We already check whether the pod is ready each time we use this function:
- https://github.com/kubernetes/autoscaler/pull/8618/files/302be4374a6e9f4e0909df7bb985d6a10c2ae809#diff-861cbdb52ed6e832a7821c3f6fe4d18123631b360f5042015cb5e8cadfa625b4R43
- https://github.com/kubernetes/autoscaler/pull/8618/files/302be4374a6e9f4e0909df7bb985d6a10c2ae809#diff-b864a8cf8e133025f0a68284881b2abfda1a098347250e322b8ed2b61b7590cfR252
But I worry about other folks who may want to use this function and won't check whether the pod is ready.
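To make the rename option concrete, here is a minimal sketch of a helper whose name states both conditions, so a caller who skips a separate readiness check still gets the guard. It is not the PR's code. The `pod` and `podCondition` types are simplified stand-ins for the core/v1 types, the function name is hypothetical, and measuring the boost duration from the Ready condition's last transition time is an assumption made for this example.

```go
package main

import (
	"fmt"
	"time"
)

// Example values used by main; purely illustrative.
var (
	exampleReadyAt = time.Date(2025, 1, 1, 12, 0, 0, 0, time.UTC)
	exampleBoost   = time.Minute
)

// podCondition and pod are simplified stand-ins for the core/v1 types.
type podCondition struct {
	Type               string
	Status             string
	LastTransitionTime time.Time
}

type pod struct {
	Conditions []podCondition
}

// readyCondition returns the pod's Ready condition, or nil if it has none.
func readyCondition(p *pod) *podCondition {
	for i := range p.Conditions {
		if p.Conditions[i].Type == "Ready" {
			return &p.Conditions[i]
		}
	}
	return nil
}

// isReadyAndBoostDurationElapsed reports whether the pod is Ready and has been
// Ready for at least boostDuration. The name spells out both checks so the
// readiness guard is not a surprise. (Hypothetical name and semantics.)
func isReadyAndBoostDurationElapsed(p *pod, boostDuration time.Duration, now time.Time) bool {
	cond := readyCondition(p)
	if cond == nil || cond.Status != "True" {
		return false
	}
	return now.Sub(cond.LastTransitionTime) >= boostDuration
}

func main() {
	p := &pod{Conditions: []podCondition{
		{Type: "Ready", Status: "True", LastTransitionTime: exampleReadyAt},
	}}
	for _, elapsed := range []time.Duration{30 * time.Second, 2 * time.Minute} {
		now := exampleReadyAt.Add(elapsed)
		fmt.Printf("ready for %v: unboost=%v\n", elapsed, isReadyAndBoostDurationElapsed(p, exampleBoost, now))
	}
}
```

Passing `now` as a parameter keeps the helper deterministic in tests; a real caller would typically pass `time.Now()`.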
What type of PR is this?
/kind feature
What this PR does / why we need it:
Which issue(s) this PR fixes:
Introduces changes in the updater component to unboost CPU if the boost was applied by the admission-controller.
The original CPU request is also stored in an annotation, because we need to revert to the original (not the recommended) value if the update mode is set to Off.
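As a rough illustration of the annotation idea described above, the sketch below records per-container original CPU requests as JSON in a pod annotation and reads them back when reverting. It makes several assumptions: the annotation key `example.com/original-cpu-requests`, the JSON encoding, and both function names are hypothetical and are not taken from this PR.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// originalCPUAnnotation is a hypothetical key; the PR's actual key is not shown here.
const originalCPUAnnotation = "example.com/original-cpu-requests"

// recordOriginalCPU stores the per-container CPU requests (container name to
// quantity string) in the annotations map. The map must be non-nil. An existing
// value is kept, so a later boost cannot overwrite the true original.
func recordOriginalCPU(annotations, requests map[string]string) error {
	if _, exists := annotations[originalCPUAnnotation]; exists {
		return nil
	}
	raw, err := json.Marshal(requests)
	if err != nil {
		return err
	}
	annotations[originalCPUAnnotation] = string(raw)
	return nil
}

// originalCPU returns the recorded requests, and false if none were recorded.
func originalCPU(annotations map[string]string) (map[string]string, bool, error) {
	raw, ok := annotations[originalCPUAnnotation]
	if !ok {
		return nil, false, nil
	}
	var requests map[string]string
	if err := json.Unmarshal([]byte(raw), &requests); err != nil {
		return nil, true, fmt.Errorf("parsing %s: %w", originalCPUAnnotation, err)
	}
	return requests, true, nil
}

func main() {
	annotations := map[string]string{}
	if err := recordOriginalCPU(annotations, map[string]string{"app": "250m"}); err != nil {
		panic(err)
	}
	requests, found, err := originalCPU(annotations)
	fmt.Println(requests, found, err)
}
```

With the update mode set to Off, the updater would read the recorded value and revert to it rather than to the recommendation; how that revert is applied is outside this sketch.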
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: