kubernetes.core.k8s_drain: drain can get stuck because pods are evicted in order #711
Comments
What is the error message you get when trying to drain the node?
With the ansible module I get the
The output from
My understanding from these messages is that
It's likely that kubectl eventually succeeds because it just keeps retrying the eviction. The k8s_drain module does not retry; it makes a single eviction request for each pod on the node. I don't have any experience with longhorn, but longhorn/longhorn#5910 seems to point to new settings that will handle evicting the instance manager. You would need to use retries to get the module to keep trying until longhorn has finished with the eviction. One way or another, the pod disruption budget will have to be met before a node can be drained.
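A sketch of the retry approach suggested here, using the module's documented `delete_options` and standard Ansible `retries`/`until`; the node name `worker-1` and the retry counts are hypothetical placeholders:

```yaml
- name: Drain node, retrying while the PodDisruptionBudget still blocks eviction
  kubernetes.core.k8s_drain:
    state: drain
    name: worker-1            # hypothetical node name
    delete_options:
      ignore_daemonsets: true
      delete_emptydir_data: true
  register: drain_result
  retries: 10                 # keep trying, as kubectl does internally
  delay: 30
  until: drain_result is succeeded
```

Each retry re-issues eviction requests for the pods still on the node, so once longhorn has finished rebuilding replicas and relaxes the PDB, a later attempt should succeed.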
So I found this thread after running into the same issue but finding that I can successfully evict pods with
The pods do eventually evict after several retries but this is all handled by
I suppose an additional alternative would be to just use
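The kubectl fallback discussed in this thread would look roughly like the following; the node name is hypothetical, and the pod selector is the longhorn workaround mentioned in the issue:

```shell
# kubectl retries blocked evictions on its own until the timeout.
# The selector skips the longhorn CSI pods (the old workaround).
kubectl drain worker-1 \
  --ignore-daemonsets \
  --delete-emptydir-data \
  --pod-selector='app!=csi-attacher,app!=csi-provisioner' \
  --timeout=300s
```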
SUMMARY
Draining a node can get stuck because pods are evicted in order rather than asynchronously; kubectl drains pods asynchronously. This is a problem if pods have dependencies on each other. E.g., as in #474, the longhorn instance manager can only be evicted after pods using a longhorn volume have been evicted (at least as far as I understand it). kubectl retries until the pod can be evicted; this seems to be missing in the ansible module.
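The kubectl-style behavior described above can be sketched as per-pod retries plus concurrent eviction, so one blocked pod does not stall the rest. This is a minimal illustration, not the module's actual code; `evict` is a hypothetical callable standing in for the Eviction API request:

```python
import concurrent.futures
import time

def evict_with_retry(evict, pod, retries=5, delay=0.01):
    """Retry a single pod's eviction until it succeeds,
    mirroring kubectl's per-pod retry loop."""
    for _ in range(retries):
        try:
            evict(pod)
            return True
        except RuntimeError:  # e.g. HTTP 429 from a blocking PodDisruptionBudget
            time.sleep(delay)
    return False

def drain(evict, pods):
    """Evict all pods concurrently instead of in one ordered pass,
    so a pod blocked by a PDB does not hold up the others."""
    with concurrent.futures.ThreadPoolExecutor() as pool:
        results = list(pool.map(lambda p: evict_with_retry(evict, p), pods))
    return dict(zip(pods, results))
```

With an ordered single-pass loop, an instance-manager pod listed before its dependent workload pods would block the whole drain; with the concurrent retry loop above, the workload pods evict first and the blocked pod succeeds on a later attempt.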
Note
longhorn fixed the need for `--pod-selector='app!=csi-attacher,app!=csi-provisioner'` for `kubectl drain ...`, but the selector can still be used as a workaround in the ansible module. I think it still makes sense to keep the functionality of this module close to `kubectl drain ...`.
ISSUE TYPE
COMPONENT NAME
kubernetes.core.k8s_drain
ANSIBLE VERSION
COLLECTION VERSION
CONFIGURATION
OS / ENVIRONMENT
k3s on arch
STEPS TO REPRODUCE
Sorry, I don't have an easy way to reproduce this. You need a pod that cannot be evicted.
EXPECTED RESULTS
As many pods as possible should be evicted.
ACTUAL RESULTS
The eviction process gets stuck on pods that cannot be evicted at that moment.