Add initial openstack magnum terraform config #5518
@julianpistorius, do you have any ideas about why this error might be showing up when creating the nodepools? 🤔 The cluster is created successfully, but the nodegroups fail with the error above.
Hmm... No idea. I'll have to dig a bit.
While I dig, do you mind trying again? We had a networking problem this morning, possibly around the same time you ran into this problem.
We are still seeing intermittent problems with networking. Please stand by.
Never mind. The networking problems have been solved, so you should be able to try again.
Thanks for looking into it @julianpistorius! I've tried again today and it fails with an error which I think indicates that communication with the management cluster failed. I see the error both when I try to re-apply the terraform and when I try to delete the cluster or nodegroups with
It looks like if I want to create only one nodepool via It might be some kind of race condition somewhere.
It looks like I can force the nodegroups to fire create requests in sequence with Update:
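A minimal sketch of one way to make Terraform create nodegroups one after another is an explicit `depends_on` chain. It assumes the `openstack_containerinfra_nodegroup_v1` resource from the OpenStack provider. The resource names, the `cluster_id` variable, and the node counts are hypothetical, and this is not necessarily the approach used in this PR.

```hcl
terraform {
  required_providers {
    openstack = {
      source = "terraform-provider-openstack/openstack"
    }
  }
}

# Hypothetical input: ID of an already-created Magnum cluster.
variable "cluster_id" {
  type = string
}

# First nodegroup: nothing to wait for.
resource "openstack_containerinfra_nodegroup_v1" "core" {
  name           = "core"
  cluster_id     = var.cluster_id
  node_count     = 1
  min_node_count = 1
  max_node_count = 3
}

# Second nodegroup: Terraform only sends its create request
# after the "core" nodegroup has finished creating.
resource "openstack_containerinfra_nodegroup_v1" "user" {
  name           = "user"
  cluster_id     = var.cluster_id
  node_count     = 1
  min_node_count = 1
  max_node_count = 3

  depends_on = [openstack_containerinfra_nodegroup_v1.core]
}
```

With this chain each additional nodegroup has to name the previous one in `depends_on`, so the ordering is explicit in the config but must be maintained by hand.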
@julianpistorius, given the namespace in the command below that is run by terraform when trying to create a new nodegroup and fails:
It makes me think that the command is run against the management cluster, right? Do you have any ideas where in there a race condition might be causing this? I found this old issue from when
Hi @GeorgianaElena. I'll ask somebody who should know and get back to you.
Thank you @julianpistorius! For now, I've learnt that I can unblock myself by telling terraform to disable creation of resources in parallel with
Great! So according to Scott from StackHPC (@sd109) you have apparently hit a known bug in Magnum:
They'll work on fixing it. In the meantime your workaround will get you by.
Thank you @julianpistorius! I think I came across the bug report today too: #5455 (comment). Now I'm battling a different issue. The labels I set on the nodegroups through terraform are not propagated to the actual node instances, and scheduling fails.
Have you been able to set the labels manually using the
Any ideas @sd109?
Is the terraform trying to set the labels at the time of creating the nodegroups? Or afterwards on existing nodegroups?
@julianpistorius, I believe it does it during creation. Based on the code, they do it all in one request, i.e. they pass the labels in the POST request that creates the nodegroups. For context, I can see the labels on the actual nodegroups as
But then when I run
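For context, a minimal sketch of what setting labels at nodegroup creation time can look like in Terraform, assuming the `labels` argument of `openstack_containerinfra_nodegroup_v1`. The label key and value, the resource name, and the `cluster_id` variable are hypothetical and are not taken from this PR.

```hcl
terraform {
  required_providers {
    openstack = {
      source = "terraform-provider-openstack/openstack"
    }
  }
}

# Hypothetical input: ID of an already-created Magnum cluster.
variable "cluster_id" {
  type = string
}

resource "openstack_containerinfra_nodegroup_v1" "user" {
  name           = "user"
  cluster_id     = var.cluster_id
  node_count     = 1
  min_node_count = 1

  # Hypothetical label, declared on the nodegroup resource itself,
  # so it is part of the nodegroup definition from creation.
  labels = {
    node_purpose = "user"
  }
}
```

The problem described in this thread is that labels declared this way show up on the Magnum nodegroup but do not reach the node instances that scheduling depends on.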
Interesting. I'm going to see if I can reproduce this using the OpenStack Magnum CLI.
This is indeed a bug (with a workaround) as I mentioned in #5455 (comment)
This is a work in progress for #5455. Everything is in one file for simplicity, and the nodegroups are hardcoded, also for simplicity, until the create command passes.
Min count of nodegroups cannot be 0 :(
Nodegroup creation currently fails with:
Also, when redeploying, the failed nodegroups cannot be deleted: the delete just hangs, and the error is the same.