-
I am just learning and trying to deploy my first cluster of my own, but the deployment fails with the following errors:
Here is my kube config file:
I have reduced the default number of nodes because of the initial IP and resource limits on my new Hetzner account. I have also destroyed the cluster with the terraform command and tried again, but without success. The configuration seems to be valid:
The servers all seem to be in rescue mode at this stage. Thank you very much.
-
The rescue system runs Debian, but I guess the failing script on the node assumes an openSUSE system:
Or maybe the node shouldn't be in rescue mode at this stage. I've also tried the latest Terraform hcloud provider, v1.36.2, with no difference.
-
@tbabut You need to destroy everything properly and try again. It's super useful to use the hcloud CLI to check on things and even delete hung resources. Also, you can send a support request to Hetzner to raise the limits.
-
Thank you for your response. I have destroyed everything and started over multiple times. I have also checked via the UI and the hcloud CLI whether everything is really gone after running the terraform command. Now, two hours later, I tried again, but it failed with the same errors. Somehow the nodes are in rescue mode at this stage, but they aren't supposed to be, right? If I reboot the servers, they come up fine with openSUSE MicroOS, but the deployment has of course already failed. I don't think the limits on my Hetzner account are the problem right now. Unfortunately, I cannot contact Hetzner support to raise the limits, because my account is too new. ;)
-
It is my fault, I am so sorry. 🙈 I have a different default SSH port in my ssh config, so the ssh commands run from my machine could not reach the remote host. After setting the default SSH port back to 22, everything works as expected. 👍 Maybe this is something to consider: passing the port explicitly in the ssh command instead of relying on the config file, or enforcing port 22, would remove this pitfall in general. Thank you all for the great project and contributions, by the way.
-
@tbabut So if I understood correctly, setting ssh_port to any value other than 22 causes the deploy to fail?
-
Not quite. I hadn't changed the ssh_port in the kube.tf file. I had set a different default port in my ssh config:
During the deployment, the following command is one of the steps that failed:
The ssh command above picks up my ssh config (~/.ssh/config), which sets my changed port for all hosts, while the SSH daemon in the rescue system listens on the default port 22. After removing port 9962 from my config, the next deployment ran like a breeze. So I guess port 22 should be hardcoded at least for the rescue system, because users might have a different default setting in their config.
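The mechanism described above can be reproduced without touching any server. The following is a minimal sketch, not the project's actual deployment command: it assumes an OpenSSH client recent enough to support `ssh -G` (which prints the resolved client options and exits without connecting), and it uses a throwaway config file and the placeholder host name `example-node`, both of which are made up for illustration.

```shell
#!/bin/sh
# Hypothetical throwaway config that mimics the pitfall:
# a non-default port applied to every host.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
Host *
    Port 9962
EOF

# Resolve the options ssh would use, without opening a connection.
# "example-node" is a placeholder, not a real server.
port_from_config=$(ssh -G -F "$cfg" example-node | awk '$1 == "port" {print $2}')

# Same config, but with the port forced on the command line.
# Command-line options take precedence over the config file.
port_forced=$(ssh -G -F "$cfg" -p 22 example-node | awk '$1 == "port" {print $2}')

echo "port from config only: $port_from_config"
echo "port with -p 22:       $port_forced"

rm -f "$cfg"
```

Running the same check against the real `~/.ssh/config` (drop the `-F` option and use a node's IP address as the host) should show which port a plain `ssh` call would try. On the user side, an alternative to removing the global `Port` line is a host-specific block with `Port 22` placed before `Host *`, since ssh uses the first value it finds for each option.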