no known leader error forever #629

koh-osug · 2024-03-09T19:48:26Z

I started three nodes and later stopped them. Now I'm restarting only the first node, but it keeps logging the following warnings forever:

2024-03-09T20:45:25+01:00 | WARN | attempt 3630: server 127.0.0.1:10001: no known leader
2024-03-09T20:45:25+01:00 | WARN | attempt 3630: server 127.0.0.1:10003: dial: dial tcp 127.0.0.1:10003: connect: connection refused

How can I restore the functionality of this single node? Will this error now be displayed forever, or is the node actually dysfunctional?

How can this be healed automatically? When the other nodes are started again, will this also be handled automatically?

cole-miller · 2024-03-09T21:25:25Z

The first node that you restarted (call it A) is still operating with a configuration that includes all three nodes. Since it can't contact the other two nodes, it can't win an election and become leader. For A to resume operating on its own, you have to force it to commit a new configuration that cuts out the other two nodes. Instructions on how to do that are here (written from the MicroK8s perspective, but they should be generally applicable, assuming you're using go-dqlite). Note that this will lead to data loss if A didn't have the latest database state when you took the cluster offline!

When the other nodes are also starting again, will this be also automatically handled?

Yes, when the other nodes are restarted, they will find each other using the three-node configuration and eventually one of them will be able to win an election.
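
To make the majority rule behind this answer concrete, here is a minimal sketch of the arithmetic, assuming all three nodes are voters and that elections follow the usual Raft rule of requiring votes from a strict majority of the configured voters. The helper names `quorum` and `canElectLeader` are hypothetical and are not part of dqlite or go-dqlite.

```go
package main

import "fmt"

// quorum returns the number of votes needed to win an election in a
// cluster whose configuration lists n voting nodes (strict majority).
func quorum(n int) int {
	return n/2 + 1
}

// canElectLeader reports whether the reachable voters (counting the
// node itself) form a majority of the configured voters.
func canElectLeader(configured, reachable int) bool {
	return reachable >= quorum(configured)
}

func main() {
	// A restarted alone, still configured with three voters.
	fmt.Println(canElectLeader(3, 1)) // false
	// A after being forced onto a one-node configuration.
	fmt.Println(canElectLeader(1, 1)) // true
	// Two of the original three nodes are running again.
	fmt.Println(canElectLeader(3, 2)) // true
}
```

Under these assumptions, A alone holds one vote out of the two it needs, which is why it keeps reporting "no known leader" until either the configuration is shrunk or at least one more of the original nodes comes back.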

koh-osug · 2024-03-09T22:29:10Z

Thank you for the quick response.
