Steps to reproduce
Rollout-restart the pods in a 3-unit cluster (not 100% reproducible, but it happens often enough).
Expected behavior
The secondary should rejoin the cluster.
Actual behavior
The secondary does not rejoin the cluster and is considered offline.
Versions
Operating system:
Juju CLI:
Juju agent:
Charm revision: 127
microk8s: MicroK8s v1.28.7 revision 6532
Log output
2024-05-16T13:45:18.111Z [container-agent] 2024-05-16 13:45:18 INFO juju-log Unit workload member-state is offline with member-role unknown
2024-05-16T13:45:21.896Z [container-agent] 2024-05-16 13:45:21 ERROR juju-log Failed to get cluster status for cluster-ab0e762c137dc447d08ce68b19fb20b3
2024-05-16T13:45:21.903Z [container-agent] 2024-05-16 13:45:21 ERROR juju-log Failed to get cluster endpoints
2024-05-16T13:45:21.903Z [container-agent] Traceback (most recent call last):
2024-05-16T13:45:21.903Z [container-agent] File "/var/lib/juju/agents/unit-heat-mysql-0/charm/src/mysql_k8s_helpers.py", line 836, in update_endpoints
2024-05-16T13:45:21.903Z [container-agent] rw_endpoints, ro_endpoints, offline = self.get_cluster_endpoints(get_ips=False)
2024-05-16T13:45:21.903Z [container-agent] File "/var/lib/juju/agents/unit-heat-mysql-0/charm/lib/charms/mysql/v0/mysql.py", line 1469, in get_cluster_endpoints
2024-05-16T13:45:21.903Z [container-agent] raise MySQLGetClusterEndpointsError("Failed to get endpoints from cluster status")
2024-05-16T13:45:21.903Z [container-agent] charms.mysql.v0.mysql.MySQLGetClusterEndpointsError: Failed to get endpoints from cluster status
2024-05-16T13:45:22.191Z [container-agent] 2024-05-16 13:45:22 INFO juju.worker.uniter.operation runhook.go:186 ran "update-status" hook (via hook dispatching script: dispatch)
2024-05-16T13:47:53.387910Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] The member was unable to join the group. Local port: 3306'
2024-05-16T13:48:00.275796Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Error connecting to all peers. Member join failed. Local port: 3306'
2024-05-16T13:48:00.385285Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] The member was unable to join the group. Local port: 3306'
2024-05-16T13:48:07.654156Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] Error connecting to all peers. Member join failed. Local port: 3306'
2024-05-16T13:48:07.767533Z 0 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] The member was unable to join the group. Local port: 3306'
2024-05-16T13:48:08.469058Z 28247 [ERROR] [MY-011640] [Repl] Plugin group_replication reported: 'Timeout on wait for view after joining group'
2024-05-16T13:48:08.469343Z 28247 [ERROR] [MY-011735] [Repl] Plugin group_replication reported: '[GCS] The member is already leaving or joining a group.'
Additional context
After a debugging session with @paulomach, we got the instance to successfully rejoin using:
c.rejoin_instance("heat-mysql-0.heat-mysql-endpoints.openstack.svc.cluster.local:3306")
The command was run from the failed unit against the primary unit (ruling out a connection issue).
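For anyone who wants to try the same workaround, here is a minimal sketch of what the manual rejoin might look like in a MySQL Shell Python session (`mysqlsh --py`). The issue does not say how `c` was obtained; this assumes it came from `dba.get_cluster()` in a session connected to the primary. The helper name `rejoin_offline_member` is hypothetical and not part of the charm.

```python
def rejoin_offline_member(dba, instance_address):
    """Rejoin an offline member and return the cluster status afterwards.

    `dba` is the AdminAPI global that MySQL Shell exposes in Python mode.
    Assumption: the shell session is already connected to the primary.
    """
    # Assumed origin of the `c` handle used in the issue.
    cluster = dba.get_cluster()
    # Same call as the manual workaround described above.
    cluster.rejoin_instance(instance_address)
    # Status is returned so the caller can check the member's state.
    return cluster.status()


# Inside mysqlsh --py, connected to the primary:
# rejoin_offline_member(
#     dba, "heat-mysql-0.heat-mysql-endpoints.openstack.svc.cluster.local:3306"
# )
```

This only reproduces the manual step. It does not explain why the automatic rejoin after the rollout restart fails.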