Current Behavior
After updating Rancher from 2.7.5 to 2.8.5, some imported RKE2 clusters are displayed as offline in the Rancher UI. Upon investigation, the issue seems related to the Fleet-Agent remaining stuck in a "bootstrap" state. Specifically, the Fleet-Agent continues to use the fleet-agent-bootstrap secret and fails to generate the fleet-agent secret. This issue only occurs for certain clusters, while others work as expected.
Expected Behavior
The Fleet-Agent should transition from using the fleet-agent-bootstrap secret to creating and using the fleet-agent secret, completing the registration process.
Steps To Reproduce
1. Update Rancher Management Server from version 2.7.5 to 2.8.5.
2. Ensure there are imported downstream clusters (e.g., v1.26.16+rke2r1).
3. Check the logs of the fleet-agent on an affected cluster. Example error logs:
   time="2025-01-27T07:48:46Z" level=error msg="Failed to register agent: registration failed: cannot create clusterregistration on management cluster for cluster id 'some_random_id': Unauthorized"
   time="2025-01-27T07:49:46Z" level=warning msg="Cannot find fleet-agent secret, running registration"
   time="2025-01-27T07:49:46Z" level=info msg="Creating clusterregistration with id 'some_random_id' for new token"
4. Compare the cattle-fleet-system namespace secrets:
   - Working clusters have a fleet-agent secret.
   - Affected clusters only have a fleet-agent-bootstrap secret.
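The secret comparison can be scripted so it is easy to repeat across several downstream clusters. The following is a minimal sketch, not something taken from Fleet itself: the function name `classify_agent_state` and its three output labels are made up for illustration. It only looks at secret names and assumes they are supplied one per line on stdin.

```shell
#!/bin/sh
# Hypothetical helper: classify a cluster by the secret names found in
# the cattle-fleet-system namespace (one name per line on stdin).
classify_agent_state() {
  names=$(cat)
  if printf '%s\n' "$names" | grep -qx 'fleet-agent'; then
    # The fleet-agent secret exists, so registration completed.
    echo "registered"
  elif printf '%s\n' "$names" | grep -qx 'fleet-agent-bootstrap'; then
    # Only the bootstrap secret exists: the state described in this issue.
    echo "bootstrap-only"
  else
    echo "no-agent-secret"
  fi
}

# Usage against a live cluster (needs kubectl access to the downstream cluster):
#   kubectl -n cattle-fleet-system get secrets -o name \
#     | sed 's|^secret/||' | classify_agent_state
```

On an affected cluster this should print `bootstrap-only`, and on a working cluster `registered`.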
<details> <summary>cattle-fleet-system logs</summary>
time="2025-01-27T09:12:49Z" level=error msg="Failed to register agent: registration failed: cannot create clusterregistration on management cluster for cluster id '66t7lwf9r6gpwljrk5swl9hdxgt5bkxc756nvl7hbtd75r2zfrgm9v': Unauthorized"
time="2025-01-27T09:13:49Z" level=warning msg="Cannot find fleet-agent secret, running registration"
time="2025-01-27T09:13:49Z" level=info msg="Creating clusterregistration with id '66t7lwf9r6gpwljrk5swl9hdxgt5bkxc756nvl7hbtd75r2zfrgm9v' for new token"
</details>
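To check quickly whether a cluster shows the same failure, the agent log can be filtered for the registration error shown above. This is a rough sketch: the function name `count_unauthorized_registrations` is hypothetical, and it only matches the message text quoted in this issue, which may differ in other Fleet versions.

```shell
#!/bin/sh
# Hypothetical helper: count log lines on stdin that report a failed agent
# registration ending in "Unauthorized" (message text as quoted in this issue).
count_unauthorized_registrations() {
  # grep -c exits non-zero when the count is 0, so "|| true" keeps the
  # function usable under "set -e".
  grep 'Failed to register agent' | grep -c 'Unauthorized' || true
}

# Usage against a live cluster. The label selector is an assumption about how
# the fleet-agent pods are labeled; adjust it to your deployment:
#   kubectl -n cattle-fleet-system logs -l app=fleet-agent --tail=500 \
#     | count_unauthorized_registrations
```

A non-zero count together with a missing fleet-agent secret matches the behavior described in this report.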
<details> <summary>cattle-cluster-agent logs from a cluster that lost connection to rancher</summary>
kubectl -n cattle-system logs deployments/cattle-cluster-agent
Found 2 pods, using pod/cattle-cluster-agent-984568b5-cpsh7
Error: --namespace or env NAMESPACE is required to be set
Usage:
fleet-agent [flags]
Flags:
--agent-scope string An identifier used to scope the agent bundleID names, typically the same as namespace
--checkin-interval string How often to post cluster status
--debug Turn on debug logging
--debug-level int If debugging is enabled, set klog -v=X
-h, --help help for fleet-agent
--kubeconfig string kubeconfig file
--namespace string namespace to watch
-v, --version version for fleet-agent
time="2025-01-27T07:12:36Z" level=fatal msg="--namespace or env NAMESPACE is required to be set"
</details>
Anything else?
No response