Description
Describe your feature request
Currently ray attach only allows opening an SSH session on the head node. It could be useful to allow attaching to worker nodes to check what state the execution environment and file system are in (e.g. running conda list, examining config files such as ~/.keras/keras.json).
Technically this also applies to ray exec, but in my experience the use cases are much less convincing.
Existing alternatives
ray list-worker-ips is subpar since it doesn't list the necessary SSH key location, and it's tedious to type out a long ssh command every time.
A workaround is to use awless ssh with an Amazon instance ID, but this opens a raw SSH session, whereas attach runs screen or tmux. awless also does not work with Kubernetes or other autoscaler backends.
Suggested API
ray attach <cluster name> --ip <node ip> since ray prints the IP if a node has issues.
Alternative: ray attach <cluster name> --node-id <node id> since IPs are long and node IDs are very short.
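To make the two suggestions concrete, here is a rough sketch of how the options could be parsed and turned into an SSH command. It is not Ray's implementation: the function names, the node-ID-to-IP mapping, the SSH user and key path, and the screen/tmux invocations are all assumptions made for illustration.

```python
import argparse


def build_parser():
    """Hypothetical CLI mirroring the suggested `ray attach` options."""
    parser = argparse.ArgumentParser(prog="ray attach")
    parser.add_argument("cluster_name")
    # --ip and --node-id are alternatives, so only one may be given.
    target = parser.add_mutually_exclusive_group()
    target.add_argument("--ip", help="IP of the node to attach to")
    target.add_argument("--node-id", help="short ID of the node to attach to")
    parser.add_argument("--tmux", action="store_true",
                        help="use tmux instead of screen")
    return parser


def resolve_target_ip(args, head_ip, node_ips):
    """Pick the IP to attach to; fall back to the head node.

    `node_ips` is an assumed mapping of node ID -> IP that the
    autoscaler backend would have to supply.
    """
    if args.ip:
        return args.ip
    if args.node_id:
        if args.node_id not in node_ips:
            raise ValueError(f"unknown node id: {args.node_id}")
        return node_ips[args.node_id]
    return head_ip


def build_ssh_command(ip, ssh_user, ssh_key, use_tmux=False):
    """Build an ssh invocation that lands in a screen or tmux session."""
    # Assumed session commands; the real ones may differ.
    session = "tmux attach || tmux new" if use_tmux else "screen -L -xRR"
    return ["ssh", "-tt", "-i", ssh_key, f"{ssh_user}@{ip}", session]


if __name__ == "__main__":
    args = build_parser().parse_args(["my-cluster", "--node-id", "w1"])
    ip = resolve_target_ip(args, "10.0.0.1", {"w1": "10.0.0.7"})
    print(build_ssh_command(ip, "ubuntu", "~/.ssh/key.pem", args.tmux))
```

Keeping the head node as the default means existing ray attach <cluster name> calls would behave as before, and the SSH key location would come from the cluster config instead of being typed by hand.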