Conversation

MiniSho
Contributor

@MiniSho MiniSho commented Sep 3, 2025

What type of PR is this?
Bug fix

What this PR does / why we need it:
This PR fixes an issue where the liveness and readiness probes in Ray pods were hardcoded to use default ports, ignoring custom port configurations specified in rayStartParams. This caused probe failures when users customized ports like dashboard-port or dashboard-agent-listen-port.

Which issue(s) this PR fixes:
Fixes health probe failures when custom ports are configured in rayStartParams.

Special notes for your reviewer:
The fix extracts port values from rayStartParams with fallback to default values, ensuring probes use the correct ports that Ray processes are actually listening on.
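The lookup-with-fallback described above can be sketched as follows. This is a minimal illustration, not the PR's actual code: `portFromStartParams` is a hypothetical helper name, and the default ports (8265 for the dashboard, 52365 for the dashboard agent) are assumptions based on Ray's documented defaults rather than values quoted in this conversation.

```go
package main

import (
	"fmt"
	"strconv"
)

// Assumed defaults; they are not quoted anywhere in this PR conversation.
const (
	defaultDashboardPort            = 8265
	defaultDashboardAgentListenPort = 52365
)

// portFromStartParams is a hypothetical helper. It returns the port stored
// under key in rayStartParams, or fallback when the key is missing or the
// value is not a valid TCP port number.
func portFromStartParams(params map[string]string, key string, fallback int) int {
	raw, ok := params[key]
	if !ok {
		return fallback
	}
	port, err := strconv.Atoi(raw)
	if err != nil || port < 1 || port > 65535 {
		return fallback
	}
	return port
}

func main() {
	// Same rayStartParams values as the test manifest in this thread.
	params := map[string]string{
		"dashboard-port":              "8266",
		"dashboard-agent-listen-port": "8267",
	}
	fmt.Println(portFromStartParams(params, "dashboard-port", defaultDashboardPort))
	fmt.Println(portFromStartParams(params, "dashboard-agent-listen-port", defaultDashboardAgentListenPort))

	// With no overrides, the probe would keep using the default port.
	fmt.Println(portFromStartParams(nil, "dashboard-port", defaultDashboardPort))
}
```

Falling back on an unparsable value is a design choice of this sketch; a real implementation might instead surface a validation error so that a typo in rayStartParams does not go unnoticed.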

@MiniSho
Contributor Author

MiniSho commented Sep 3, 2025

@kevin85421 PTAL~

Member

@Future-Outlier Future-Outlier left a comment

Could you add an integration test to verify this?
Please also provide a screenshot and the YAML file showing how you tested it. Thank you!

Collaborator

@win5923 win5923 left a comment

I tested with the following YAML, and it worked well. Thanks!

apiVersion: ray.io/v1
kind: RayCluster
metadata:
  name: raycluster-kuberay
spec:
  rayVersion: '2.46.0'
  headGroupSpec:
    rayStartParams:
      dashboard-port: "8266"
      dashboard-agent-listen-port: "8267"
    template:
      spec:
        containers:
        - name: ray-head
          image: rayproject/ray:2.46.0
          resources:
            limits:
              cpu: 1
              memory: 2G
            requests:
              cpu: 1
              memory: 2G
          ports:
          - containerPort: 6379
            name: gcs-server
          - containerPort: 8266 # Ray dashboard
            name: dashboard
          - containerPort: 10001
            name: client
[Screenshot 2025-09-09 232036]

@win5923
Collaborator

win5923 commented Sep 9, 2025

In v1.5.0 we might modify the liveness and readiness probes when we get rid of wget, so I’m not sure if this PR is still needed.

#3837

Co-authored-by: Jun-Hao Wan <ken89@kimo.com>
Signed-off-by: Itami Sho <42286868+MiniSho@users.noreply.github.com>

3 participants