Collaborator

@machichima machichima commented Oct 30, 2025

Why are these changes needed?

Support cron job scheduling. This PR follows this design doc and implements milestone 1.

Main changes:

  • Add RayCronJob CRD and controller
  • Add feature gate for enabling RayCronJob
  • Add unit test

Test

Apply the sample YAML ray-operator/config/samples/ray-cronjob.sample.yaml. RayJobs are being scheduled every minute:

[screenshot]

Triggering a validation error:

[screenshot]
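For readers without the screenshot, here is a minimal sketch of the kind of check that could reject a bad schedule. It assumes the standard five-field cron format; `validateSchedule` is a hypothetical helper and not the code in this PR, which may well delegate to a full cron parser instead.

```go
package main

import (
	"fmt"
	"strings"
)

// validateSchedule is a hypothetical helper, not the PR's implementation.
// It only checks that the string has the five whitespace-separated fields
// of a standard cron expression (minute, hour, day of month, month, day of
// week). It does not check that each field holds a legal value.
func validateSchedule(schedule string) error {
	fields := strings.Fields(schedule)
	if len(fields) != 5 {
		return fmt.Errorf("invalid schedule %q: expected 5 fields, got %d", schedule, len(fields))
	}
	return nil
}

func main() {
	for _, s := range []string{"* * * * *", "every minute"} {
		if err := validateSchedule(s); err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println("accepted:", s)
	}
}
```

A field-count check like this catches obviously malformed strings early, but range and syntax errors inside a field would still need a real parser.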

Related issue number

Following comment: #2426 (comment)

Closes #2426

Checks

  • I've made sure the tests are passing.
  • Testing Strategy
    • Unit tests
    • Manual tests
    • This PR is not tested :(

@machichima machichima changed the title [POC] Add Ray Cron Job [Feat] Add Ray Cron Job Nov 24, 2025
@machichima machichima marked this pull request as ready for review November 24, 2025 13:03
@CheyuWu CheyuWu self-requested a review November 24, 2025 17:01
Comment on lines 148 to 160
func constructRayJob(cronJob *rayv1.RayCronJob) *rayv1.RayJob {
    rayJob := &rayv1.RayJob{
        ObjectMeta: metav1.ObjectMeta{
            Name:      fmt.Sprintf("%s-%s", cronJob.Name, rand.String(5)),
            Namespace: cronJob.Namespace,
            Labels: map[string]string{
                "ray.io/cronjob-name": cronJob.Name,
            },
        },
        Spec: *cronJob.Spec.JobTemplate.DeepCopy(),
    }
    return rayJob
}
Collaborator

I think we should set an OwnerReference in order to do the garbage collection by k8s.
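To illustrate the idea, here is a rough sketch that uses simplified stand-in types instead of the real Kubernetes ones. `OwnerReference` mirrors the fields of `metav1.OwnerReference`, while `Object` and `newRayJobFor` are hypothetical names, and the `ray.io/v1` API version is an assumption based on the `rayv1` package alias. In an actual controller this is usually done with `controllerutil.SetControllerReference` from controller-runtime rather than by hand.

```go
package main

import "fmt"

// OwnerReference is a stand-in that mirrors the fields of
// metav1.OwnerReference; it is not the real Kubernetes type.
type OwnerReference struct {
	APIVersion         string
	Kind               string
	Name               string
	UID                string
	Controller         *bool
	BlockOwnerDeletion *bool
}

// Object is a simplified, hypothetical stand-in for an object's metadata.
type Object struct {
	Name            string
	Namespace       string
	UID             string
	Labels          map[string]string
	OwnerReferences []OwnerReference
}

// newRayJobFor builds a child object that points back at cronJob as its
// controlling owner. The suffix is passed in to keep the sketch
// deterministic; the PR generates it with rand.String(5).
func newRayJobFor(cronJob Object, suffix string) Object {
	isController := true
	blockDeletion := true
	return Object{
		Name:      fmt.Sprintf("%s-%s", cronJob.Name, suffix),
		Namespace: cronJob.Namespace,
		Labels:    map[string]string{"ray.io/cronjob-name": cronJob.Name},
		OwnerReferences: []OwnerReference{{
			APIVersion:         "ray.io/v1", // assumed group/version
			Kind:               "RayCronJob",
			Name:               cronJob.Name,
			UID:                cronJob.UID,
			Controller:         &isController,
			BlockOwnerDeletion: &blockDeletion,
		}},
	}
}

func main() {
	cronJob := Object{Name: "nightly", Namespace: "default", UID: "1234-abcd"}
	job := newRayJobFor(cronJob, "x7k2p")
	owner := job.OwnerReferences[0]
	fmt.Printf("%s is owned by %s/%s\n", job.Name, owner.Kind, owner.Name)
}
```

With the owner reference in place, deleting the RayCronJob lets the Kubernetes garbage collector remove the RayJobs it created.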

Collaborator Author

Yes! Just added. Thank you for pointing this out!

Comment on lines +38 to +44
// RayCronJob is the Schema for the raycronjobs API
type RayCronJob struct {
    metav1.TypeMeta   `json:",inline"`
    Spec              RayCronJobSpec   `json:"spec,omitempty"`
    Status            RayCronJobStatus `json:"status,omitempty"`
    metav1.ObjectMeta `json:"metadata,omitempty"`
}
Collaborator

I think we should keep TypeMeta and ObjectMeta at the top, with Spec and Status grouped below.

  • type RayCluster struct {
    // Standard object metadata.
    metav1.TypeMeta `json:",inline"`
    metav1.ObjectMeta `json:"metadata,omitempty"`
    // Specification of the desired behavior of the RayCluster.
    Spec RayClusterSpec `json:"spec,omitempty"`
    // +optional
    Status RayClusterStatus `json:"status,omitempty"`
    }
  • type RayService struct {
    metav1.TypeMeta `json:",inline"`
    metav1.ObjectMeta `json:"metadata,omitempty"`
    Spec RayServiceSpec `json:"spec,omitempty"`
    // +optional
    Status RayServiceStatuses `json:"status,omitempty"`
    }

Comment on lines +18 to +25
// The overall state of the RayCronJob.
type ScheduleStatus string

const (
    StatusNew              ScheduleStatus = ""
    StatusScheduled        ScheduleStatus = "Scheduled"
    StatusValidationFailed ScheduleStatus = "ValidationFailed"
)
Collaborator

I haven’t seen ScheduleStatus being used anywhere yet. Is this something that will be added later?

Comment on lines +11 to +16
type RayCronJobSpec struct {
    // JobTemplate defines the job spec that will be created by cron scheduling
    JobTemplate *RayJobSpec `json:"jobTemplate"`
    // Schedule is the cron schedule string
    Schedule string `json:"schedule"`
}
Collaborator

@win5923 win5923 Nov 26, 2025

Not sure if this is alright, but I think we should introduce a separate struct (e.g. RayJobTemplateSpec) to hold both the metadata and the spec for the generated RayJob, similar to how Kubernetes models JobTemplateSpec in CronJob.

This would allow users to specify metadata inside jobTemplate, which we can then propagate to the created RayJob. It also keeps the API aligned with common Kubernetes patterns.

WDYT?

Reference:
https://github.com/kubernetes/kubernetes/blob/af9fb799ef09bbdb0b2b40b4e441f2ffccaffe18/pkg/apis/batch/types.go#L94-L105
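A rough sketch of what this could look like, using simplified stand-in types so it runs on its own. `RayJobTemplateSpec` is the name proposed above; `ObjectMeta`, `RayJobSpec`, `RayJob`, and `buildRayJob` here are hypothetical simplifications and not the real `metav1` or `rayv1` types. Letting the controller-owned label win over the template is also an assumption, not something decided in this thread.

```go
package main

import "fmt"

// ObjectMeta and RayJobSpec are simplified stand-ins for
// metav1.ObjectMeta and rayv1.RayJobSpec.
type ObjectMeta struct {
	Labels      map[string]string `json:"labels,omitempty"`
	Annotations map[string]string `json:"annotations,omitempty"`
}

type RayJobSpec struct {
	Entrypoint string `json:"entrypoint,omitempty"`
}

// RayJobTemplateSpec holds both metadata and spec for generated RayJobs,
// modeled on the shape of Kubernetes' JobTemplateSpec.
type RayJobTemplateSpec struct {
	Metadata ObjectMeta `json:"metadata,omitempty"`
	Spec     RayJobSpec `json:"spec"`
}

// RayJob is a stand-in for the object the controller would create.
type RayJob struct {
	Name        string
	Labels      map[string]string
	Annotations map[string]string
	Spec        RayJobSpec
}

// copyMap returns a fresh map so the template is never mutated.
func copyMap(in map[string]string) map[string]string {
	out := make(map[string]string, len(in)+1)
	for k, v := range in {
		out[k] = v
	}
	return out
}

// buildRayJob propagates template metadata to the new RayJob. The
// controller-owned label is written last so a template cannot override it.
func buildRayJob(cronJobName, suffix string, tmpl RayJobTemplateSpec) RayJob {
	labels := copyMap(tmpl.Metadata.Labels)
	labels["ray.io/cronjob-name"] = cronJobName
	return RayJob{
		Name:        fmt.Sprintf("%s-%s", cronJobName, suffix),
		Labels:      labels,
		Annotations: copyMap(tmpl.Metadata.Annotations),
		Spec:        tmpl.Spec,
	}
}

func main() {
	tmpl := RayJobTemplateSpec{
		Metadata: ObjectMeta{Labels: map[string]string{"team": "ml"}},
		Spec:     RayJobSpec{Entrypoint: "python train.py"},
	}
	job := buildRayJob("nightly", "abcde", tmpl)
	fmt.Println(job.Name, job.Labels)
}
```

Users could then set labels and annotations under `jobTemplate.metadata`, and each generated RayJob would carry them alongside the `ray.io/cronjob-name` label.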

Development

Successfully merging this pull request may close these issues.

[Feature] Support cron scheduling for RayJob

2 participants