korgan00 (Contributor) commented Dec 1, 2025

Summary

  1. Modify the free_resources command so that it can download the logs from Ray and store them in COS.
  2. Modify the GET endpoints to fetch the logs in the order COS > Ray > Database and filter them for partners or users.

Details and comments

Upload logs file from free_resources command

The idea is that when we clean up the resources consumed by a Function, we also upload the logs from the database to COS. The place would be here.

Some things to take into account for the creation of these files:

  • For non-provider Functions we will create a single file with all the data.
  • For provider Functions we need to create two files, a user log file and a provider log file (this is currently managed by LogsStorage):
      • User log file: stores every trace that starts with [PUBLIC].
      • Provider log file: stores the entire logs (for now including the [PUBLIC] ones, pending confirmation).
  • In the database, the logs field will be reset to None.
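
The file split described in the list above can be sketched as follows. This is a minimal illustration under stated assumptions, not the LogsStorage implementation: the function name `split_logs`, the `is_provider_function` flag, and the `"user"`/`"provider"` keys are hypothetical, and the sketch assumes the [PUBLIC] marker sits at the very start of a line.

```python
PUBLIC_PREFIX = "[PUBLIC]"  # marker named in the PR description


def split_logs(raw_logs: str, is_provider_function: bool) -> dict[str, str]:
    """Return the file contents to upload, keyed by a hypothetical file role."""
    if not is_provider_function:
        # Non-provider Function: a single file with all the data.
        return {"user": raw_logs}

    # Provider Function: the user file keeps only lines that start with
    # the public marker; the provider file keeps everything.
    public_lines = [
        line for line in raw_logs.splitlines() if line.startswith(PUBLIC_PREFIX)
    ]
    return {
        "user": "\n".join(public_lines),
        "provider": raw_logs,
    }
```

Lines without the prefix are simply dropped from the user file, so multiline traces are not stitched back together.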

GET endpoints:

We will need to adjust the logic in these endpoints slightly. If the Job is in a final status, we return the content of the corresponding file:

  • GET /logs: will return the user logs
  • GET /provider_logs: will return the provider logs

If the Job is in any other state, we return the logs directly from Ray, filtering by [PUBLIC] when the endpoint is /logs.
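
A rough sketch of that endpoint decision, assuming the stored files and the live Ray logs are reachable through two injected callables. Every name here (`logs_for_endpoint`, `read_stored_file`, `read_ray_logs`) is hypothetical, and the set of final statuses is an assumption rather than something stated in this PR.

```python
from typing import Callable

PUBLIC_PREFIX = "[PUBLIC]"
# Assumed set of final statuses; the real list lives in the Job model.
FINAL_STATUSES = frozenset({"SUCCEEDED", "FAILED", "STOPPED"})


def logs_for_endpoint(
    endpoint: str,
    job_status: str,
    read_stored_file: Callable[[str], str],
    read_ray_logs: Callable[[], str],
) -> str:
    """Pick the log source for /logs or /provider_logs."""
    kind = "provider" if endpoint == "/provider_logs" else "user"

    if job_status in FINAL_STATUSES:
        # Finished job: serve the file that was uploaded for this audience.
        return read_stored_file(kind)

    # Job still in progress: read from Ray, filtering only for /logs.
    live_logs = read_ray_logs()
    if kind == "user":
        return "\n".join(
            line for line in live_logs.splitlines()
            if line.startswith(PUBLIC_PREFIX)
        )
    return live_logs
```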

@korgan00 korgan00 requested a review from a team as a code owner December 1, 2025 11:08

avilches (Contributor) commented Dec 2, 2025

After the meeting to review the PR, these are the agreed points:

  • For users, logs without a prefix are ignored (there is no need to handle multiline).
  • The logs field will be made nullable and marked as null when uploaded to COS.
  • Calling check_logs to trim the logs in the /logs endpoint is ok.
  • To get the logs, it's ok to check COS first and then Ray, as you were doing, but it's necessary to check job.compute_resource.active before querying Ray.
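
The lookup order agreed above could look roughly like this. It is only a sketch: `fetch_logs` and its parameters are hypothetical names, the boolean stands in for `job.compute_resource.active`, and the final database fallback is taken from the "COS > Ray > Database" order in the PR summary.

```python
from typing import Callable, Optional


def fetch_logs(
    read_from_cos: Callable[[], Optional[str]],
    compute_resource_active: bool,
    read_from_ray: Callable[[], str],
    database_logs: Optional[str],
) -> str:
    """Resolve logs in the order COS, then Ray, then the database field."""
    stored = read_from_cos()
    if stored is not None:
        # The logs were already uploaded when resources were freed.
        return stored

    if compute_resource_active:
        # Only query Ray while the compute resource is still alive.
        return read_from_ray()

    # The logs field is nullable, so fall back to an empty string.
    return database_logs or ""
```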

In another PR:

  • The complete path of the files in the COS must be saved in the database. A table with a 1-N relationship with jobs is suggested to store the path of the public file, the provider file, and others that are uploaded to COS later (such as results, etc.).
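
One possible shape for the suggested 1-N table, shown with an in-memory SQLite database so the sketch is self-contained. The table and column names (`job_files`, `kind`, `cos_path`) and the example paths are placeholders, not names agreed in the review.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE jobs (id INTEGER PRIMARY KEY);
    -- One job can own many files: public logs, provider logs, results, ...
    CREATE TABLE job_files (
        id INTEGER PRIMARY KEY,
        job_id INTEGER NOT NULL REFERENCES jobs(id) ON DELETE CASCADE,
        kind TEXT NOT NULL,
        cos_path TEXT NOT NULL,
        UNIQUE (job_id, kind)
    );
    """
)

conn.execute("INSERT INTO jobs (id) VALUES (1)")
conn.executemany(
    "INSERT INTO job_files (job_id, kind, cos_path) VALUES (?, ?, ?)",
    [
        (1, "public_logs", "placeholder/public.log"),
        (1, "provider_logs", "placeholder/provider.log"),
    ],
)

# Look up every stored path for one job, keyed by file kind.
paths = dict(
    conn.execute("SELECT kind, cos_path FROM job_files WHERE job_id = ?", (1,))
)
```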

ElePT (Collaborator) left a comment

As a side note, I think that it would be good to also start documenting more internal-facing changes in the release notes, as we started doing with interface changes. I left some pointers in another PR on how to do this: #1788 (comment). If you don't agree, I am open for discussion on what should and shouldn't be part of the renos.


if job_handler:
    logs = job_handler.logs(job.ray_job_id)
    job.logs = check_logs(logs, job)

avilches (Contributor) commented Dec 4, 2025

Good catch @korgan00! If we trim the logs, how will we know later whether the no_resources_log message is in the logs? :)

@korgan00 korgan00 merged commit ba8b469 into feat/logs-refactor Dec 10, 2025
10 of 12 checks passed
@korgan00 korgan00 deleted the store-logs-on-free-resources branch December 10, 2025 12:07