Revert "[core] Support publishing events from aggregator to gcs (#557… #59911

Sparks0219 · 2026-01-07T02:03:32Z

This reverts commit 7198193.

gemini-code-assist

Code Review

This pull request reverts the functionality for publishing events from the aggregator to GCS. The changes primarily involve removing the GCS publisher client, related configurations, and tests.

Alongside the revert, there's a beneficial refactoring of the event filtering logic. Previously, filtering was handled within each publisher client. This has been centralized into the AggregatorAgent, which now passes a filter function to the AsyncHttpPublisherClient. This improves modularity and removes code duplication.

The code removal appears to be clean and complete. I've identified a bug in one of the updated tests and provided a suggestion for a fix.
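To illustrate the centralized filtering described in this summary, here is a minimal sketch of an agent that owns the filtering policy and passes it to a publisher client as a callable. Every name in it (SketchAgent, SketchPublisherClient, the event dictionaries, the "TYPE_A"/"TYPE_B" values) is hypothetical and chosen for illustration; it does not reproduce the actual AggregatorAgent or AsyncHttpPublisherClient signatures.

```python
import asyncio
from typing import Callable, Dict, Iterable, List, Set

# Hypothetical stand-ins; these are not Ray's actual classes or signatures.
Event = Dict[str, str]
EventFilter = Callable[[Event], bool]


class SketchPublisherClient:
    """Toy publisher that applies a caller-supplied filter before sending."""

    def __init__(self, event_filter: EventFilter) -> None:
        self._event_filter = event_filter
        self.sent: List[Event] = []
        self.filtered_count = 0

    async def publish(self, events: Iterable[Event]) -> None:
        for event in events:
            if self._event_filter(event):
                # A real client would perform network I/O here.
                self.sent.append(event)
            else:
                self.filtered_count += 1


class SketchAgent:
    """Toy agent that owns the filtering policy and hands it to its publisher."""

    def __init__(self, exposable_types: Set[str]) -> None:
        self._exposable_types = exposable_types
        # The filter lives in one place and is injected into the client.
        self.http_client = SketchPublisherClient(self._can_expose)

    def _can_expose(self, event: Event) -> bool:
        return event.get("event_type") in self._exposable_types


def run_demo() -> SketchPublisherClient:
    agent = SketchAgent({"TYPE_A"})
    events = [{"event_type": "TYPE_A"}, {"event_type": "TYPE_B"}]
    asyncio.run(agent.http_client.publish(events))
    return agent.http_client
```

With this shape, a second publisher could reuse the same callable instead of carrying its own copy of the filtering logic, which is the duplication the summary says was removed.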

gemini-code-assist · 2026-01-07T02:05:50Z

python/ray/tests/test_metrics_agent.py

+    def test_case_publisher_specific_metrics_correct(publisher_name: str):
        fetch_prometheus_timeseries(prom_addresses, timeseries)
        metric_samples = timeseries.metric_samples.values()
+        expected_metrics_values = {
+            "ray_aggregator_agent_published_events_total": 1.0,
+            "ray_aggregator_agent_filtered_events_total": 1.0,
+            "ray_aggregator_agent_queue_dropped_events_total": 1.0,
+        }
        for descriptor, expected_value in expected_metrics_values.items():
-            samples = [
-                m
-                for m in metric_samples
-                if m.name == descriptor and m.labels[CONSUMER_TAG_KEY] == consumer_name
-            ]
+            samples = [m for m in metric_samples if m.name == descriptor]
            if not samples:
                return False
-            if samples[0].value != expected_value:
+            if (
+                samples[0].value != expected_value
+                or samples[0].labels[CONSUMER_TAG_KEY] != publisher_name
+            ):
                return False
        return True


This test function has a bug. It attempts to check for the CONSUMER_TAG_KEY on all metrics, including ray_aggregator_agent_queue_dropped_events_total, which is a global metric and does not have this tag. This will cause a KeyError.

To fix this, ray_aggregator_agent_queue_dropped_events_total should be removed from this test and verified separately with other global metrics. Additionally, the logic for finding samples should be more robust by filtering on the consumer tag directly and using .get() to avoid errors.

Suggested change

    def test_case_publisher_specific_metrics_correct(publisher_name: str):
        fetch_prometheus_timeseries(prom_addresses, timeseries)
        metric_samples = timeseries.metric_samples.values()
        expected_metrics_values = {
            "ray_aggregator_agent_published_events_total": 1.0,
            "ray_aggregator_agent_filtered_events_total": 1.0,
        }
        for descriptor, expected_value in expected_metrics_values.items():
            samples = [
                m
                for m in metric_samples
                if m.name == descriptor
                and m.labels.get(CONSUMER_TAG_KEY) == publisher_name
            ]
            if not samples:
                return False
            if samples[0].value != expected_value:
                return False
        return True
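The split proposed in this review (per-publisher metrics checked with the consumer tag, global metrics checked without it) can be tried outside the test suite with a small self-contained sketch. The Sample dataclass, the "consumer" value of CONSUMER_TAG_KEY, the "http" publisher name, and the metrics_match helper are all assumptions made for illustration; only the metric names come from the diff above.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Optional

CONSUMER_TAG_KEY = "consumer"  # assumed value, for illustration only


@dataclass
class Sample:
    """Hypothetical stand-in for a scraped Prometheus sample."""

    name: str
    value: float
    labels: Dict[str, str] = field(default_factory=dict)


def metrics_match(
    samples: Iterable[Sample],
    expected: Dict[str, float],
    publisher_name: Optional[str] = None,
) -> bool:
    """Return True if every expected metric has a sample with the expected value.

    When publisher_name is given, only samples tagged with it are considered.
    labels.get() returns None for untagged samples instead of raising KeyError.
    """
    samples = list(samples)
    for name, expected_value in expected.items():
        matching = [
            s
            for s in samples
            if s.name == name
            and (
                publisher_name is None
                or s.labels.get(CONSUMER_TAG_KEY) == publisher_name
            )
        ]
        if not matching or matching[0].value != expected_value:
            return False
    return True


SAMPLES = [
    Sample("ray_aggregator_agent_published_events_total", 1.0, {CONSUMER_TAG_KEY: "http"}),
    Sample("ray_aggregator_agent_filtered_events_total", 1.0, {CONSUMER_TAG_KEY: "http"}),
    # Global metric: carries no consumer tag.
    Sample("ray_aggregator_agent_queue_dropped_events_total", 1.0),
]

PER_PUBLISHER = {
    "ray_aggregator_agent_published_events_total": 1.0,
    "ray_aggregator_agent_filtered_events_total": 1.0,
}
GLOBAL = {"ray_aggregator_agent_queue_dropped_events_total": 1.0}
```

Calling metrics_match(SAMPLES, PER_PUBLISHER, "http") and metrics_match(SAMPLES, GLOBAL) checks the two groups separately, whereas indexing SAMPLES[2].labels[CONSUMER_TAG_KEY] directly raises the KeyError the review describes.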

Sparks0219 · 2026-01-08T18:57:04Z

Fixed by #59965

gemini-code-assist bot reviewed Jan 7, 2026

Revert "[core] Support publishing events from aggregator to gcs (ray-…

c36e18f

…project#55781)" This reverts commit 7198193. Signed-off-by: joshlee <joshlee@anyscale.com>

Sparks0219 force-pushed the joshlee/find-memleak-in-test-scheduling branch from 7f8e30a to c36e18f Compare January 7, 2026 21:33

Sparks0219 marked this pull request as ready for review January 7, 2026 21:34

Sparks0219 requested a review from a team as a code owner January 7, 2026 21:34

Sparks0219 added the "go" label (add ONLY when ready to merge, run all tests) Jan 7, 2026

edoakes approved these changes Jan 7, 2026

edoakes enabled auto-merge (squash) January 7, 2026 22:52

ray-gardener bot added the "core" label (Issues that should be addressed in Ray Core) Jan 8, 2026

Sparks0219 closed this Jan 8, 2026

auto-merge was automatically disabled January 8, 2026 18:57
Pull request was closed
