-
Notifications
You must be signed in to change notification settings - Fork 7.1k
Revert "[core] Support publishing events from aggregator to gcs (#557… #59911
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Revert "[core] Support publishing events from aggregator to gcs (#557… #59911
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request reverts the functionality for publishing events from the aggregator to GCS. The changes primarily involve removing the GCS publisher client, related configurations, and tests.
Alongside the revert, there's a beneficial refactoring of the event filtering logic. Previously, filtering was handled within each publisher client. This has been centralized into the AggregatorAgent, which now passes a filter function to the AsyncHttpPublisherClient. This improves modularity and removes code duplication.
The code removal appears to be clean and complete. I've identified a bug in one of the updated tests and provided a suggestion for a fix.
| def test_case_publisher_specific_metrics_correct(publisher_name: str): | ||
| fetch_prometheus_timeseries(prom_addresses, timeseries) | ||
| metric_samples = timeseries.metric_samples.values() | ||
| expected_metrics_values = { | ||
| "ray_aggregator_agent_published_events_total": 1.0, | ||
| "ray_aggregator_agent_filtered_events_total": 1.0, | ||
| "ray_aggregator_agent_queue_dropped_events_total": 1.0, | ||
| } | ||
| for descriptor, expected_value in expected_metrics_values.items(): | ||
| samples = [ | ||
| m | ||
| for m in metric_samples | ||
| if m.name == descriptor and m.labels[CONSUMER_TAG_KEY] == consumer_name | ||
| ] | ||
| samples = [m for m in metric_samples if m.name == descriptor] | ||
| if not samples: | ||
| return False | ||
| if samples[0].value != expected_value: | ||
| if ( | ||
| samples[0].value != expected_value | ||
| or samples[0].labels[CONSUMER_TAG_KEY] != publisher_name | ||
| ): | ||
| return False | ||
| return True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test function has a bug. It attempts to check for the CONSUMER_TAG_KEY on all metrics, including ray_aggregator_agent_queue_dropped_events_total, which is a global metric and does not have this tag. This will cause a KeyError.
To fix this, ray_aggregator_agent_queue_dropped_events_total should be removed from this test and verified separately with other global metrics. Additionally, the logic for finding samples should be more robust by filtering on the consumer tag directly and using .get() to avoid errors.
| def test_case_publisher_specific_metrics_correct(publisher_name: str): | |
| fetch_prometheus_timeseries(prom_addresses, timeseries) | |
| metric_samples = timeseries.metric_samples.values() | |
| expected_metrics_values = { | |
| "ray_aggregator_agent_published_events_total": 1.0, | |
| "ray_aggregator_agent_filtered_events_total": 1.0, | |
| "ray_aggregator_agent_queue_dropped_events_total": 1.0, | |
| } | |
| for descriptor, expected_value in expected_metrics_values.items(): | |
| samples = [ | |
| m | |
| for m in metric_samples | |
| if m.name == descriptor and m.labels[CONSUMER_TAG_KEY] == consumer_name | |
| ] | |
| samples = [m for m in metric_samples if m.name == descriptor] | |
| if not samples: | |
| return False | |
| if samples[0].value != expected_value: | |
| if ( | |
| samples[0].value != expected_value | |
| or samples[0].labels[CONSUMER_TAG_KEY] != publisher_name | |
| ): | |
| return False | |
| return True | |
| def test_case_publisher_specific_metrics_correct(publisher_name: str): | |
| fetch_prometheus_timeseries(prom_addresses, timeseries) | |
| metric_samples = timeseries.metric_samples.values() | |
| expected_metrics_values = { | |
| "ray_aggregator_agent_published_events_total": 1.0, | |
| "ray_aggregator_agent_filtered_events_total": 1.0, | |
| } | |
| for descriptor, expected_value in expected_metrics_values.items(): | |
| samples = [ | |
| m | |
| for m in metric_samples | |
| if m.name == descriptor | |
| and m.labels.get(CONSUMER_TAG_KEY) == publisher_name | |
| ] | |
| if not samples: | |
| return False | |
| if samples[0].value != expected_value: | |
| return False | |
| return True |
…project#55781)" This reverts commit 7198193. Signed-off-by: joshlee <joshlee@anyscale.com>
7f8e30a to
c36e18f
Compare
|
Fixed by #59965 |
Pull request was closed
This reverts commit 7198193.
Description
Related issues
Additional information