fix(native): Fix OS metrics to report cumulative values for AVG type #26517

lingbin · 2025-11-03T16:24:39Z

All 6 OS-related metrics were defined as AVG type but reported as
delta values, causing incorrect averaging and potential data loss
in Prometheus monitoring.

Changed metrics to report cumulative values since process start:

- presto_cpp.os_user_cpu_time_micros
- presto_cpp.os_system_cpu_time_micros
- presto_cpp.os_num_soft_page_faults
- presto_cpp.os_num_hard_page_faults
- presto_cpp.os_num_voluntary_context_switches
- presto_cpp.os_num_forced_context_switches

This ensures:

1. Alignment with other AVG metrics in the system (task counts, cache sizes, etc.)
2. Proper rate calculations in monitoring systems and no data loss regardless of scraping intervals
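
To make the data-loss claim concrete, here is a minimal sketch of a gauge that the reporter overwrites on every push while a scraper samples it only every few pushes. The names `ScrapeResult` and `simulate` are hypothetical and exist only for this illustration; this is not Presto or Velox code.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model, not Presto or Velox code.
struct ScrapeResult {
  int64_t sumOfScrapedDeltas;     // total a dashboard would rebuild from delta samples
  int64_t lastScrapedCumulative;  // total read directly from cumulative samples
  int64_t trueTotal;              // ground truth across all pushes
};

// perPushDeltas[i] is the activity observed during push interval i.
// The scraper samples the gauge once every `scrapeEvery` pushes.
ScrapeResult simulate(const std::vector<int64_t>& perPushDeltas,
                      std::size_t scrapeEvery) {
  ScrapeResult result{0, 0, 0};
  int64_t deltaGauge = 0;
  int64_t cumulativeGauge = 0;
  for (std::size_t i = 0; i < perPushDeltas.size(); ++i) {
    result.trueTotal += perPushDeltas[i];
    deltaGauge = perPushDeltas[i];       // delta reporting: only the last interval survives
    cumulativeGauge = result.trueTotal;  // cumulative reporting: running total survives
    if ((i + 1) % scrapeEvery == 0) {
      result.sumOfScrapedDeltas += deltaGauge;
      result.lastScrapedCumulative = cumulativeGauge;
    }
  }
  return result;
}
```

With six pushes of 10, 20, 30, 40, 50 and 60 and a scrape every third push, the delta path sees only 30 and 60, while the cumulative path ends at the true total of 210.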

== NO RELEASE NOTE ==

sourcery-ai · 2025-11-03T16:24:46Z

Reviewer's guide (collapsed on small PRs)

Reviewer's Guide

This PR converts six OS-related metrics from reporting delta values to reporting cumulative values by eliminating subtraction of previous readings and removing obsolete state variables used for delta calculations.

Class diagram for updated PeriodicTaskManager OS metrics logic

classDiagram
class PeriodicTaskManager {
  -lastHttpClientNumConnectionsCreated_: int64_t
  +updateOperatingSystemStats()
  +addOperatingSystemStatsUpdateTask()
}

%% Removed attributes for OS metric deltas
%% lastUserCpuTimeUs_, lastSystemCpuTimeUs_, lastSoftPageFaults_, lastHardPageFaults_, lastVoluntaryContextSwitches_, lastForcedContextSwitches_ are no longer present

Flow diagram for OS metrics reporting change (delta to cumulative)

flowchart TD
    A["Collect OS metric (e.g., user CPU time)"] --> B["Report cumulative value since process start"]
    B --> C["RECORD_METRIC_VALUE(metric, cumulative_value)"]
    %% Previously: A --> D["Subtract previous value (delta)"] --> C
    %% Now: direct cumulative reporting

File-Level Changes

| Change | Details | Files |
| --- | --- | --- |
| Switch OS metrics reporting from delta to cumulative values | Removed subtraction of last recorded values when calling RECORD_METRIC_VALUE; updated RECORD_METRIC_VALUE calls to directly use current usage values for all six metrics | `presto_cpp/main/PeriodicTaskManager.cpp` |
| Remove unused state variables for tracking previous metric values | Deleted lastUserCpuTimeUs_, lastSystemCpuTimeUs_, lastSoftPageFaults_, lastHardPageFaults_, lastVoluntaryContextSwitches_, and lastForcedContextSwitches_ members | `presto_cpp/main/PeriodicTaskManager.h` |

Assessment against linked issues

| Issue | Objective | Addressed |
| --- | --- | --- |
| #26516 | Change all 6 OS-related AVG type metrics to report cumulative values instead of delta values. | ✅ |
| #26516 | Ensure consistency of OS AVG type metrics with other AVG metrics in the system (i.e., all report cumulative values). | ✅ |
| #26516 | Prevent data loss in Prometheus monitoring by reporting cumulative values for OS AVG type metrics. | ✅ |

Possibly linked issues

[native] OS Metrics(AVG Type Counters) should not report delta values #26516: The PR changes 6 OS metrics from reporting delta to cumulative values, directly addressing the issue's problem of incorrect averaging and data loss.

sourcery-ai

Hey there - I've reviewed your changes and they look great!

lingbin · 2025-11-04T03:19:19Z

@majetideepak Could you help review this PR? Thanks.

lingbin · 2025-11-05T13:27:49Z

@majetideepak @karteekmurthys @aditi-pandit Kindly ping. Could you please review this PR? This issue affects the accuracy of Prometheus monitoring metrics.

jaystarshot · 2025-11-05T15:36:30Z

I think we should also change the metric type to SUM? In my understanding AVG type should just report the current value.

lingbin · 2025-11-05T18:39:12Z

> I think we should also change the metric type to SUM? In my understanding AVG type should just report the current value.

@jaystarshot Thanks for your reply. We might first need to clarify the meaning of "current value" here: should it be a "current delta value" or "current cumulative value"?

When we say "delta value" for metrics, it implicitly implies a "time span," such as a 5-second span or a 30-second span. It seems only the "cumulative value" corresponds to a "point in time," that is, the "current value" at a specific point in time.

For OS-counters, the getrusage() function returns the "current cumulative value" (not the "delta") accumulated since process startup for each metric.
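
As a minimal sketch of that point, the following reads the six counters with the POSIX `getrusage()` call. The `OsCounters` struct and the `readOsCounters` and `toMicros` helpers are hypothetical names for this illustration; the mapping of `ru_*` fields to metrics is my assumption and is not copied from PeriodicTaskManager.

```cpp
#include <sys/resource.h>
#include <sys/time.h>
#include <cstdint>

// Hypothetical container for the six values; all are cumulative since
// process start because that is what getrusage() reports.
struct OsCounters {
  int64_t userCpuTimeUs;
  int64_t systemCpuTimeUs;
  int64_t softPageFaults;
  int64_t hardPageFaults;
  int64_t voluntaryContextSwitches;
  int64_t forcedContextSwitches;
};

// Convert a timeval (seconds + microseconds) to microseconds.
inline int64_t toMicros(const timeval& tv) {
  return static_cast<int64_t>(tv.tv_sec) * 1'000'000 +
      static_cast<int64_t>(tv.tv_usec);
}

// Returns false if getrusage() fails; otherwise fills `out`.
bool readOsCounters(OsCounters& out) {
  rusage usage{};
  if (getrusage(RUSAGE_SELF, &usage) != 0) {
    return false;
  }
  out.userCpuTimeUs = toMicros(usage.ru_utime);
  out.systemCpuTimeUs = toMicros(usage.ru_stime);
  out.softPageFaults = usage.ru_minflt;             // minor faults, no I/O
  out.hardPageFaults = usage.ru_majflt;             // major faults, required I/O
  out.voluntaryContextSwitches = usage.ru_nvcsw;
  out.forcedContextSwitches = usage.ru_nivcsw;      // involuntary switches
  return true;
}
```

Calling `readOsCounters` twice should never show the CPU time fields decreasing, which is what makes the values usable as-is for cumulative reporting.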

Regarding whether to change it to a SUM type (the SUM type reports a delta value), I believe that would also resolve the data loss described in #26516, because I previously submitted PR #23622 to fix SUM type metric reporting, so the reported value is accumulated in PrometheusReporter. However, because getrusage() returns the current cumulative value, changing to a SUM type (which requires reporting the delta) would mean saving the old value and periodically calculating the difference. This seems a bit redundant compared to directly recording the result of getrusage(). What do you think? Looking forward to your further feedback.
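
For comparison, a rough sketch of the extra state the SUM route would need for each metric. `DeltaTracker` is a hypothetical helper written for this comment; it only mirrors the idea of the removed `last...` members and is not the original code.

```cpp
#include <cstdint>

// Hypothetical helper: turns successive cumulative readings into deltas.
// One instance would be needed per metric.
class DeltaTracker {
 public:
  // Returns the change since the previous call and remembers the new reading.
  int64_t update(int64_t cumulative) {
    const int64_t delta = cumulative - last_;
    last_ = cumulative;
    return delta;
  }

 private:
  int64_t last_{0};  // previous cumulative reading; 0 before the first call
};
```

With the AVG route none of this bookkeeping is required, since the cumulative reading is reported directly.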

jaystarshot · 2025-11-05T19:36:42Z

@lingbin The root cause of all this is that the Velox metric type AVERAGE is unclear, and there is no documentation on what it should represent.
Like you, I am also confused about whether it should be the average since the last report or the average since start. In the Presto Prometheus reporter, the average type is represented as the last value received (check here).

In our production we use metric type SUM and use (persecond) and other differentials to get an accurate view in Grafana directly.

Regarding this change: if getrusage() returns the "current cumulative value" (not the "delta"), then for our Prometheus reporting you can keep this change, but be aware that it only reports the last value received.

That is, if the puller pulls every 5 seconds, it will only pick up the last value received, which I think is acceptable.

aditi-pandit · 2025-11-05T23:54:41Z

@xiaoxmeng @amitkdutta Could you please comment? It's possible these metrics are monitored at Meta, since Meng added them originally.

lingbin · 2025-11-06T07:18:54Z

@jaystarshot Thank you for your further explanation.

> The root cause of all this is that the velox metric type AVERAGE is unclear and there is no documentation on what that should represent. I am also like you confused on whether it should be average from the last reported or just average since start.

I'm also looking forward to a specific and consistent explanation of how to use each metric type.

> In our production we use metric type SUM and use (persecond) and other differential to get accurate view in grapahana directly.

Do you mean that in your production environment code, these AVG metrics (the six OS metrics mentioned here) have already been modified to SUM type?

lingbin · 2025-11-06T07:52:44Z

After careful consideration, I've realized that for metrics whose values semantically increase monotonically (like the six OS-related metrics here), the value ultimately stored in Prometheus MUST NOT be a delta value. In Prometheus's pull model, the interval for pulling metrics differs from the push interval of PeriodicStatsReporter, so if deltas are stored, metric data will be lost or wrong. (For examples, see #26516 and #23622 (comment).)

Firstly, from an implementation perspective, there are only two ways to report each metric:
1. Directly report the cumulative value (also known as the "current cumulative value" or the "most recently received value"), which corresponds to the current Velox AVG type.
2. Report the delta value and accumulate it within the Prometheus reporter (PrometheusStatsReporter), which corresponds to the current Velox SUM type.

For the OS metrics here, I think either method solves the current problem.

FYI: Velox's AVG and SUM types both correspond to Prometheus's Gauge type (https://prometheus.io/docs/concepts/metric_types/#gauge). For the AVG type, each report overwrites the old value with the new one; for the SUM type, each report is added to the old value.
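
A small model of the two reporting methods described above, as I understand them. `GaugeModel`, `storeViaAvg` and `storeViaSum` are hypothetical names and this is not the real PrometheusStatsReporter API; it only encodes "AVG overwrites, SUM accumulates".

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one gauge held by the reporter.
struct GaugeModel {
  int64_t value{0};
  void reportAvg(int64_t current) { value = current; }  // overwrite old value
  void reportSum(int64_t delta) { value += delta; }     // add to old value
};

// Method 1: push each cumulative reading directly (AVG-style).
int64_t storeViaAvg(const std::vector<int64_t>& cumulativeReadings) {
  GaugeModel gauge;
  for (int64_t reading : cumulativeReadings) {
    gauge.reportAvg(reading);
  }
  return gauge.value;
}

// Method 2: push the difference between readings (SUM-style); the caller
// has to remember the previous reading to compute each delta.
int64_t storeViaSum(const std::vector<int64_t>& cumulativeReadings) {
  GaugeModel gauge;
  int64_t previous = 0;
  for (int64_t reading : cumulativeReadings) {
    gauge.reportSum(reading - previous);
    previous = reading;
  }
  return gauge.value;
}
```

Under this model both paths leave the same cumulative value in the gauge, so the choice comes down to which form is cheaper to produce at the source.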

Secondly, after the Prometheus server obtains the cumulative value, it can be displayed (for example, in Grafana) in two ways, depending on the semantics of the metric:
1. Displaying the difference or rate: use Prometheus's rate() function (delta per second) or increase() function (delta between two pull intervals). This suits the six OS-related metrics mentioned here, whose values semantically increase monotonically.
2. Displaying the real-time value: this suits metrics such as driver counts, for example "presto_cpp.num_on_thread_drivers".
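
To illustrate the first display option, a simplified sketch of how a difference and a per-second rate fall out of two scraped cumulative samples. `Sample`, `increaseBetween` and `ratePerSecond` are hypothetical names; the real Prometheus rate() and increase() functions work over a range of samples and also handle resets and extrapolation, which this deliberately ignores.

```cpp
// Hypothetical scraped sample: when it was taken and the cumulative value.
struct Sample {
  double timestampSec;
  double value;
};

// Difference between two cumulative samples (later minus earlier).
double increaseBetween(const Sample& earlier, const Sample& later) {
  return later.value - earlier.value;
}

// Average per-second change between the two samples; 0 if the timestamps
// do not advance, to avoid dividing by zero.
double ratePerSecond(const Sample& earlier, const Sample& later) {
  const double elapsed = later.timestampSec - earlier.timestampSec;
  if (elapsed <= 0.0) {
    return 0.0;
  }
  return increaseBetween(earlier, later) / elapsed;
}
```

For example, user CPU time going from 1,000,000 to 3,500,000 microseconds over a 5 second scrape gap is an increase of 2,500,000 microseconds, or 500,000 microseconds of CPU per second. The result does not depend on how often the reporter pushed in between.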

Perhaps we should document both methods so that developers can choose between them based on their needs. If the cumulative value is easier to obtain, use the AVG type. If the difference is easier to obtain, use the SUM type. The only point to note is that calculating the difference is generally more tedious, because it requires saving the old values.

Looking forward to everyone's guidance and suggestions for better practices, especially for usage already in production environments, thanks.

jaystarshot · 2025-11-06T16:45:55Z

No, not these, but we have changed some of the ones we do use. For CPU we currently just use our host metric system.
Ack. This change looks good to me, so I will wait a day or two for any additional comments from reviewers before approving.

aditi-pandit

Let's get @amitkdutta's or @xiaoxmeng's approval before submission. I have pinged Amit.

From IBM we are okay with this change. But I would prefer Meta confirm as well.

amitkdutta

Looks good. Thanks @lingbin

aditi-pandit

Thanks.

lingbin · 2025-11-10T13:01:55Z

Already rebased to re-trigger CI.

lingbin requested review from a team as code owners November 3, 2025 16:24

sourcery-ai bot reviewed Nov 3, 2025

jaystarshot approved these changes Nov 7, 2025

aditi-pandit requested changes Nov 7, 2025

aditi-pandit requested review from amitkdutta and xiaoxmeng November 8, 2025 01:03

amitkdutta approved these changes Nov 8, 2025

aditi-pandit approved these changes Nov 8, 2025

lingbin force-pushed the native-fix-os-counters branch from 22ec2df to cdc9c30 Compare November 10, 2025 13:01

lingbin force-pushed the native-fix-os-counters branch from cdc9c30 to d05707b Compare November 11, 2025 02:43
