[stacked 4/5] metrics: add policy system collector. #405

Open: klihub wants to merge 9 commits into main from metrics/policy-system-collector

@klihub (Collaborator) commented Nov 11, 2024

Note: This PR is stacked on top of #408.

Implement collection of policy 'system' Prometheus metrics.
For each memory node, collect:
- memory capacity
- memory usage
- number of containers sharing the node
For each CPU core, collect:
- total container allocations from that core
- number of containers sharing the core
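
The sharing counts and per-core allocations listed above could be derived roughly as in the sketch below. This is an illustration only, not the code in this PR: the `Container` and `CoreStats` types, the function names, and the even split of a container's allocation across its cores are all assumptions made for the example. Memory capacity and usage are left out because they would come from the system rather than from container assignments.

```go
package main

import "fmt"

// Container is a hypothetical, simplified view of one container's resources.
type Container struct {
	Name     string
	MilliCPU int   // total CPU allocation in milli-CPUs
	CPUs     []int // IDs of the CPU cores the container is assigned to
	MemNodes []int // IDs of the memory nodes the container is assigned to
}

// CoreStats holds the two per-core values listed in the description.
type CoreStats struct {
	Allocation int // milli-CPUs attributed to this core (assumed even split)
	Sharing    int // number of containers assigned to this core
}

// CollectCoreStats aggregates per-core allocation and sharing counts.
func CollectCoreStats(containers []Container) map[int]CoreStats {
	stats := make(map[int]CoreStats)
	for _, c := range containers {
		if len(c.CPUs) == 0 {
			continue
		}
		// Assumption: a container's allocation is spread evenly over its cores.
		share := c.MilliCPU / len(c.CPUs)
		for _, id := range c.CPUs {
			s := stats[id]
			s.Allocation += share
			s.Sharing++
			stats[id] = s
		}
	}
	return stats
}

// CountNodeSharing returns how many containers are assigned to each memory node.
func CountNodeSharing(containers []Container) map[int]int {
	sharing := make(map[int]int)
	for _, c := range containers {
		for _, id := range c.MemNodes {
			sharing[id]++
		}
	}
	return sharing
}

func main() {
	containers := []Container{
		{Name: "a", MilliCPU: 1000, CPUs: []int{0, 1}, MemNodes: []int{0}},
		{Name: "b", MilliCPU: 500, CPUs: []int{1}, MemNodes: []int{0, 1}},
	}
	// fmt prints maps with keys in sorted order, so the output is stable.
	fmt.Println("per-core stats:", CollectCoreStats(containers))
	fmt.Println("containers per memory node:", CountNodeSharing(containers))
}
```

Each resulting map entry would correspond to one labeled sample (one per core or node ID) when exported.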

@klihub klihub force-pushed the metrics/policy-system-collector branch 2 times, most recently from f0856b2 to bfc293d Compare November 11, 2024 22:58
@klihub klihub changed the title [3/4] metrics: add policy system collector. [4/5] metrics: add policy system collector. Nov 11, 2024

@pfl pfl left a comment

LGTM

@klihub klihub force-pushed the metrics/policy-system-collector branch 4 times, most recently from 90e774c to ac2049d Compare November 13, 2024 09:44
@klihub klihub marked this pull request as ready for review November 13, 2024 13:29
@klihub klihub changed the title [4/5] metrics: add policy system collector. [stack: 4/5] metrics: add policy system collector. Nov 13, 2024
@klihub klihub changed the title [stack: 4/5] metrics: add policy system collector. [stacked 4/5] metrics: add policy system collector. Nov 13, 2024
@klihub klihub force-pushed the metrics/policy-system-collector branch from ac2049d to 0a3330d Compare November 13, 2024 14:25
Rework our metrics collector registry to take care of most of
the necessary bits of metrics registration, collection and
gathering.

Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
Update cgroupstats collector for the reworked metrics registry.
Split out automatic registration to a register subpackage.

Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
Remove the old resmgr-triggered polling of policy metrics
and the old resmgr-level polling policy metrics collector.
Implement policy metrics collection in the policy package
itself.

Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
Remove the old opencensus-based prometheus exporter. Rework
prometheus exporting using our updated metrics registry and
a promhttp /metrics handler.

Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
Add configuration bits for controlling which metrics are
collected. Enable collection of policy metrics by default.

Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
Remove obsolete and unused option entries. Give a warning about
using the now-obsolete '-metrics-interval' argument. It's used
unconditionally by our existing Helm charts, so we'll phase it
out a bit more gently.

Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
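
One way to phase out an option gently, as the commit above describes, is to keep accepting it while ignoring its value and printing a warning. The sketch below uses only the standard `flag` package; it is not taken from this PR, and the function name `parseArgs`, the duration type of the option, and the warning text are assumptions.

```go
package main

import (
	"flag"
	"fmt"
	"io"
	"os"
)

// obsoleteFlag is still accepted so that existing command lines keep working.
const obsoleteFlag = "metrics-interval"

// parseArgs parses args and returns one warning for each obsolete option
// that was explicitly given on the command line.
func parseArgs(args []string) ([]string, error) {
	fs := flag.NewFlagSet("example", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep the flag package from printing usage itself

	// Assumption: the option takes a duration. Its value is parsed but ignored.
	_ = fs.Duration(obsoleteFlag, 0, "obsolete, accepted but ignored")

	if err := fs.Parse(args); err != nil {
		return nil, err
	}

	var warnings []string
	// Visit only calls back for flags that were actually set.
	fs.Visit(func(f *flag.Flag) {
		if f.Name == obsoleteFlag {
			warnings = append(warnings,
				fmt.Sprintf("option -%s is obsolete and ignored", f.Name))
		}
	})
	return warnings, nil
}

func main() {
	warnings, err := parseArgs(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(2)
	}
	for _, w := range warnings {
		fmt.Fprintln(os.Stderr, "WARNING:", w)
	}
}
```

Because the option stays registered, deployments that still pass it unconditionally start normally and only see the warning.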
Add a metrics/collectors subpackage. When imported it pulls
in and registers the fairly standard buildinfo, process and
golang runtime collectors. Turn on the build info collector
by default.

Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
Simplify the policy-backend metrics collection interface,
reducing it to a single GetMetrics() call and a returned
Metrics interface which simply implements the collector-
like Describe() and Collect() interfaces. Update policy
implementations accordingly.

Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
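
The shape of the simplified interface described above might look roughly like the following sketch. It is standard-library only: `Desc` and `Sample` are local stand-ins for the descriptor and metric types of a metrics library, and `Backend`, `examplePolicy`, `Gather` and the metric name are hypothetical names chosen for illustration. The real signatures in the policy package may differ.

```go
package main

import "fmt"

// Desc and Sample are local stand-ins for a metrics library's types.
type Desc struct{ Name, Help string }

type Sample struct {
	Name  string
	Node  string // label value: memory node ID
	Value float64
}

// Metrics is the collector-like interface returned by a policy backend.
type Metrics interface {
	Describe(chan<- *Desc)
	Collect(chan<- Sample)
}

// Backend exposes its metrics through a single call.
type Backend interface {
	GetMetrics() Metrics
}

// nodeCapacity reports one sample per memory node, indexed by node ID.
type nodeCapacity struct {
	desc  *Desc
	bytes []float64
}

func (m *nodeCapacity) Describe(ch chan<- *Desc) { ch <- m.desc }

func (m *nodeCapacity) Collect(ch chan<- Sample) {
	for id, b := range m.bytes {
		ch <- Sample{Name: m.desc.Name, Node: fmt.Sprint(id), Value: b}
	}
}

// examplePolicy is a hypothetical backend with two memory nodes.
type examplePolicy struct{}

func (examplePolicy) GetMetrics() Metrics {
	return &nodeCapacity{
		desc:  &Desc{Name: "mem_node_capacity", Help: "Capacity of a memory node in bytes."},
		bytes: []float64{16 << 30, 32 << 30},
	}
}

// Gather drains everything a backend's Metrics sends from Collect.
func Gather(b Backend) []Sample {
	ch := make(chan Sample)
	go func() {
		defer close(ch)
		b.GetMetrics().Collect(ch)
	}()
	var out []Sample
	for s := range ch {
		out = append(out, s)
	}
	return out
}

func main() {
	for _, s := range Gather(examplePolicy{}) {
		fmt.Printf("%s{node=%q} %g\n", s.Name, s.Node, s.Value)
	}
}
```

With this shape the caller needs no knowledge of individual metrics: it asks the backend for its `Metrics` once and treats the result like any other collector.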
Implement collection of policy 'system' prometheus metrics.

For each memory node we collect
  - memory capacity
  - memory usage
  - number of containers sharing the node

For each CPU core we collect
  - allocation from that core
  - number of containers sharing the core

Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
@klihub klihub force-pushed the metrics/policy-system-collector branch from 0a3330d to 02aa6d1 Compare November 13, 2024 16:53
2 participants