[stacked 4/5] metrics: add policy system collector. #405

Open
wants to merge 9 commits into
base: main

Commits on Nov 14, 2024

  1. metrics: rework metrics collector registry.

    Rework our metrics collector registry to take care of most
    of the necessary bits for metrics registration, collection
    and gathering. Use the prometheus-provided namespacing and
    subsystems to put all generated metrics under a prefix and
    provide additional grouping.
    
    Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
    klihub committed Nov 14, 2024
    a5c5ec0
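
To illustrate what namespacing and subsystems give the first commit, here is a minimal sketch of how a namespace, a subsystem and a metric name combine into one prefixed name. It reimplements, using only the standard library, the joining rule that client_golang's `prometheus.BuildFQName` applies. The `nri`, `policy` and metric names are illustrative assumptions, not the prefixes this commit actually uses.

```go
package main

import (
	"fmt"
	"strings"
)

// fqName joins the non-empty parts with "_". It mirrors the rule used by
// prometheus.BuildFQName: an empty name always yields an empty result.
func fqName(namespace, subsystem, name string) string {
	if name == "" {
		return ""
	}
	parts := make([]string, 0, 3)
	for _, p := range []string{namespace, subsystem, name} {
		if p != "" {
			parts = append(parts, p)
		}
	}
	return strings.Join(parts, "_")
}

func main() {
	// Hypothetical namespace and subsystem, chosen only for the example.
	fmt.Println(fqName("nri", "policy", "zone_cpu_capacity"))
	fmt.Println(fqName("nri", "", "build_info"))
}
```

Keeping the prefix in the registry rather than in each collector means a collector only has to supply its own short metric names.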
  2. metrics: update cgroupstats collector.

    Update cgroupstats collector for the reworked metrics registry.
    Split out automatic registration to a register subpackage.
    
    Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
    klihub committed Nov 14, 2024
    f78d6ba
  3. resmgr: remove old scattered bits of metrics polling.

    Remove the old resmgr-triggered polling of policy metrics
    and the old resmgr-level polling policy metrics collector.
    Implement policy metrics collection in the policy package
    itself.
    
    Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
    klihub committed Nov 14, 2024
    edcee16
  4. instrumentation: remove opencensus metrics exporter.

    Remove the old opencensus-based prometheus exporter. Rework
    prometheus exporting using our updated metrics registry and
    a promhttp /metrics-handler.
    
    Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
    klihub committed Nov 14, 2024
    477a97c
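
The shape of a registry-backed /metrics endpoint can be sketched as follows. This is a standard-library stand-in, not the code from this commit: with client_golang the handler would typically come from `promhttp.HandlerFor(registry, promhttp.HandlerOpts{})`, whereas here a hypothetical `gatherFunc` plays the registry's role and the handler writes a simplified text format.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sort"
)

// gatherFunc is a hypothetical stand-in for the registry's gather step.
type gatherFunc func() map[string]float64

// metricsHandler renders whatever gather returns on each scrape,
// one "name value" line per metric, sorted by name for stable output.
func metricsHandler(gather gatherFunc) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		metrics := gather()
		names := make([]string, 0, len(metrics))
		for name := range metrics {
			names = append(names, name)
		}
		sort.Strings(names)
		w.Header().Set("Content-Type", "text/plain; charset=utf-8")
		for _, name := range names {
			fmt.Fprintf(w, "%s %g\n", name, metrics[name])
		}
	})
}

// newMux mounts the handler at /metrics, the path named in the commit.
func newMux(gather gatherFunc) *http.ServeMux {
	mux := http.NewServeMux()
	mux.Handle("/metrics", metricsHandler(gather))
	return mux
}

// scrape issues an in-process GET /metrics and returns status and body.
func scrape(h http.Handler) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/metrics", nil))
	return rec.Code, rec.Body.String()
}

func main() {
	gather := func() map[string]float64 {
		return map[string]float64{"example_up": 1} // illustrative metric
	}
	code, body := scrape(newMux(gather))
	fmt.Println(code)
	fmt.Print(body)
}
```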
  5. config: expose metrics configuration.

    Add configuration bits for controlling which metrics are
    collected. Enable collection of policy metrics by default.
    
    Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
    klihub committed Nov 14, 2024
    4571258
  6. resmgr: warn about obsolete command line argument.

    Remove obsolete and unused option entries. Give a warning about
    using the now-obsolete '-metrics-interval' argument. It's used
    unconditionally by our existing Helm charts, so we'll phase it
    out a bit more gently.
    
    Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
    klihub committed Nov 14, 2024
    5fe84ba
  7. metrics: add standard collectors.

    Add a metrics/collectors subpackage. When imported it pulls
    in and registers the fairly standard buildinfo, process and
    golang runtime collectors. Turn on the build info collector
    by default.
    
    Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
    klihub committed Nov 14, 2024
    c55507e
  8. policy: rework policy/backend metrics interface.

    Simplify the policy-backend metrics collection interface,
    reducing it to a single GetMetrics() call and a returned
    Metrics interface which simply implements the collector-
    like Describe() and Collect() interfaces. Update policy
    implementations accordingly.
    
    Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
    klihub committed Nov 14, 2024
    185b1e3
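
A rough sketch of the interface shape this commit message describes: a backend exposes a single GetMetrics() call, and the returned Metrics value implements Describe() and Collect(). To stay dependency-free, `Desc` and `Metric` below are simplified stand-ins for `prometheus.Desc` and `prometheus.Metric`, and `toyPolicy`, `Backend` and `gather` are hypothetical names, not identifiers from this PR.

```go
package main

import "fmt"

// Desc and Metric are simplified stand-ins for prometheus.Desc and
// prometheus.Metric, so the sketch needs no external module.
type Desc struct{ Name, Help string }

type Metric struct {
	Desc  *Desc
	Value float64
}

// Metrics is the collector-like interface a backend returns.
type Metrics interface {
	Describe(chan<- *Desc)
	Collect(chan<- Metric)
}

// Backend is a hypothetical policy backend with a single metrics entry point.
type Backend interface {
	GetMetrics() Metrics
}

// toyPolicy is an illustrative backend that reports one value.
type toyPolicy struct {
	desc       *Desc
	containers int
}

// Compile-time check that toyPolicy satisfies the Backend interface.
var _ Backend = (*toyPolicy)(nil)

func (p *toyPolicy) GetMetrics() Metrics      { return p }
func (p *toyPolicy) Describe(ch chan<- *Desc) { ch <- p.desc }
func (p *toyPolicy) Collect(ch chan<- Metric) {
	ch <- Metric{Desc: p.desc, Value: float64(p.containers)}
}

// gather drains Collect() into a slice, roughly as a registry would.
func gather(m Metrics) []Metric {
	ch := make(chan Metric)
	go func() {
		m.Collect(ch)
		close(ch)
	}()
	var out []Metric
	for metric := range ch {
		out = append(out, metric)
	}
	return out
}

func main() {
	p := &toyPolicy{
		desc:       &Desc{Name: "containers", Help: "Containers managed by the policy."},
		containers: 3,
	}
	for _, m := range gather(p.GetMetrics()) {
		fmt.Printf("%s %g\n", m.Desc.Name, m.Value)
	}
}
```

With this shape the generic policy layer never needs to know which metrics a backend has; it only forwards Describe() and Collect().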
  9. policy: implement policy system metrics.

    Implement collection of policy 'system' prometheus metrics.
    
    For each memory node we collect
      - memory capacity
      - memory usage
      - number of containers sharing the node
    
    For each CPU core we collect
      - allocation from that core
      - number of containers sharing the core
    
    Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
    klihub committed Nov 14, 2024
    2654f63
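
The per-node and per-core metrics listed in the last commit could be laid out roughly as below: one labelled sample per memory node or CPU core for each quantity. This is only a sketch under stated assumptions. The metric names, the `node` and `cpu` label names, the milli-CPU unit and the `memNode`/`cpuCore` types are all hypothetical and are not taken from the PR.

```go
package main

import (
	"fmt"
	"strconv"
)

// memNode and cpuCore are hypothetical snapshots of policy state.
type memNode struct {
	Capacity, Usage uint64 // bytes
	Containers      int    // containers sharing the node
}

type cpuCore struct {
	AllocatedMilli int // milli-CPUs allocated from this core (assumed unit)
	Containers     int // containers sharing the core
}

// sample is one labelled metric value.
type sample struct {
	Name, Label, ID string
	Value           float64
}

// String renders the sample in a Prometheus-like text form.
func (s sample) String() string {
	return fmt.Sprintf("%s{%s=%q} %g", s.Name, s.Label, s.ID, s.Value)
}

// collectSystem emits three samples per memory node and two per CPU core.
// The slice index is used as the node or core ID.
func collectSystem(nodes []memNode, cores []cpuCore) []sample {
	var out []sample
	for id, n := range nodes {
		node := strconv.Itoa(id)
		out = append(out,
			sample{"mem_node_capacity_bytes", "node", node, float64(n.Capacity)},
			sample{"mem_node_usage_bytes", "node", node, float64(n.Usage)},
			sample{"mem_node_containers", "node", node, float64(n.Containers)},
		)
	}
	for id, c := range cores {
		cpu := strconv.Itoa(id)
		out = append(out,
			sample{"cpu_core_allocation_millicpus", "cpu", cpu, float64(c.AllocatedMilli)},
			sample{"cpu_core_containers", "cpu", cpu, float64(c.Containers)},
		)
	}
	return out
}

func main() {
	nodes := []memNode{{Capacity: 1024, Usage: 512, Containers: 2}}
	cores := []cpuCore{{AllocatedMilli: 500, Containers: 1}}
	for _, s := range collectSystem(nodes, cores) {
		fmt.Println(s)
	}
}
```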