Prometheus Support for Metrics logging. #148

Open
briancoutinho opened this issue Jun 14, 2023 · 0 comments
TLDR

Dynolog provides system telemetry at Meta as well as in open source environments. This issue proposes adding metric logging via Prometheus, an industry-standard framework for logging and exporting metrics. The same support can also be leveraged by the Meta AI Research SuperCluster and by other clusters built on open source infrastructure.

Prometheus

Prometheus is an open source tool for metrics collection and publishing. One can use it to monitor metrics remotely, graph them, and integrate with Grafana for visualization.

  • A core concept in Prometheus is its data model. It consists of labels, a list of attributes that identify the entity a metric belongs to (e.g. {nodename, gpu id}), and metrics, numerical values that represent points in a time series.
  • The Prometheus server runs on the box or node. Typically, it uses a pull model: it periodically scrapes the latest values of metrics and labels from the targets it monitors. (Visualized in the diagram above.)
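To make the data model concrete, here is a minimal sketch of how one labeled sample could be rendered as a line of the Prometheus text exposition format, which is what a server receives when it pulls from a target. The `Sample` struct, the `renderSample` function, and the metric name `gpu_utilization_percent` are hypothetical names chosen for illustration; "gpu id" is written as `gpu_id` because Prometheus label names may not contain spaces. Escaping of special characters in label values is left out to keep the sketch short.

```cpp
#include <map>
#include <sstream>
#include <string>

// One point in a time series: a metric name, the labels that identify
// the entity it belongs to, and the current numerical value.
// (Hypothetical type, for illustration only.)
struct Sample {
  std::string name;
  std::map<std::string, std::string> labels;
  double value;
};

// Renders a sample as one line of the Prometheus text exposition format:
//   name{label1="v1",label2="v2"} value
// Labels come out sorted by key because std::map is ordered.
// NOTE: label values are not escaped here; a real exporter must escape
// backslashes, double quotes, and newlines.
std::string renderSample(const Sample& s) {
  std::ostringstream out;
  out << s.name;
  if (!s.labels.empty()) {
    out << '{';
    bool first = true;
    for (const auto& kv : s.labels) {
      if (!first) out << ',';
      out << kv.first << "=\"" << kv.second << '"';
      first = false;
    }
    out << '}';
  }
  out << ' ' << s.value;
  return out.str();
}
```

For example, a sample named `gpu_utilization_percent` with labels `nodename="node0"` and `gpu_id="0"` and value 87.5 renders as `gpu_utilization_percent{gpu_id="0",nodename="node0"} 87.5`.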

Implementation

We can leverage the prometheus-cpp library (https://github.com/jupp0r/prometheus-cpp/), which is straightforward to use.
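As a rough sketch of what using prometheus-cpp could look like, the class below exposes one gauge family over HTTP and updates it per (nodename, gpu_id) pair. It is not Dynolog's actual design: `GpuUtilExporter`, `bindAddress`, the metric name, and the label names are all hypothetical. The `__has_include` guard is there only so the sketch also compiles on machines where prometheus-cpp is not installed.

```cpp
#include <memory>
#include <string>

// Only pull in prometheus-cpp when its headers are available.
#if defined(__has_include)
#if __has_include(<prometheus/exposer.h>)
#include <prometheus/exposer.h>
#include <prometheus/family.h>
#include <prometheus/gauge.h>
#include <prometheus/registry.h>
#define SKETCH_HAVE_PROMETHEUS_CPP 1
#endif
#endif

// Hypothetical helper: builds the "host:port" string the Exposer binds to.
std::string bindAddress(const std::string& host, int port) {
  return host + ":" + std::to_string(port);
}

#ifdef SKETCH_HAVE_PROMETHEUS_CPP
// Hypothetical wrapper that owns the HTTP endpoint and one gauge family.
class GpuUtilExporter {
 public:
  explicit GpuUtilExporter(const std::string& address)
      : exposer_(address),  // starts the embedded HTTP server
        registry_(std::make_shared<prometheus::Registry>()),
        family_(prometheus::BuildGauge()
                    .Name("gpu_utilization_percent")  // assumed metric name
                    .Help("GPU utilization as a percentage")
                    .Register(*registry_)) {
    // Make the registry's metrics available for scraping
    // (served under the library's default URI, "/metrics").
    exposer_.RegisterCollectable(registry_);
  }

  // Sets the gauge for one (nodename, gpu_id) label combination.
  // Recent prometheus-cpp releases return the existing series from Add()
  // when the labels were seen before; check this against the pinned version.
  void set(const std::string& nodename, const std::string& gpu_id,
           double value) {
    family_.Add({{"nodename", nodename}, {"gpu_id", gpu_id}}).Set(value);
  }

 private:
  // Declaration order matters: registry_ must exist before family_.
  prometheus::Exposer exposer_;
  std::shared_ptr<prometheus::Registry> registry_;
  prometheus::Family<prometheus::Gauge>& family_;
};
#endif
```

A caller could construct `GpuUtilExporter exporter(bindAddress("127.0.0.1", 8080));` and call `exporter.set("node0", "0", 87.5)` on each sampling interval; the address and port here are arbitrary. A Prometheus server configured with that target would then pull the values on its own schedule.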
