-
Notifications
You must be signed in to change notification settings - Fork 406
fix: Export job execution metrics to Prometheus #1885
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
fix: Export job execution metrics to Prometheus #1885
Conversation
WalkthroughIntroduces Prometheus metrics for job execution outcomes by defining two new CounterVec metrics (succeeded and failed totals) and integrating them into the SetExecutionDone function to track execution results alongside existing hashicorp/go-metrics instrumentation. Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~8 minutes
Poem
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: defaults Review profile: CHILL Plan: Pro 📒 Files selected for processing (2)
🧰 Additional context used📓 Path-based instructions (2)**/*.go📄 CodeRabbit inference engine (AGENTS.md)
Files:
dkron/**/*.go📄 CodeRabbit inference engine (AGENTS.md)
Files:
🧠 Learnings (4)📚 Learning: 2025-12-20T09:34:05.813ZApplied to files:
📚 Learning: 2025-12-20T09:34:05.813ZApplied to files:
📚 Learning: 2025-12-20T09:34:05.813ZApplied to files:
📚 Learning: 2025-12-20T09:34:05.813ZApplied to files:
🧬 Code graph analysis (2)dkron/job_metrics.go (2)
dkron/store.go (1)
🔇 Additional comments (3)
Comment |
|
@vcastellm Please review the PR, as the metrics are still not functioning |
Fix: Export job execution metrics to Prometheus
Problem
In PR #1825, job execution metrics were added using
github.com/armon/go-metricspackage:However, these metrics are not exported to Prometheus even though they are being created.
Root Cause
The issue is an architectural mismatch between two different metrics systems:
hashicorp/go-metrics(aliased asgithub.com/armon/go-metrics) - Creates metrics in its own internal sinkprometheus/client_golang- Exports metrics from the global Prometheus registryThe HTTP endpoint in
dkron/api.gousespromhttp.Handler()fromprometheus/client_golang:This handler exports metrics from the global Prometheus registry, but metrics created via
go-metricsdon't register in this registry. They go to a separatePrometheusSinkfromhashicorp/go-metrics/prometheus, which has its own HTTP handler that is never used in the API endpoint.Why PR #1825 Didn't Work
Looking at the commit history of PR #1825, there were several iterations:
The switch to
go-metricswas likely made to:However, the integration between
go-metricsPrometheus sink andpromhttp.Handler()was not properly implemented.Solution
This PR adds direct Prometheus metrics using
prometheus/client_golang, similar to how it's done inplugin/shell/prometheus.go:Created
dkron/job_metrics.gowith Prometheus metric definitions:dkron_job_executions_succeeded_total{job_name="..."}dkron_job_executions_failed_total{job_name="..."}Updated
dkron/store.goto emit both:go-metricsmetrics (for statsd compatibility)prometheus/client_golangmetrics (for Prometheus export)This ensures:
promhttp.Handler()Testing
The fix has been tested locally:
/metricsendpointExample output:
Summary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings.