Add plugin health monitoring to /health endpoint #1884
Health Endpoint Improvement - Complete ✅
Summary
Implemented comprehensive health checking for plugin processes in the /health endpoint to detect when plugins exit (e.g., due to OOM), as requested in the issue.
Changes Made
1. Plugin Client Tracking (cmd/plugins.go)
   - Tracks plugin.Client instances alongside dispensed plugin interfaces
2. Agent Integration (dkron/agent.go, dkron/options.go, cmd/agent.go)
3. Health Endpoint Enhancement (dkron/api.go)
   - Checks plugin status via client.Exited()
4. Testing (dkron/api_test.go)
API Behavior
Healthy: HTTP 200 OK - {"status":"healthy","leader":true}
Unhealthy: HTTP 503 Service Unavailable - {"status":"unhealthy","issues":["plugin X has exited"],"leader":true}
Recent Updates
Testing & Validation
✅ Build: Successful compilation after merge
✅ Manual Test: Verified both healthy and unhealthy states
✅ Unit Test: Added TestHealthEndpoint
✅ Code Review: Addressed all review comments
✅ Security Scan: 0 vulnerabilities (CodeQL)
Impact
Addresses Issue Requirements
✅ Health endpoint checks all loaded plugins are running
✅ Returns non-200 status code when unhealthy
✅ Provides actionable health information
✅ Includes cluster health (leader status)
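For reviewers who want a concrete picture of the behavior listed above, here is a minimal sketch in Go. It is not the code in dkron/api.go. It assumes a plain net/http handler, and the names exitChecker, fakeClient, healthResponse, evaluateHealth, and healthHandler are hypothetical, introduced only for illustration. The one external detail it relies on is that a plugin client exposes an Exited() bool method, as mentioned in the change list. The plugin names in main are placeholders.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"sort"
)

// exitChecker is a hypothetical interface for this sketch; any plugin
// client exposing Exited() bool would satisfy it.
type exitChecker interface {
	Exited() bool
}

// fakeClient stands in for a real plugin client so the sketch runs alone.
type fakeClient struct{ exited bool }

func (f fakeClient) Exited() bool { return f.exited }

// healthResponse mirrors the JSON bodies shown under "API Behavior".
type healthResponse struct {
	Status string   `json:"status"`
	Issues []string `json:"issues,omitempty"`
	Leader bool     `json:"leader"`
}

// evaluateHealth returns the HTTP status code and body for the given
// plugin clients. Names are sorted so the issue list is deterministic.
func evaluateHealth(plugins map[string]exitChecker, leader bool) (int, healthResponse) {
	resp := healthResponse{Status: "healthy", Leader: leader}
	names := make([]string, 0, len(plugins))
	for name := range plugins {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		if plugins[name].Exited() {
			resp.Issues = append(resp.Issues, fmt.Sprintf("plugin %s has exited", name))
		}
	}
	if len(resp.Issues) > 0 {
		resp.Status = "unhealthy"
		return http.StatusServiceUnavailable, resp
	}
	return http.StatusOK, resp
}

// healthHandler wraps evaluateHealth in an http.HandlerFunc.
func healthHandler(plugins map[string]exitChecker, isLeader func() bool) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		code, resp := evaluateHealth(plugins, isLeader())
		w.Header().Set("Content-Type", "application/json")
		w.WriteHeader(code)
		_ = json.NewEncoder(w).Encode(resp)
	}
}

func main() {
	// Placeholder plugin names; one of them is marked as exited.
	plugins := map[string]exitChecker{
		"http":  fakeClient{exited: false},
		"shell": fakeClient{exited: true},
	}
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/health", nil)
	healthHandler(plugins, func() bool { return true })(rec, req)
	fmt.Println(rec.Code, rec.Body.String())
}
```

Keeping the status decision in a pure function (evaluateHealth here) makes both the healthy and unhealthy paths easy to unit test without starting real plugin processes.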