feat: add gRPC health check and reflection support to arbiter #21410

Open
scheibinger wants to merge 1 commit into develop from cop-2037/arbiter-healthcheck

@scheibinger

Add gRPC Health Check and Server Reflection to Arbiter Service

Problem

The arbiter gRPC server did not support the standard grpc.health.v1.Health service or server reflection, causing health check probes to fail:

grpcurl -plaintext cre-wf-0-0.nop-b.svc.cluster.local:9876 grpc.health.v1.Health/Check
Error invoking method "grpc.health.v1.Health/Check": failed to query for service descriptor "grpc.health.v1.Health": server does not support the reflection API

Neither the health check service nor server reflection was registered on the gRPC server; neither had been implemented.

Solution

Registered the standard gRPC health check service and server reflection on the arbiter's gRPC server.

Changes in core/services/arbiter/arbiter.go:

  • Added imports for google.golang.org/grpc/health, google.golang.org/grpc/health/grpc_health_v1 (aliased as healthgrpc), and google.golang.org/grpc/reflection
  • Added healthServer *health.Server field to the arbiter struct
  • Registered grpc.health.v1.Health service via healthgrpc.RegisterHealthServer() — enables standard gRPC health check probes (used by Kubernetes, grpcurl, load balancers, etc.)
  • Registered gRPC server reflection via reflection.Register() — enables service discovery tools like grpcurl to introspect available services
  • Set serving status to SERVING in Start() after the service is fully initialized
  • Set serving status to NOT_SERVING in Close() before initiating graceful shutdown, ensuring in-flight health probes get an accurate response during draining

Risk: LOW

Additive change — registers two well-known gRPC services on an existing server. No changes to business logic, existing RPCs, or the arbiter's core behavior. All existing tests continue to pass.

@github-actions

github-actions bot commented Mar 5, 2026

✅ No conflicts with other open PRs targeting develop

@github-actions

github-actions bot commented Mar 5, 2026

I see you updated files related to core. Please run make gocs in the root directory to add a changeset, and include at least one of the following tags in the changeset text:

  • #added For any new functionality added.
  • #breaking_change For any functionality that requires manual action for the node to boot.
  • #bugfix For bug fixes.
  • #changed For any change to the existing functionality.
  • #db_update For any feature that introduces updates to database schema.
  • #deprecation_notice For any upcoming deprecation functionality.
  • #internal For changesets that need to be excluded from the final changelog.
  • #nops For any feature that is NOP facing and needs to be in the official Release Notes for the release.
  • #removed For any functionality/config that is removed.
  • #updated For any functionality that is updated.
  • #wip For any change that is not ready yet and external communication about it should be held off till it is feature complete.

Copilot AI left a comment

Pull request overview

Risk Rating: LOW. Adds standard gRPC health checking and server reflection to the Arbiter gRPC server to support Kubernetes probes and tooling like grpcurl.

Changes:

  • Register grpc.health.v1.Health on the Arbiter gRPC server and track serving status during Start/Close.
  • Register gRPC server reflection for service discovery/introspection.
  • Add a healthServer field to the arbiter service to manage status.

Comment on lines +81 to +82
// Register gRPC server reflection (enables grpcurl and other tools)
reflection.Register(grpcServer)

Copilot AI Mar 5, 2026

reflection.Register(grpcServer) enables server reflection unconditionally. Since this repo otherwise appears not to enable gRPC reflection anywhere else, consider gating this behind a config/env flag (or restricting it to non-production) to avoid exposing service/method metadata on any reachable gRPC port.

Comment on lines +118 to +120
// Mark gRPC health as serving
a.healthServer.SetServingStatus("", healthgrpc.HealthCheckResponse_SERVING)

Copilot AI Mar 5, 2026

Health status is first set explicitly (to SERVING) only after the gRPC Serve goroutine is launched. Until SetServingStatus is called, the health service reports whatever default health.NewServer() provides. In grpc-go that default for the overall ("") service is SERVING, so readiness/health probes can pass before startup has actually completed. (A service name that was never set yields a NOT_FOUND error from Check; SERVICE_UNKNOWN is only reported through Watch.) Consider setting an explicit NOT_SERVING status before starting the gRPC server (e.g., in New or at the beginning of Start) and only switching to SERVING once startup is complete.

Comment on lines +77 to +82
// Register gRPC health check service
healthServer := health.NewServer()
healthgrpc.RegisterHealthServer(grpcServer, healthServer)

// Register gRPC server reflection (enables grpcurl and other tools)
reflection.Register(grpcServer)

Copilot AI Mar 5, 2026

This change introduces new externally visible gRPC behaviors (health check + reflection registration) but there are existing tests for the arbiter service lifecycle in this package. Please add unit/integration tests that (1) call grpc.health.v1.Health/Check and assert SERVING/NOT_SERVING across Start/Close, and (2) verify reflection can list the registered services.
