Provide a quick view that shows the health of a load balancer in a GKE app #335

@BSick7

Description

Recently, we attached a load balancer to a GKE app.

It could fail in several ways, and the cause was difficult to track down:

  • Is the gateway launched/healthy?
  • Is the SSL certificate provisioned properly?
  • Is the health check correct?
  • Is the port correct?

The information needed to diagnose this sprawls across several GCP and k8s resources. Here is a checklist:

  1. kubectl describe gateway + kubectl describe httproute (Accepted/Programmed/ResolvedRefs)
  2. kubectl get endpointslice (does the Service actually have endpoints?)
  3. gcloud compute backend-services get-health … (why unhealthy)
  4. gcloud compute health-checks describe … (what it’s probing)
  5. gcloud compute network-endpoint-groups list-network-endpoints … (are endpoints registered?)
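
As a rough sketch of what a "quick view" could run under the hood, the checklist above can be wrapped in one shell function. Everything passed in (namespace, gateway, route, service, backend service, health check, NEG, zone) is a hypothetical placeholder that the caller must supply; the issue does not give these values. The `--global` flag is an assumption that fits a global load balancer; a regional one would need `--region` instead. The `kubernetes.io/service-name` label is the standard label on EndpointSlices.

```shell
#!/usr/bin/env bash
# Sketch only: run the five checklist steps in order for one app.
# All argument names are hypothetical placeholders, not values from the issue.

# run: print the command, then execute it unless DRY_RUN=1.
run() {
  printf '+ %s\n' "$*"
  if [ "${DRY_RUN:-0}" != "1" ]; then
    "$@" || printf '! command failed (exit %s)\n' "$?"
  fi
}

# lb_checklist NAMESPACE GATEWAY ROUTE SERVICE BACKEND_SERVICE HEALTH_CHECK NEG ZONE
lb_checklist() {
  if [ "$#" -ne 8 ]; then
    echo "usage: lb_checklist NAMESPACE GATEWAY ROUTE SERVICE BACKEND_SERVICE HEALTH_CHECK NEG ZONE" >&2
    return 2
  fi
  local ns="$1" gateway="$2" route="$3" service="$4"
  local backend="$5" health_check="$6" neg="$7" zone="$8"

  # 1. Gateway and HTTPRoute conditions (Accepted/Programmed/ResolvedRefs)
  run kubectl describe gateway "$gateway" -n "$ns"
  run kubectl describe httproute "$route" -n "$ns"

  # 2. Does the Service actually have endpoints?
  run kubectl get endpointslice -n "$ns" -l "kubernetes.io/service-name=$service"

  # 3. Why are backends unhealthy? (--global is an assumption; use --region for regional LBs)
  run gcloud compute backend-services get-health "$backend" --global

  # 4. What is the health check probing? (same --global assumption)
  run gcloud compute health-checks describe "$health_check" --global

  # 5. Are endpoints registered in the NEG?
  run gcloud compute network-endpoint-groups list-network-endpoints "$neg" --zone "$zone"
}
```

Calling `DRY_RUN=1 lb_checklist my-ns my-gateway my-route my-svc my-backend my-hc my-neg us-central1-a` (all names made up) prints the six commands without executing them, which is a cheap way to check the arguments before running against a real cluster.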

Ironically, I only found the solution by navigating to the pod in the GCP console.

Some interesting notes:

  • Clicking on the replica set shows no useful information
  • Clicking on the pod revision shows no useful information either -- it just navigates to the replica set (wtf?)
  • I had to click on the new pod's name to find the information under the Events tab, which revealed a readiness probe issue
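
The same pod events can probably be pulled from the command line, which a quick view could surface directly. The sketch below assumes the failure shows up as kubelet events with reason `Unhealthy` (the reason kubelet uses for failed readiness, liveness, and startup probes); the function names and the namespace/pod arguments are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: list probe-failure events for one pod.
# Function names and arguments are hypothetical, not part of any existing tool.

# probe_event_selector POD: build the field selector for probe-failure events.
probe_event_selector() {
  printf 'involvedObject.kind=Pod,involvedObject.name=%s,reason=Unhealthy' "$1"
}

# pod_probe_events NAMESPACE POD: show the matching events via kubectl.
pod_probe_events() {
  local ns="$1" pod="$2"
  kubectl get events -n "$ns" --field-selector "$(probe_event_selector "$pod")"
}
```

For example, `pod_probe_events my-ns my-pod-abc123` (made-up names) would list events such as "Readiness probe failed: ..." if any exist for that pod.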
