CP-23050: Validate CloudZero Metrics are available from Kube State Metrics #64

Merged: 7 commits, Dec 3, 2024

@bdrennz bdrennz commented Nov 27, 2024

This PR modifies the kube_state_metrics_reachable post-start job to also validate that all CloudZero KSM (Kube State Metrics) metrics are available. Sample log output from the Job:

{
  "level": "info",
  "log_sequence": 6,
  "msg": "Using endpoint URL: http://192.168.5.249:8080/metrics",     <---- Internal IP discovered using the Service Endpoint 
  "op": "ksm",
  "time": "2024-11-25T08:23:44Z"
}
{
  "level": "info",
  "log_sequence": 7,
  "msg": "All required metrics found: [kube_pod_info kube_node_info]",                <---- Metric Check (subset for now)
  "op": "ksm"
}
 
The KSM-related part of the Cluster Status Report produced by the Job (excerpt, still escaped): {\"name\":\"kube_state_metrics_reachable\",\"passing\":true}]}",

The log statements show that the job was able to discover the internal IP of the KSM service, find all required metrics, and mark the kube_state_metrics_reachable job as complete.

Note that the base branch is feature/cp-23050, not develop. The remaining work will be handled in a subsequent ticket. I hard-coded some values temporarily to prove out the functionality until we get to the Chart changes in https://cloudzero.atlassian.net/browse/CP-23740.

@bdrennz bdrennz changed the title from "CP-23050 (part 2)" to "CP-23050: Validate CloudZero Metrics are available from Kube State Metrics" Nov 27, 2024
@bdrennz bdrennz changed the base branch from develop to feature/cp-23050 November 27, 2024 20:44
@bdrennz bdrennz marked this pull request as ready for review November 27, 2024 20:50
@bdrennz bdrennz requested a review from a team as a code owner November 27, 2024 20:50
@dmepham dmepham left a comment

lgtm! I have some clarifying questions, mostly about ordering of changes, as it looks like this requires some chart changes

Review threads on pkg/diagnostic/kms/check.go (resolved)
@roberthocking roberthocking left a comment

Looks good, just have the question about ksm vs. kms.

small nit - should this be ksm instead of kms?

Contributor Author

Yeah, that's just how things were before.

Contributor Author

Cool if I tackle this in the next PR?

@dmepham dmepham left a comment

lgtm!

@bdrennz bdrennz merged commit 53b74ac into feature/cp-23050 Dec 3, 2024
4 checks passed
@bdrennz bdrennz deleted the cp-23050-4 branch December 3, 2024 02:04
evan-cz pushed a commit that referenced this pull request Jan 13, 2025
* CP-23425: update records incrementally (#65)

* writer updates max of 500 records at a time

* CP-23425: retry on remote write (#63)

* CP-23425: add ability to toggle individual resource types (#64)

* [CP-23425] update log line to prevent nil pointer exception, increase k8s client qps (#67)

* fix log line

* increase qps allowed by k8s client

* add busy timeout to db driver

* add new test for empty results

* Release 0.0.2 release notes (#66)

* Add 0.0.2 release notes

3 participants