CP-19492: golang based cloudzero-chart validator #1

Merged
merged 9 commits into from
Jul 10, 2024

@josephbarnett josephbarnett commented Jul 7, 2024

Description of the change

The current Python-based validator cannot be packaged as a binary and run in the Prometheus container, because that container is based on busybox (and will later be scratch), which does not include libc.

Since we don't want to own the image for prometheus, we need a single binary which can perform the validation checks without requiring any external libraries or application dependencies. This is where golang really shines.

This commit provides:

  1. A validator utility which can perform various checks based on a configuration
  2. A self-contained binary
  3. CI/CD workflows for testing and packaging
  4. Unit tests to validate all functionality
  5. README guides for Usage, Configuration, Development and Testing
  6. Conformance to the Cloudzero OSS requirements

Type of change

  • New feature

Development

  • All changed code has 80% unit test coverage
  • All changed code has been automatically (smoke test or otherwise) or manually verified in alfa (or with a cross namespace setup, e.g. developer namespace for this feature, pointing at shared alfa resources)

Code review

  •  This pull request has a title that includes the JIRA Ticket and a short useful summary, e.g. CP-4051: Create TEMPLATE Feature Repo.

Trying it yourself

  1. Follow this guide included in the repository!
go mod download
make fmt lint test build
  2. After that, you should have a binary in bin/cloudzero-agent-validator; you can do the following:

Generate a configuration file

./bin/cloudzero-agent-validator config generate -account 1234 -cluster foo -region us-east-1 > myconfig.yml
  3. Create a credentials file and update myconfig.yml to point at it:
    1. echo $CZ_API_DEV_TOKEN > credentials_file
    2. Update myconfig.yml with credentials_file: credentials_file
    3. Change the value for configurations: in myconfig.yml:
      • FROM: /etc/config/prometheus/configmaps/prometheus.yml
      • TO: credentials_file
    4. Remove the following checks:
      • kube_state_metrics_reachable
      • node_exporter_reachable
    5. Replace https://api.cloudzero.com with https://dev-api.cloudzero.com

The configuration should look like the following now:

versions:
  chart_version: 0.0.9
  agent_version: latest

logging:
  level: info
  location: ./cloudzero-agent-validator.log

deployment:
  account_id:  1234
  cluster_name:  foo
  region:  us-east-1

cloudzero:
  host:  https://dev-api.cloudzero.com
  credentials_file: credentials_file

prometheus:
  kube_state_metrics_service_endpoint:  http://kube-state-metrics.your-namespace.svc.cluster.local:8080
  prometheus_node_exporter_service_endpoint:  http://node-exporter.your-namespace.svc.cluster.local:9100
  configurations:
    - credentials_file

diagnostics:
  stages:
    - name: pre-start
      enforce: true
      checks:
        - egress_reachable
        - api_key_valid
    - name: post-start
      enforce: false
      checks:
        - k8s_version
        # - kube_state_metrics_reachable
        # - node_exporter_reachable
        - scrape_cfg
    - name: pre-stop
      enforce: false
      checks:
  4. Run some checks to verify the app works as expected:
./bin/cloudzero-agent-validator config validate -f myconfig.yml
./bin/cloudzero-agent-validator d get-available
./bin/cloudzero-agent-validator d run -f myconfig.yml -check egress_reachable  --post
./bin/cloudzero-agent-validator d run -f myconfig.yml -check k8s_version  --post
./bin/cloudzero-agent-validator d run -f myconfig.yml -check scrape_cfg  --post
./bin/cloudzero-agent-validator d run -f myconfig.yml -check api_key_valid  --post

NOTE: --post sends the check results to the API, where they should be visible in DynamoDB. Omit it if you only want to run the check locally.

Additional notes for testing.

  • It is possible to test against a private namespace's endpoint by setting host to that endpoint, for example host: https://jau6vy3ur7.execute-api.us-east-1.amazonaws.com/jb, and then using the run command to send data for a single check from the local machine:

     ./bin/cloudzero-agent-validator d run -f myconfig.yml -check k8s_version --post
  • You can then look at the data in the DB, for example:
    [Screenshot 2024-07-09 at 10 12 32 AM]

@josephbarnett josephbarnett marked this pull request as ready for review July 8, 2024 18:47
Resolved review threads: .github/workflows/events/main-push-event.json (outdated); .github/workflows/docker-build.yml (4 threads)
@josephbarnett josephbarnett merged commit 626e06f into develop Jul 10, 2024
4 checks passed
@josephbarnett josephbarnett deleted the cp-19492-import-new-agent branch July 10, 2024 02:33