CP-26005: add annotations to init-cert, add better logic for updating certificates #167

Merged: 12 commits on Feb 17, 2025


@dmepham dmepham commented Feb 15, 2025

Description

Our users manage this chart with Helm, Argo, Flux, etc., and we need to make sure that in all cases the init jobs run at appropriate times. This change allows the init-cert job to run at any time and do the right thing: either update the certificates, or leave them alone if the existing certs are valid.

  • configurable ttl for jobs
  • annotations for init-cert job
  • chart version is now in the init-cert job name
  • init-cert job checks whether the existing certificate is valid and regenerates it only when it is not
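
For reviewers, the validation step in the last bullet can be pictured roughly as follows. This is only a sketch of the decision, not the job's actual script: the `CertState` type, its field names, the order of the checks, and the "no certificate" message are assumptions made for illustration. The other three messages are the ones the init-cert job prints in the test logs in this PR.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CertState:
    """Inputs the check looks at. All field names here are hypothetical."""
    webhook_ca_bundles: List[str]  # caBundle of each webhook in the ValidatingWebhookConfiguration
    secret_cert: Optional[str]     # certificate stored in the TLS Secret, None if absent
    cert_sans: List[str]           # SANs parsed from that certificate
    service_dns_name: str          # DNS name the webhook Service is reached at


def regeneration_reason(state: CertState) -> Optional[str]:
    """Return why the cert must be regenerated, or None if it can be left alone."""
    # Assumed case (first install): nothing to validate yet. Message is illustrative.
    if state.secret_cert is None:
        return "No certificate found in the TLS Secret."

    # Every webhook entry must carry the same caBundle.
    if len(set(state.webhook_ca_bundles)) > 1:
        return "Mismatch found between ValidatingWebhookConfiguration caBundle values."

    # The certificate must be issued for the service it fronts.
    if state.service_dns_name not in state.cert_sans:
        return "The SANs in the certificate do not match the service name."

    # The webhook configuration must trust the cert that is actually stored.
    if any(bundle != state.secret_cert for bundle in state.webhook_ca_bundles):
        return ("The caBundle in the ValidatingWebhookConfiguration does not "
                "match the tls.crt in the TLS Secret.")

    return None  # existing certs are valid, leave them as they are
```

Because the function returns `None` when everything lines up, a job built this way is safe to re-run on every upgrade or sync: it only regenerates when one of the checks gives it a reason to.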

Testing

  1. assert that upgrades within the TTL of the jobs do not fail ✅
  2. assert that upgrades after the TTL of the jobs do not fail ✅
  3. assert that init-cert job corrects cert-mismatch between webhook configs ✅
  • manually add a new caBundle value to the pod webhook configuration
  • after upgrading, we get the init-cert job running:
cloudzero-agent-webhook-server-init-cert-1-0-0-dev-6-bx6qg init-cert Mismatch found between ValidatingWebhookConfiguration caBundle values.

and then the webhook server logs show validation succeeding:

cloudzero-agent-webhook-server-56cd898875-hb7wr webhook-server I0216 21:11:44.454046       1 handler.go:92] Webhook [/validate/pod - CREATE] - Allowed: true
  4. assert that init-cert job corrects a cert with an incorrect SAN ✅
  • manually load a TLS cert with an incorrect SAN into the tls secret
  • after upgrading, the init-cert job identifies the issue:
cloudzero-agent-webhook-server-init-cert-1-0-0-dev-7-ql8nm init-cert The SANs in the certificate do not match the service name.

and valid webhook server logs:

cloudzero-agent-webhook-server-54b9fb86d4-xx5zd webhook-server I0216 21:19:44.096625       1 handler.go:92] Webhook [/validate/pod - CREATE] - Allowed: true
  5. assert that init-cert job corrects the cert when the webhook caBundle and the ca.crt in the tls secret do not match
  • manually add a new certificate to the tls secret
  • after running an upgrade, we get a new init-cert job, which identifies the issue and fixes it:
cloudzero-agent-webhook-server-init-cert-1-0-0-dev-5-kpbtn init-cert The caBundle in the ValidatingWebhookConfiguration does not match the tls.crt in the TLS Secret.

and we see the webhook server pods handling requests successfully ✅

cloudzero-agent-webhook-server-d54db997c-7cxp6 webhook-server I0216 21:00:18.259469       1 handler.go:92] Webhook [/validate/pod - CREATE] - Allowed: true
  6. assert that an ArgoCD deployment can be set up such that deleted Jobs do not put the application into an OutOfSync state.
    a. deployed successfully
    Screenshot 2025-02-16 at 3 08 18 PM

    b. goes OutOfSync after the job TTL expires
    Screenshot 2025-02-16 at 3 10 21 PM

    c. deploy again with the following values set:

initBackfillJob:
  annotations:
    argocd.argoproj.io/hook: PostSync
    argocd.argoproj.io/hook-delete-policy: HookSucceeded
initCertJob:
  annotations:
    argocd.argoproj.io/hook: PostSync
    argocd.argoproj.io/hook-delete-policy: HookSucceeded

    d. stays in sync:
    Screenshot 2025-02-16 at 3 16 06 PM

Checklist

  • I have added documentation for new/changed functionality in this PR
  • All active GitHub checks for tests, formatting, and security are passing
  • The correct base branch is being used, if not main

@dmepham dmepham requested a review from a team as a code owner February 15, 2025 00:59
@dmepham dmepham merged commit 40745a8 into develop Feb 17, 2025
2 checks passed
@dmepham dmepham deleted the CP-26005 branch February 17, 2025 13:49