feat(clouddriver): make fetching properties file more resilient for k8s jobs #4783

Merged
merged 4 commits into from
Sep 27, 2024

Conversation

kirangodishala
Contributor

Spinnaker doesn't retrieve property file contents correctly when a job has multiple pods. Under the covers, it runs the kubectl logs jobs/ command, which doesn't give the right answer if two pods exist at the same time for the same job. In such cases, kubectl always defaults to the first pod, but in certain scenarios we want it to query the second pod for logs, since the first one may have failed while the second one succeeded.

This PR enables getting logs directly from a successful pod when the original kubectl logs jobs/ call fails.

This PR relies on spinnaker/clouddriver#5184 & spinnaker/clouddriver#5778 to enable it to get logs from a pod directly.
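To illustrate what "a successful pod" could mean in practice, here is a minimal Python sketch, not the code from this PR. It assumes the pods of the job are available as a Kubernetes pod list in JSON form (the shape produced by `kubectl get pods -o json`) and picks the first pod whose `status.phase` is `Succeeded`. The function name `first_succeeded_pod` is hypothetical.

```python
import json
from typing import Optional


def first_succeeded_pod(pod_list_json: str) -> Optional[str]:
    """Return the name of the first pod in phase "Succeeded", or None.

    Hypothetical helper: expects a Kubernetes PodList serialized as JSON.
    """
    pod_list = json.loads(pod_list_json)
    for item in pod_list.get("items", []):
        # "Succeeded" is the pod phase Kubernetes reports once all
        # containers in the pod have terminated successfully.
        if item.get("status", {}).get("phase") == "Succeeded":
            return item["metadata"]["name"]
    return None
```

Returning `None` instead of raising leaves it to the caller to decide whether a job with no successful pod is an error.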

@dbyron-sf dbyron-sf marked this pull request as draft September 25, 2024 23:16
…8s jobs

If a k8s run job is marked as succeeded and a property file is defined in the stage context,
it can happen that multiple pods were created for that job.
 See https://kubernetes.io/docs/concepts/workloads/controllers/job/#handling-pod-and-container-failures

In extreme edge cases, the first pod may still be around when the second one succeeds.
That leads to the kubectl logs job/ command failing as seen below:
kubectl -n test logs job/test-run-job-5j2vl -c parser
Found 2 pods, using pod/test-run-job-5j2vl-fj8hd
Error from server (BadRequest): container "parser" in pod "test-run-job-5j2vl-fj8hd" is terminated
or
Found 2 pods, using pod/test-run-job-5j2vl-fj8hd
Error from server (BadRequest): container "parser" in pod "test-run-job-5j2vl-fj8hd" is waiting to start: PodInitializing

where that command defaults to using one of the two pods.

To fix this issue, if we encounter an error from the kubectl logs job/ command, we
find a successful pod in the job and directly query it for logs.
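
The fallback described above can be sketched roughly as follows. This is an illustrative Python outline under stated assumptions, not the Java code in this PR: the three callables (`job_logs`, `list_pods`, `pod_logs`), the `Pod` type, and `LogFetchError` are all hypothetical stand-ins for whatever actually talks to the cluster.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


class LogFetchError(Exception):
    """Hypothetical error raised when a log request fails."""


@dataclass
class Pod:
    name: str
    phase: str  # Kubernetes pod phase, e.g. "Succeeded", "Failed", "Pending"


def fetch_job_logs(
    job: str,
    container: str,
    job_logs: Callable[[str, str], str],
    list_pods: Callable[[str], Iterable[Pod]],
    pod_logs: Callable[[str, str], str],
) -> str:
    """Try job-level logs first; on failure, query a succeeded pod directly."""
    try:
        return job_logs(job, container)
    except LogFetchError:
        # The job-level call picked a pod that could not serve logs,
        # so look for a pod of the same job that completed successfully.
        for pod in list_pods(job):
            if pod.phase == "Succeeded":
                return pod_logs(pod.name, container)
        # No successful pod found: surface the original failure.
        raise
```

Passing the cluster calls in as parameters keeps the control flow easy to exercise with fakes; when no pod has succeeded, the original error propagates unchanged.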
@kirangodishala kirangodishala force-pushed the multiple-pod-runjob-case branch from f93afd4 to 7b27660 Compare September 27, 2024 10:23
@dbyron-sf
Contributor

@Mergifyio update

Contributor

mergify bot commented Sep 27, 2024

update

✅ Branch has been successfully updated

@dbyron-sf dbyron-sf added the ready to merge Approved and ready for merge label Sep 27, 2024
@dbyron-sf dbyron-sf marked this pull request as ready for review September 27, 2024 19:49
@mergify mergify bot added the auto merged Merged automatically by a bot label Sep 27, 2024
@mergify mergify bot merged commit 9428996 into spinnaker:master Sep 27, 2024
5 checks passed
3 participants