Add policy Collect Debug Information for Pods in CrashLoopBackOff #1086
Conversation
Signed-off-by: nsagark <sagar@nirmata.com>
What is this policy? There are no annotations which tell us anything here.
Signed-off-by: nsagark <sagar@nirmata.com>
Hi @chipzoller, I have added the annotations for this policy. Please take a look.
I understand what you're trying to go for here, but offering a policy which creates Pods based on some unknown container image in a cluster (i.e., not a Kyverno project image which can be inspected) seems like a questionable approach. I can see how this might be valuable, but I think this needs some more transparency to be trustworthy.
Hi @chipzoller, do you want me to push this image to ghcr.io? Or let me know if you have any other suggestions.
My suggestion would be to take your idea and create a public GitHub project repository from it and build/push the image there. Then, put this information in the description of the policy so users can inspect/evaluate what the proposed image does to determine if it's safe for them to consume this policy.
Signed-off-by: nsagark <sagar@nirmata.com>
Hi @chipzoller, I have added the information on how the image was built by adding a link in the description of the policy. Please review and let me know if we need to add anything else.
Hi @chipzoller, please let me know if anything else is needed here.
It seems like there are missing permissions here to run your Job. You're mounting a ServiceAccount token but aren't declaring a ServiceAccount name. And that SA has to be granted permissions to get Pods basically anywhere in the cluster. I don't see how this could possibly work without the user taking additional steps on their end, which isn't mentioned in the description.
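To illustrate the point being made here, a minimal sketch of the kind of RBAC the Job would need is below. All names (`debug-collector`, the `default` namespace) are hypothetical and not taken from the policy itself:

```yaml
# Hypothetical ServiceAccount and RBAC for the generated Job.
# The names here are illustrative only.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: debug-collector
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: debug-collector
rules:
- apiGroups: [""]
  # Read access to Pods, their logs, and namespace events.
  resources: ["pods", "pods/log", "events"]
  verbs: ["get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: debug-collector
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: debug-collector
subjects:
- kind: ServiceAccount
  name: debug-collector
  namespace: default
```

A ClusterRole is used here because the comment notes the SA must be able to get Pods "basically anywhere in the cluster"; a namespaced Role/RoleBinding would suffice if collection were limited to one namespace.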
Co-authored-by: Chip Zoller <chipzoller@gmail.com> Signed-off-by: nsagark <90008930+nsagark@users.noreply.github.com>
Co-authored-by: Chip Zoller <chipzoller@gmail.com> Signed-off-by: nsagark <90008930+nsagark@users.noreply.github.com>
Co-authored-by: Chip Zoller <chipzoller@gmail.com> Signed-off-by: nsagark <90008930+nsagark@users.noreply.github.com>
For this comment, can I update the description as below? I have added the permissions needed to the README at the URL provided. Please let me know if that will suffice. "This policy generates a Job which gathers troubleshooting data (including logs, kubectl describe output, and events from the namespace) from Pods that are in CrashLoopBackOff and have 3 restarts. This data can further be used to automatically create a Jira issue using some kind of automation or another Kyverno policy. For more information on the permissions needed to run this policy and the image used in this policy, see https://github.com/nirmata/SRE-Operational-Usecases/tree/main/get-troubleshooting-data/get-debug-data."
…hub-pkg.yml Signed-off-by: nsagark <sagar@nirmata.com>
Signed-off-by: nsagark <sagar@nirmata.com>
Signed-off-by: nsagark <sagar@nirmata.com>
Hi @chipzoller, I have made the changes that were suggested. Please take a look.
In addition to the suggestion, why don't you add the serviceAccountName field to the Job, along with a comment, so it's clearer that the Job does nothing without the RBAC.
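What this suggestion amounts to, roughly, is a Job spec along these lines. This is a sketch only; the Job name, image, and ServiceAccount name are placeholders, not the actual values from the policy:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: collect-debug-data   # illustrative name
spec:
  template:
    spec:
      # NOTE: this ServiceAccount must exist and be bound to RBAC
      # granting get/list on pods, pods/log, and events; without it,
      # the Job cannot collect any debug data.
      serviceAccountName: debug-collector
      restartPolicy: Never
      containers:
      - name: collector
        image: ghcr.io/example/debug-collector:latest   # placeholder image
```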
Co-authored-by: Chip Zoller <chipzoller@gmail.com> Signed-off-by: nsagark <90008930+nsagark@users.noreply.github.com>
Signed-off-by: nsagark <sagar@nirmata.com>
Hi @chipzoller, I have updated the files as per your suggestion. Please take a look.
Signed-off-by: nsagark <sagar@nirmata.com>
This looks good. We just need to fix the Artifact Hub linting error, which I think is due to a policy filename mismatch.
Signed-off-by: nsagark <sagar@nirmata.com>
Hi @chipzoller, I have updated the files. Please take a look.
Signed-off-by: Chip Zoller <chipzoller@gmail.com>
Signed-off-by: Chip Zoller <chipzoller@gmail.com>
@chipzoller I have added the changes that were requested. Please take a look.
Good to go now.
Related Issue(s)
Description
Adds a new policy called "Collect Debug Information for Pods in CrashLoopBackOff" along with Chainsaw tests.
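For orientation, a Kyverno generate policy of this kind typically has the following general shape. This is a hedged sketch only: the rule name, preconditions, JMESPath expressions, and Job contents are illustrative guesses, not the merged policy:

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: collect-debug-information   # illustrative name
spec:
  rules:
  - name: generate-debug-job        # illustrative rule name
    match:
      any:
      - resources:
          kinds:
          - Pod
    preconditions:
      all:
      # Trigger only for a container in CrashLoopBackOff with 3+ restarts.
      # These JMESPath expressions are simplified for illustration.
      - key: "{{ request.object.status.containerStatuses[0].state.waiting.reason || '' }}"
        operator: Equals
        value: CrashLoopBackOff
      - key: "{{ request.object.status.containerStatuses[0].restartCount || `0` }}"
        operator: GreaterThanOrEquals
        value: 3
    generate:
      apiVersion: batch/v1
      kind: Job
      name: "debug-{{ request.object.metadata.name }}"
      namespace: "{{ request.namespace }}"
      data:
        spec:
          template:
            spec:
              serviceAccountName: debug-collector   # requires separate RBAC
              restartPolicy: Never
              containers:
              - name: collector
                image: ghcr.io/example/debug-collector:latest   # placeholder
```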
Checklist