Add policy Collect Debug Information for Pods in CrashLoopBackOff #1086

Merged
18 commits
2296025
Updated files for get-debug-information policy
nsagark Jul 26, 2024
d56cf51
Updated the policy with annotations and also the artifacthub-pkg.yml
nsagark Jul 29, 2024
40977bf
Updated policy for image information and also the artifacthub-pkg.yml
nsagark Jul 30, 2024
e78569f
Update other/get-debug-information/generate-policy.yaml
nsagark Aug 2, 2024
ed5ea20
Update other/get-debug-information/generate-policy.yaml
nsagark Aug 2, 2024
671cb72
Update other/get-debug-information/generate-policy.yaml
nsagark Aug 2, 2024
05560d3
Updated the policy, renamed the policy and the digest in the artifact…
nsagark Aug 3, 2024
e49706e
Updated the policy file name in the chainsaw-test.yaml
nsagark Aug 3, 2024
29b4b12
Updated the policy file name in the artifacthub-pkg.yml
nsagark Aug 3, 2024
edbce71
Update other/get-debug-information/collect-debug-information.yaml
nsagark Aug 7, 2024
533ee09
Updated the digest and also the policy to add serviceaccount information
nsagark Aug 7, 2024
43c7004
Updated the description and readme in artifacthub-pkg.yml
nsagark Aug 7, 2024
245cb87
Merge branch 'main' into collect-debug-information-crashloopback
chipzoller Aug 9, 2024
4187f53
Updated the policy file name
nsagark Aug 9, 2024
7bcf6f2
Merge branch 'main' into collect-debug-information-crashloopback
chipzoller Aug 9, 2024
fedeb1b
Update other/get-debug-information/.chainsaw-test/chainsaw-test.yaml
chipzoller Aug 9, 2024
c0faef7
Update other/get-debug-information/artifacthub-pkg.yml
chipzoller Aug 9, 2024
e1729f2
Merge branch 'main' into collect-debug-information-crashloopback
chipzoller Aug 9, 2024
@@ -0,0 +1,12 @@
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    app.kubernetes.io/component: background-controller
    app.kubernetes.io/instance: kyverno
    app.kubernetes.io/part-of: kyverno
  name: kyverno:background-controller-generate
rules:
- apiGroups: ["batch"]
  resources: ["jobs"]
  verbs: ["create", "get", "list", "watch", "update", "delete"]
@@ -0,0 +1,24 @@
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: pod-reader
rules:
- apiGroups: [""]
  resources: ["pods", "pods/log", "events"]
  verbs: ["get", "list", "watch"]
- apiGroups: [""]
  resources: ["namespaces"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: read-pods-rolebinding
subjects:
- kind: Group
  name: system:serviceaccounts
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
48 changes: 48 additions & 0 deletions other/get-debug-information/.chainsaw-test/chainsaw-test.yaml
@@ -0,0 +1,48 @@
# yaml-language-server: $schema=https://raw.githubusercontent.com/kyverno/chainsaw/main/.schemas/json/test-chainsaw-v1alpha1.json
apiVersion: chainsaw.kyverno.io/v1alpha1
kind: Test
metadata:
  creationTimestamp: null
  name: get-debug-data
spec:
  steps:
  - name: step-00
    try:
    - apply:
        file: chainsaw-step-00-apply-1.yaml
    - apply:
        file: chainsaw-step-00-apply-2.yaml
  - name: step-01
    try:
    - script:
        content: |
          if kubectl get configmap kyverno -n kyverno -o jsonpath='{.data.excludeGroups}' | grep -q 'system:nodes'; then
            kubectl patch configmap kyverno -n kyverno --type=json -p='[{"op": "remove", "path": "/data/excludeGroups"}]'
          else
            echo "excludeGroups: system:nodes does not exist in the configmap."
          fi
  - name: step-02
    try:
    - apply:
        file: ../collect-debug-information.yaml
    - assert:
        file: policy-ready.yaml
  - name: step-03
    try:
    - apply:
        file: ns.yaml
    - apply:
        file: depl-readonlyrootfs.yaml
  - name: step-04
    try:
    - sleep:
        duration: 60s
    - assert:
        resource:
          apiVersion: batch/v1
          kind: Job
          metadata:
            labels:
              app.kubernetes.io/managed-by: kyverno
              deleteme: allow
            namespace: abc
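Note on step-01: the script removes the whole `excludeGroups` key from the `kyverno` ConfigMap whenever its value contains `system:nodes`. The PR does not state the reason; presumably it is because Pod status updates are sent by kubelets, which authenticate in the `system:nodes` group, so leaving that group excluded would keep the `v1/Pod.status` rule from firing. A minimal Python sketch of the same decision follows. The function name `exclude_groups_patch` is hypothetical and is not part of the test suite; it only models the `if`/`else` branch of the shell script.

```python
from typing import Dict, List


def exclude_groups_patch(configmap_data: Dict[str, str]) -> List[dict]:
    """Return the JSON Patch the step-01 script would send, or [] if it would do nothing.

    Hypothetical helper: it mirrors `grep -q 'system:nodes'`, which is a
    substring match on the value of data.excludeGroups.
    """
    current = configmap_data.get("excludeGroups", "")
    if "system:nodes" in current:
        # Same operation as the `kubectl patch --type=json` call in the script.
        return [{"op": "remove", "path": "/data/excludeGroups"}]
    return []


if __name__ == "__main__":
    print(exclude_groups_patch({"excludeGroups": "system:nodes"}))
    print(exclude_groups_patch({"resourceFilters": "[*/*,kyverno,*]"}))
```

One consequence visible in the sketch: the patch drops every entry under `excludeGroups`, not only `system:nodes`, so any other groups listed there would also stop being excluded.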
@@ -0,0 +1,22 @@
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  namespace: abc
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx-container
        image: nginx:latest
        ports:
        - containerPort: 80
        securityContext:
          readOnlyRootFilesystem: true
4 changes: 4 additions & 0 deletions other/get-debug-information/.chainsaw-test/ns.yaml
@@ -0,0 +1,4 @@
apiVersion: v1
kind: Namespace
metadata:
  name: abc
6 changes: 6 additions & 0 deletions other/get-debug-information/.chainsaw-test/policy-ready.yaml
@@ -0,0 +1,6 @@
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: get-debug-data-policy
status:
  ready: true
20 changes: 20 additions & 0 deletions other/get-debug-information/artifacthub-pkg.yml
@@ -0,0 +1,20 @@
name: get-debug-information
version: 1.0.0
displayName: Collect debug information for pods in crashloopback
createdAt: "2024-07-25T20:30:05.000Z"
description: "This policy generates a job which gathers troubleshooting data (including logs, kubectl describe output and events from the namespace) from pods that are in crashloopback and have 3 restarts. This data can further be used to automatically create a jira issue using some kind of automation or another Kyverno policy. For more information on the permissions needed to run the job created by the policy and how the image used in this policy was built, see https://github.com/nirmata/SRE-Operational-Usecases/tree/main/get-troubleshooting-data/get-debug-data."
install: |-
    ```shell
    kubectl apply -f https://raw.githubusercontent.com/kyverno/policies/main/other/get-debug-information/collect-debug-information.yaml
    ```
keywords:
  - kyverno
  - Sample
readme: |
  This policy generates a job which gathers troubleshooting data (including logs, kubectl describe output and events from the namespace) from pods that are in crashloopback and have 3 restarts. This data can further be used to automatically create a jira issue using some kind of automation or another Kyverno policy. For more information on the permissions needed to run the job created by the policy and how the image used in this policy was built, see https://github.com/nirmata/SRE-Operational-Usecases/tree/main/get-troubleshooting-data/get-debug-data.

  Refer to the documentation for more details on Kyverno annotations: https://artifacthub.io/docs/topics/annotations/kyverno/
annotations:
  kyverno/category: "Sample"
  kyverno/subject: "Pod"
digest: 88b237e3eee9dface6d34ae3a6c93106db988cd9d9b68f3f6ae3de542037d8e1
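Note on `digest`: the value is 64 hexadecimal characters, which is consistent with a SHA-256 hash, and several commits in this PR update it alongside changes to the policy file. Assuming it is the SHA-256 of `collect-debug-information.yaml` (the diff itself does not say so), a check along these lines could confirm that the two files are in sync. The helper names `sha256_hex` and `digest_matches` are illustrative only.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the lowercase hex SHA-256 of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def digest_matches(policy_bytes: bytes, recorded_digest: str) -> bool:
    """Compare a policy file's hash with the digest recorded in artifacthub-pkg.yml.

    Assumption: the recorded digest is a plain SHA-256 of the policy file.
    """
    return sha256_hex(policy_bytes) == recorded_digest.strip().lower()


if __name__ == "__main__":
    # The path below is the one used in this PR; adjust it to your checkout.
    path = "other/get-debug-information/collect-debug-information.yaml"
    recorded = "88b237e3eee9dface6d34ae3a6c93106db988cd9d9b68f3f6ae3de542037d8e1"
    try:
        with open(path, "rb") as handle:
            print(digest_matches(handle.read(), recorded))
    except FileNotFoundError:
        print("policy file not found; run from the repository root")
```

Reading the file in binary mode matters here, because any newline translation would change the hash.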
82 changes: 82 additions & 0 deletions other/get-debug-information/collect-debug-information.yaml
@@ -0,0 +1,82 @@
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: get-debug-data-policy
  annotations:
    policies.kyverno.io/title: Collect Debug Information for Pods in CrashLoopBackOff
    policies.kyverno.io/category: Other
    policies.kyverno.io/severity: medium
    policies.kyverno.io/subject: Pod
    kyverno.io/kyverno-version: 1.11.5
    kyverno.io/kubernetes-version: "1.27"
    policies.kyverno.io/description: >-
      This policy generates a job which gathers troubleshooting data (including logs, kubectl describe output and events from the namespace) from pods that are in CrashLoopBackOff and have 3 restarts. This data can further be used to automatically create a Jira issue using some kind of automation or another Kyverno policy. For more information on the image used in this policy, see https://github.com/nirmata/SRE-Operational-Usecases/tree/main/get-troubleshooting-data/get-debug-data.
spec:
  rules:
  - name: get-debug-data-policy-rule
    match:
      any:
      - resources:
          kinds:
          - v1/Pod.status
    context:
    - name: pdcount
      apiCall:
        urlPath: "/api/v1/namespaces/{{request.namespace}}/pods?labelSelector=requestpdname=pod-{{request.object.metadata.name}}"
        jmesPath: "items | length(@)"
    preconditions:
      all:
      - key: "{{ sum(request.object.status.containerStatuses[*].restartCount || `0`) }}"
        operator: Equals
        value: 3
      - key: "{{ request.object.metadata.labels.deleteme || 'empty' }}"
        operator: Equals
        value: "empty"
      - key: "{{ pdcount }}"
        operator: Equals
        value: 0
    generate:
      apiVersion: batch/v1
      kind: Job
      name: get-debug-data-{{request.object.metadata.name}}-{{ random('[0-9a-z]{8}') }}
      namespace: "{{request.namespace}}"
      synchronize: false
      data:
        metadata:
          labels:
            deleteme: allow
        spec:
          template:
            metadata:
              labels:
                app: my-app
                deleteme: allow
                requestpdname: "pod-{{request.object.metadata.name}}"
            spec:
              restartPolicy: OnFailure
              containers:
              - name: my-container
                image: sagarkundral/my-python-app:v52
                ports:
                - containerPort: 8080
                volumeMounts:
                - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
                  name: token
                  readOnly: true
                args:
                - "/app/get-debug-jira-v2.sh"
                - "{{request.namespace}}"
                - "{{request.object.metadata.name}}"
              volumes:
              - name: token
                projected:
                  defaultMode: 420
                  sources:
                  - serviceAccountToken:
                      expirationSeconds: 3607
                      path: token
                  - configMap:
                      items:
                      - key: ca.crt
                        path: ca.crt
                      name: kube-root-ca.crt
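Note on the preconditions: the rule generates a Job only when all three conditions hold, namely the restart counts across the Pod's containers sum to exactly 3, the Pod carries no `deleteme` label (so the Job's own Pods, which are labelled `deleteme: allow`, do not trigger further Jobs), and no Pod labelled `requestpdname=pod-<name>` already exists in the namespace. The Python sketch below mirrors that logic for illustration. It is not Kyverno's implementation, and the function names and the `existing_debug_pods` parameter (standing in for the `pdcount` API call result) are hypothetical.

```python
from typing import Any, Dict


def total_restarts(pod: Dict[str, Any]) -> int:
    """Sum restartCount over all container statuses; a missing list counts as 0."""
    statuses = pod.get("status", {}).get("containerStatuses") or []
    return sum(status.get("restartCount", 0) for status in statuses)


def should_generate_job(pod: Dict[str, Any], existing_debug_pods: int) -> bool:
    """Evaluate the three `all` preconditions of get-debug-data-policy-rule."""
    labels = pod.get("metadata", {}).get("labels") or {}
    deleteme = labels.get("deleteme") or "empty"  # same default as the policy
    return (
        total_restarts(pod) == 3          # restart sum Equals 3
        and deleteme == "empty"           # label absent, so not a debug Job Pod
        and existing_debug_pods == 0      # pdcount Equals 0
    )


if __name__ == "__main__":
    crashing_pod = {
        "metadata": {"name": "nginx-abc", "labels": {"app": "nginx"}},
        "status": {"containerStatuses": [{"restartCount": 3}]},
    }
    print(should_generate_job(crashing_pod, existing_debug_pods=0))
```

Because the first condition uses `Equals` and not a threshold, the rule matches only the status update on which the sum is exactly 3; a Pod whose count moves past 3 without being observed at 3 would not match.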