4 changes: 2 additions & 2 deletions docs/modules/ROOT/pages/framework/backfill_billing.adoc
@@ -13,8 +13,8 @@ In accordance with the https://git.vshn.net/aline.abler/scriptofdoom[scriptofdoom.
while read -r cronjob rest
do
echo $cronjob
-kubectl --as cluster-admin -n syn-appcat create job --from cronjob/$cronjob $cronjob --dry-run -oyaml | yq e '.spec.template.spec.containers[0].args[0] = "appuio-reporting report --timerange 1h --begin=$(date -d \"now -12 hours\" -u +\"%Y-%m-%dT%H:00:00Z\") --repeat-until=$(date -u +\"%Y-%m-%dT%H:00:00Z\")"' | kubectl --as cluster-admin apply -f -
-done <<< "$(kubectl --as cluster-admin -n syn-appcat get cronjobs.batch --no-headers)"
+kubectl --as=system:admin -n syn-appcat create job --from cronjob/$cronjob $cronjob --dry-run -oyaml | yq e '.spec.template.spec.containers[0].args[0] = "appuio-reporting report --timerange 1h --begin=$(date -d \"now -12 hours\" -u +\"%Y-%m-%dT%H:00:00Z\") --repeat-until=$(date -u +\"%Y-%m-%dT%H:00:00Z\")"' | kubectl --as=system:admin apply -f -
+done <<< "$(kubectl --as=system:admin -n syn-appcat get cronjobs.batch --no-headers)"
----

This will loop over all the billing cronjobs in the `syn-appcat` namespace, create a new job from each of them, and replace the args with whatever we want.
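
To verify that the backfill jobs were created and ran through, a quick check (a sketch; per the loop above, each job carries the name of its cronjob):

[source,bash]
----
kubectl --as=system:admin -n syn-appcat get jobs
kubectl --as=system:admin -n syn-appcat logs job/<job-name>
----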
@@ -7,14 +7,14 @@ This alert is based on our SLI Exporter and how we in Appcat measure uptime of o

== icon:bug[] Steps for Debugging

There is no obvious reason why it happened, but we can easily check what went wrong. Every "guaranteed_availability" database has at least 2 replicas and a PodDisruptionBudget set to 1. So, if one replica is down, the second one should still be up and running. If that failed, it means there is some issue with the database or the node itself.

.Finding the failed database
Check the database name and namespace from the alert. There are two relevant namespaces: the claim namespace and the instance namespace. The instance namespace is generated and always has the format `vshn-<service_name (postgresql, redis, etc.)>-<instance_name>`.
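
For example, with a hypothetical PostgreSQL instance named `my-super-prod-5jfjn` (the same example used in the snippet further down), the variable used in the commands below would be:

[source,bash]
----
# Hypothetical instance name, matching the egrep example further down
instanceNamespace=vshn-postgresql-my-super-prod-5jfjn
----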

[source,bash]
----
kubectl -n $instanceNamespace get pods
kubectl -n $instanceNamespace describe pod $failing_pod
kubectl -n $instanceNamespace logs pods/$failing_pod
----
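
Because every "guaranteed_availability" instance should have a PodDisruptionBudget of 1, it can also help to verify the budget directly (a quick sketch):

[source,bash]
----
kubectl -n $instanceNamespace get pdb
----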
@@ -23,9 +23,9 @@ It might also be worth checking for failing Kubernetes Objects and Composites:
[source,bash]
----
# $instanceNamespace_generated_chars can be obtained like this: echo vshn-postgresql-my-super-prod-5jfjn | rev | cut -d'-' -f1 | rev  ===> 5jfjn
-kubectl --as cluster-admin get objects | egrep $instanceNamespace_generated_chars
-kubectl --as cluster-admin describe objects $objectname
-kubectl --as cluster-admin describe xvshn[TAB here for specific service] | egrep $instanceNamespace_generated_chars
+kubectl --as=system:admin get objects | egrep $instanceNamespace_generated_chars
+kubectl --as=system:admin describe objects $objectname
+kubectl --as=system:admin describe xvshn[TAB here for specific service] | egrep $instanceNamespace_generated_chars
----

.Check SLI Prober logs
@@ -65,7 +65,7 @@ Possible reasons for failing SLI Prober:

[source,bash]
-----
Details:
OnCall : true
alertname : vshn-vshnpostgresql-GuaranteedUptimeTarget

@@ -88,4 +88,4 @@ After you receive such an alert by email, you can easily check interesting information:

* instance namespace: `vshn-postgresql-postgresql-analytics-kxxxa`
* instanceNamespace_GeneratedChars: `kxxxa`
* claim namespace: `postgresql-analytics-db`
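
The generated chars can be derived from the instance namespace with the same pipeline shown earlier in this runbook:

[source,bash]
----
echo vshn-postgresql-postgresql-analytics-kxxxa | rev | cut -d'-' -f1 | rev  # ===> kxxxa
----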
4 changes: 2 additions & 2 deletions docs/modules/ROOT/pages/service/mariadb/restore.adoc
@@ -7,13 +7,13 @@ To restore a VSHNMariaDB backup the following tools are needed:
* https://github.com/mfuentesg/ksd[GitHub - mfuentesg/ksd: kubernetes secret decoder]
* https://k8up.io/[K8up]
* https://restic.net/[Restic]
-* `alias k=kubectl` && `alias ka='kubectl --as cluster-admin`
+* `alias k=kubectl` && `alias ka='kubectl --as=system:admin'`

== Acquiring VSHNMariaDB backup

Locate the instance namespace of the VSHNMariaDB instance you want to back up:
`k -n vshn-test get vshnmariadbs.vshn.appcat.vshn.io vshn-testing -o yaml | grep instanceNamespace` and, for convenience, switch to the new namespace with `kubens`.
-Depending on a cluster configuration it might be necessary for You to use all other commands using `kubectl --as cluster-admin` especially on Appuio Cloud
+Depending on the cluster configuration it might be necessary to run all other commands with `kubectl --as=system:admin`, especially on Appuio Cloud
There are two important secrets in the instance namespace (see the sketch after this list for decoding them):
* backup-bucket-credentials
* k8up-repository-password
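
A small sketch for decoding both secrets with the `ksd` tool from the list above, assuming the `k` alias and that you already switched to the instance namespace with `kubens`:

[source,bash]
----
# Decode the base64 values of both backup-related secrets
k get secret backup-bucket-credentials -o yaml | ksd
k get secret k8up-repository-password -o yaml | ksd
----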
@@ -15,7 +15,7 @@ From the said alert the customer namespace can be deduced together with the nam

[source,bash]
----
-kubectl get XVSHNPostgreSQL <name-from-alert> --as cluster-admin
+kubectl get XVSHNPostgreSQL <name-from-alert> --as=system:admin
----

NOTE: The XRD is protected from deletion if deletion protection is enabled.
@@ -28,7 +28,7 @@ The instance namespace is hidden from the customer.

[source,bash]
----
-kubectl get XVSHNPostgreSQL <name-from-alert> -o=jsonpath='{.status.instanceNamespace}' --as cluster-admin
+kubectl get XVSHNPostgreSQL <name-from-alert> -o=jsonpath='{.status.instanceNamespace}' --as=system:admin
----
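
For the steps that follow it can be handy to capture the namespace in a shell variable (a small sketch combining the command above):

[source,bash]
----
instanceNamespace=$(kubectl get XVSHNPostgreSQL <name-from-alert> -o=jsonpath='{.status.instanceNamespace}' --as=system:admin)
kubectl -n "$instanceNamespace" get pods
----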

[WARNING]
@@ -142,7 +142,7 @@ In case there is no secret it has to be recreated with the credentials from the
+
[source,bash]
----
-kubectl get XObjectBucket --as cluster-admin <name>
+kubectl get XObjectBucket --as=system:admin <name>
----
<7> The bucket name
<8> S3 Cloud provider endpoint
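
If the secret is missing, a hedged sketch for recreating it from the values shown in the XObjectBucket output; the secret name and key names here are assumptions, so check what the operator actually expects:

[source,bash]
----
# Hypothetical secret and key names; substitute the real ones for your setup.
kubectl -n <instance-namespace> create secret generic <bucket-secret-name> \
  --from-literal=AWS_ACCESS_KEY_ID=<access-key> \
  --from-literal=AWS_SECRET_ACCESS_KEY=<secret-key>
----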
@@ -167,4 +167,4 @@ To check the restore process itself, use the following command:
[source,bash]
----
kubectl -n <instance-namespace> logs <pod-name> -f
----
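
If the pod name isn't known yet, listing the pods in the instance namespace sorted by creation time is one way to spot it (a small sketch; the restore pod should be among the newest):

[source,bash]
----
kubectl -n <instance-namespace> get pods --sort-by=.metadata.creationTimestamp
----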