Add concepts page on Pod disruptions (#454)
* Add concepts page on Pod disruptions
* Adopt to new structure
* rename file
* move file
* add page alias
* typo
* add warnings
* Apply suggestions from code review
  Co-authored-by: Andrew Kenworthy <andrew.kenworthy@stackable.de>
* review
* review
* Add overview page for operations
* typo
* avoid we
* review
* Rename to "Allowed Pod disruption"
* Apply suggestions from code review
  Co-authored-by: Andrew Kenworthy <andrew.kenworthy@stackable.de>
---------
Co-authored-by: Andrew Kenworthy <andrew.kenworthy@stackable.de>
Showing 8 changed files with 157 additions and 7 deletions.
2 changes: 1 addition & 1 deletion
...es/concepts/pages/cluster_operations.adoc → .../pages/operations/cluster_operations.adoc
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,49 @@ | ||
= Operations | ||
|
||
This section of the documentation is intended for the operations teams that maintain a Stackable Data Platform installation. | ||
It provides you with the necessary details to operate it in a production environment. | ||
|
||
== Service availability | ||
|
||
Make sure to go through the following checklist to achieve the maximum level of availability for your services. | ||
|
||
1. Make setup highly available (HA): In case the product supports running in an HA fashion, our operators will automatically | ||
configure it for you. You only need to make sure that you deploy a sufficient number of replicas. Please note that | ||
some products don't support HA. | ||
2. Reduce the number of simultaneous pod disruptions (unavailable replicas). The Stackable operators write defaults | ||
based upon knowledge about the fault tolerance of the product, which should cover most of the use-cases. For details | ||
have a look at xref:operations/pod_disruptions.adoc[]. | ||
3. Reduce impact of pod disruption: Many HA capable products offer a way to gracefully shut down the service running | ||
within the Pod. The flow is as follows: Kubernetes wants to shut down the Pod and calls a hook into the Pod, which in turn | ||
interacts with the product, telling it to gracefully shut down. The final deletion of the Pod is then blocked until | ||
the product has successfully migrated running workloads away from the Pod that is to be shut down. Details covering the graceful shutdown mechanism are described in the actual operator documentation. | ||
+ | ||
WARNING: Graceful shutdown is not implemented for all products yet. Check the documentation of the respective product operator to see whether it is supported (for example, xref:trino:usage_guide/operations/graceful-shutdown.adoc[the documentation for Trino]). | ||
|
||
4. Spread workload across multiple Kubernetes nodes, racks, datacenter rooms or datacenters to guarantee availability | ||
in the case of e.g. power outages or fire in parts of the datacenter. All of this is supported by | ||
configuring an https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/[antiAffinity] as documented in | ||
xref:operations/pod_placement.adoc[]. | ||
|
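The spreading described in item 4 can be sketched with a plain Kubernetes `podAntiAffinity` fragment. This is a minimal illustration, not the configuration the operators generate: the label values are placeholders modeled on the labels used in the PDB examples of this documentation, and `topology.kubernetes.io/zone` assumes that your nodes carry this well-known label. Where such a fragment is placed within a Stackable resource is described in xref:operations/pod_placement.adoc[].

[source,yaml]
----
# Pod spec fragment (sketch). Asks the scheduler to prefer placing Pods with
# the given labels in different zones.
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          topologyKey: topology.kubernetes.io/zone
          labelSelector:
            matchLabels:
              app.kubernetes.io/name: hdfs
              app.kubernetes.io/instance: hdfs
              app.kubernetes.io/component: datanode # assumed label value
----

Using `kubernetes.io/hostname` as the `topologyKey` would spread the Pods across individual nodes instead of zones.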
||
== Maintenance actions | ||
|
||
Sometimes you want to quickly shut down a product or update the Stackable operators without all the managed products | ||
restarting at the same time. You can achieve this using the following methods: | ||
|
||
1. Quickly stop and start a whole product using `stopped` as described in xref:operations/cluster_operations.adoc[]. | ||
2. Prevent any changes to your deployed product using `reconcilePaused` as described in xref:operations/cluster_operations.adoc[]. | ||
|
||
== Performance | ||
|
||
1. You can configure the resources available to each product using xref:concepts:resources.adoc[]. The defaults are | ||
very restrained, so that you can spin up multiple products on a laptop. | ||
2. You can not only use xref:operations/pod_placement.adoc[] to achieve more resilience, but also to co-locate products | ||
that communicate frequently with each other. One example is placing HBase regionservers on the same Kubernetes node | ||
as the HDFS datanodes. Our operators already take this into account and co-locate connected services. However, if | ||
you are not satisfied with the automatically created affinities, you can use xref:operations/pod_placement.adoc[] to | ||
configure your own. | ||
3. If you want to have certain services running on dedicated nodes you can also use xref:operations/pod_placement.adoc[] | ||
to force the Pods to be scheduled on certain nodes. This is especially helpful if you e.g. have Kubernetes nodes with | ||
16 cores and 64 GB of memory, as you could allocate nearly 100% of these node resources to your Spark executors or Trino workers. | ||
In this case it is important that you https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/[taint] | ||
your Kubernetes nodes and use xref:overrides.adoc#pod-overrides[podOverrides] to add a `toleration` for the taint. |
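
As a hedged sketch of the last item: assuming the dedicated nodes were tainted with a hypothetical key `dedicated` and value `spark` (effect `NoSchedule`), a matching toleration in the Pod spec could look as follows. The key and value are illustrative only. See xref:overrides.adoc#pod-overrides[podOverrides] for where to place this fragment in a Stackable resource.

[source,yaml]
----
# Pod spec fragment (sketch). The taint "dedicated=spark:NoSchedule" is a
# made-up example, not a value used by the Stackable operators.
tolerations:
  - key: dedicated
    operator: Equal
    value: spark
    effect: NoSchedule
----

Note that a toleration only permits scheduling onto the tainted nodes. It does not force the Pods onto them, which is why the placement rules mentioned above are needed as well.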
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,94 @@ | ||
= Allowed Pod disruptions | ||
|
||
Any downtime of our products is generally considered to be bad. | ||
Although downtime can't be prevented 100% of the time - especially if the product does not support High Availability - we can try to do our best to reduce it to an absolute minimum. | ||
|
||
Kubernetes has mechanisms to ensure minimal *planned* downtime. | ||
Please keep in mind that this only affects planned (voluntary) downtime of Pods. Unplanned Kubernetes node crashes can always occur. | ||
|
||
Our product operator will always deploy so-called https://kubernetes.io/docs/tasks/run-application/configure-pdb/[PodDisruptionBudget (PDB)] resources alongside the products. | ||
For every role that you specify (e.g. HDFS namenodes or Trino workers) a PDB is created. | ||
|
||
== Default values | ||
The defaults depend on the individual product and can be found in the "Operations" section of each product's usage guide. | ||
|
||
They are based on our knowledge of each product's fault tolerance. | ||
In some cases they may be a little pessimistic, but they can be adjusted as documented in the following sections. | ||
|
||
== Influencing and disabling PDBs | ||
|
||
You can configure | ||
|
||
1. Whether PDBs are written at all | ||
2. The `maxUnavailable` value of the PDB for each role | ||
|
||
The following example | ||
|
||
1. Sets `maxUnavailable` for NameNodes to `1` | ||
2. Sets `maxUnavailable` for DataNodes to `10`, which, with 100 replicas, allows up to 10% of the DataNodes to be unavailable at the same time. | ||
3. Disables PDBs for JournalNodes | ||
|
||
[source,yaml]
----
apiVersion: hdfs.stackable.tech/v1alpha1
kind: HdfsCluster
metadata:
  name: hdfs
spec:
  nameNodes:
    roleConfig: # optional, only supported on role level, *not* on rolegroup
      podDisruptionBudget: # optional
        enabled: true # optional, defaults to true
        maxUnavailable: 1 # optional, defaults to our "smart" calculation
    roleGroups:
      default:
        replicas: 3
  dataNodes:
    roleConfig:
      podDisruptionBudget:
        maxUnavailable: 10
    roleGroups:
      default:
        replicas: 100
  journalNodes:
    roleConfig:
      podDisruptionBudget:
        enabled: false
    roleGroups:
      default:
        replicas: 3
----
|
||
== Using your own custom PDBs | ||
In case you are not satisfied with the PDBs that are written by the operators, you can deploy your own. | ||
|
||
WARNING: If you write custom PDBs, it is your responsibility to take care of the availability of the products. | ||
|
||
IMPORTANT: Disable the PDBs created by the Stackable operators as described above before creating your own PDBs. Having more than one PDB select the same Pod is not supported, which is a https://github.com/kubernetes/kubernetes/issues/75957[limitation of Kubernetes]. | ||
|
||
*After disabling the Stackable PDBs*, you can deploy your own PDB, such as: | ||
|
||
[source,yaml]
----
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: hdfs-journalnode-and-namenode
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: hdfs
      app.kubernetes.io/instance: hdfs
    matchExpressions:
      - key: app.kubernetes.io/component
        operator: In
        values:
          - journalnode
          - namenode
----
|
||
This PDB allows only one Pod out of all the NameNodes and JournalNodes to be down at any one time. | ||
|
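Kubernetes also accepts a percentage for `maxUnavailable` in a PDB. As a sketch only, assuming that the DataNode Pods carry the component label `datanode` (by analogy with the labels above, not confirmed here) and that the Stackable PDB for DataNodes has been disabled, a budget of 10% could be expressed as:

[source,yaml]
----
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: hdfs-datanode-percentage # hypothetical name
spec:
  maxUnavailable: "10%" # percentage of the Pods matched by the selector
  selector:
    matchLabels:
      app.kubernetes.io/name: hdfs
      app.kubernetes.io/instance: hdfs
      app.kubernetes.io/component: datanode # assumed label value
----

When the percentage does not map to a whole number of Pods, Kubernetes rounds the number of Pods that may be disrupted up to the nearest integer.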
||
== Details | ||
Have a look at <<< TODO: link ADR on Pod Disruptions once merged >>> for the implementation details. |
3 changes: 2 additions & 1 deletion
modules/concepts/pages/pod_placement.adoc → ...cepts/pages/operations/pod_placement.adoc