|
| 1 | +--- |
| 2 | +title: Recommended Kubernetes cluster sizing |
| 3 | +description: The recommended number of nodes and compute per pod in your Rhize Kubernetes cluster |
| 4 | +--- |
| 5 | + |
| 6 | +Rhize runs on Kubernetes. |
| 7 | + |
| 8 | +This document provides compute recommendations for the nodes, pods, and services of your [Rhize Install]({{< relref "install" >}}). |
| 9 | +Some services also have recommended replication factors to increase reliability. |
| 10 | + |
| 11 | +## Node recommendations |
| 12 | + |
| 13 | +The following tables list the minimum recommended sizes for provisioning your cluster for Rhize {{% param v %}}. |
| 14 | + |
| 15 | +### Rhize nodes |
| 16 | + |
| 17 | +For high availability, Rhize recommends a **minimum of three nodes** with the following specifications. |
| 18 | + |
| 19 | +| Property | Value | |
| 20 | +|-----------------------|-------------------| |
| 21 | +| Number of nodes | 3 | |
| 22 | +| CPU Speed (GHz) | 3.3 | |
| 23 | +| vCPU per Node | 16 | |
| 24 | +| Memory per node (GiB) | 32 (64 is better) | |
| 25 | +| Persisted volumes | 16 | |
| 26 | +| Persisted Volume IOPS | 5000 | |
| 27 | +| PV Throughput (MBps) | 500 | |
| 28 | +| Total Disk Space (TB) | 3 | |
| 29 | +| Disk IOPS | 5000 | |
| 30 | +| Disk MBps | 500 | |
| 31 | + |
| 32 | +### Rhize agent |
| 33 | + |
| 34 | +The Rhize agent typically runs on the edge, outside of the cluster entirely. |
| 35 | +For the Rhize agent, the minimum recommended specifications are as follows: |
| 36 | + |
| 37 | +| Property | Value | |
| 38 | +|-----------------------|-------| |
| 39 | +| CPU Speed (GHz) | 2.8 | |
| 40 | +| vCPU per Node | 2 | |
| 41 | +| Memory per node (GiB) | 1 | |
| 42 | +| Persisted volumes | 1 | |
| 43 | + |
| 44 | +## Service-level recommendations |
| 45 | + |
| 46 | +The following table lists the **minimum** recommended specifications for the main services. |
| 47 | +Services marked as stateful use one persistent volume (PV) per pod. |
| 48 | + |
| 49 | +>![Warn] |
| 50 | +> Avoid NFS or SMB filesystems. These are known to lead to file corruption in BaaS and do not work at all with various other services. |
| 51 | + |
| 52 | + |
| 53 | +| Service | Pods for HA (replica count) | vCPU per Pod | Memory Per Pod | Stateful PV | DiskSize (GiB) | Comments | |
| 54 | +|------------------------|-----------------------------|--------------|----------------|-------------|----------------|----------------------------------------------------------------------| |
| 55 | +| `baas-alpha` | 3 | 8 | 16 (at least) | Yes | 750 | High throughput and IOPS | |
| 56 | +| `baas-zero` | 3 | 2 | 2 | Yes | 300 | High throughput and IOPS | |
| 57 | +| `workflow` | 3 | 1 | 2 | No | N/A | HA requires 2 pods; a third pod avoids hot-key issues and balances load | |
| 58 | +| `isa95` | 2 | 2 | 1 | No | N/A | | |
| 59 | +| `keycloak-postgres` | 2 | 1 | 2 | No | 200 | Runs in pod with `keycloak` | |
| 60 | +| `keycloak` | 2 | 1 | 2 | No | N/A | | |
| 61 | +| `libre-audit-postgres` | 2 | 1 | 2 | Yes | 250 | Runs in pod with `libre-audit` | |
| 62 | +| `libre-ui` | 3 | 0.25 | 0.25 | No | N/A | | |
| 63 | +| `quest-db` | 1 | 4 | 8 | Yes | 250 | High throughput and IOPS | |
| 64 | +| `redpanda` | 3 | | | Yes | 100 | High IOPS | |
| 65 | +| `restate` | 3 | | | Yes | 50 | High throughput and IOPS | |
| 66 | +| `appsmith` | 3 | 4 | | Yes | 50 | High throughput and IOPS | |
| 67 | + |
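To compare the service table against the node recommendations, you can add up replicas × per-pod values. The following is a minimal sketch, not an official Rhize tool. It assumes that the memory column is in GiB and that the values behave like Kubernetes resource requests, neither of which the table states. The names `ServiceSpec`, `total_demand`, and `node_capacity` are hypothetical, and rows with blank vCPU or memory cells (`redpanda`, `restate`, `appsmith`) are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceSpec:
    name: str
    replicas: int
    vcpu_per_pod: float
    memory_gib_per_pod: float  # assumption: the memory column is in GiB


# Rows copied from the service table; rows with blank cells are omitted.
SERVICES = [
    ServiceSpec("baas-alpha", 3, 8, 16),
    ServiceSpec("baas-zero", 3, 2, 2),
    ServiceSpec("workflow", 3, 1, 2),
    ServiceSpec("isa95", 2, 2, 1),
    ServiceSpec("keycloak-postgres", 2, 1, 2),
    ServiceSpec("keycloak", 2, 1, 2),
    ServiceSpec("libre-audit-postgres", 2, 1, 2),
    ServiceSpec("libre-ui", 3, 0.25, 0.25),
    ServiceSpec("quest-db", 1, 4, 8),
]


def total_demand(services):
    """Return (total vCPU, total memory in GiB) across all replicas."""
    vcpu = sum(s.replicas * s.vcpu_per_pod for s in services)
    memory = sum(s.replicas * s.memory_gib_per_pod for s in services)
    return vcpu, memory


def node_capacity(nodes=3, vcpu_per_node=16, memory_gib_per_node=32):
    """Return (total vCPU, total memory in GiB) for the cluster nodes.

    Defaults mirror the minimum values in the Rhize nodes table.
    """
    return nodes * vcpu_per_node, nodes * memory_gib_per_node


if __name__ == "__main__":
    demand_vcpu, demand_mem = total_demand(SERVICES)
    cap_vcpu, cap_mem = node_capacity()
    print(f"Service demand: {demand_vcpu} vCPU, {demand_mem} GiB")
    print(f"Node capacity:  {cap_vcpu} vCPU, {cap_mem} GiB")
```

Adjust the `node_capacity` arguments to match your own cluster, and add the omitted services once you know their per-pod values. The monitoring stack and Kubernetes system components also consume capacity and are not counted here.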
| 68 | + |
| 69 | +### Monitoring stack |
| 70 | + |
| 71 | +The following table provides minimal compute recommendations for the monitoring stack. |
| 72 | + |
| 73 | +The default recommendation is to run your Rhize observability stack on the same nodes that run the Rhize application. |
| 74 | +However, some deployments prefer to separate monitoring to its own cluster. |
| 75 | + |
| 76 | +| Service | Pods for HA (replica count) | vCPU cores per pod | Memory per pod | DiskSize (GiB) | |
| 77 | +|-------------------------|-----------------------------|--------------------|----------------|----------------| |
| 78 | +| `grafana` | 3 | 0.5 | 2 | 50 | |
| 79 | +| `prometheus-node` | 4 | 0.25 | 0.05 | N/A | |
| 80 | +| `prometheus-server` | 1 per pod | 1 | 2 | 1 | |
| 81 | +| `promtail` | 4 | 0.25 | 0.2 | N/A | |
| 82 | +| `loki` | 1 | 1 | 1 | 1 | |
| 83 | +| `loki-logs` | 1 per pod | 0.25 | 0.1 | N/A | |
| 84 | +| `loki-canary` | 4 | 0.25 | 0.1 | N/A | |
| 85 | +| `loki-gateway` | 1 | 0.25 | 0.05 | 0.25 | |
| 86 | +| `loki-grafana-operator` | 1 | 0.25 | 0.1 | 0.25 | |
| 87 | +| `tempo-compactor` | 1 | 0.25 | 2 | 0.25 | |
| 88 | +| `tempo-ingester` | 3 | 0.5 | 0.75 | 1.5 | |
| 89 | +| `tempo-querier` | 1 | 0.25 | 0.5 | 0.25 | |
| 90 | +| `tempo-distributor` | 1 | 0.25 | 0.5 | 0.25 | |
| 91 | +| `tempo-query-frontend` | 1 | 0.25 | 0.5 | 0.25 | |
| 92 | +| `temp-memcache` | 1 | 0.25 | 0.1 | 0.25 | |
| 93 | + |
| 94 | +## Back up |
| 95 | + |
| 96 | +You can [back up Rhize to S3](/deploy/backup/binary/). |
| 97 | +Consider including an S3 bucket as part of your deployment. |