diff --git a/src/docs/ocean/_media/configure-automation-rule-auto-attach.png b/src/docs/ocean/_media/configure-automation-rule-auto-attach.png new file mode 100644 index 000000000..6a5028f94 Binary files /dev/null and b/src/docs/ocean/_media/configure-automation-rule-auto-attach.png differ diff --git a/src/docs/ocean/_media/configure-automation-rule-main.png b/src/docs/ocean/_media/configure-automation-rule-main.png new file mode 100644 index 000000000..dbf32af6d Binary files /dev/null and b/src/docs/ocean/_media/configure-automation-rule-main.png differ diff --git a/src/docs/ocean/_media/configure-automation-rule-when-to-apply.png b/src/docs/ocean/_media/configure-automation-rule-when-to-apply.png new file mode 100644 index 000000000..5b6156104 Binary files /dev/null and b/src/docs/ocean/_media/configure-automation-rule-when-to-apply.png differ diff --git a/src/docs/ocean/_media/right-sizing-example-table-ex.png b/src/docs/ocean/_media/right-sizing-example-table-ex.png new file mode 100644 index 000000000..bd7e1a107 Binary files /dev/null and b/src/docs/ocean/_media/right-sizing-example-table-ex.png differ diff --git a/src/docs/ocean/_media/right-sizing-rollbacks-window.png b/src/docs/ocean/_media/right-sizing-rollbacks-window.png new file mode 100644 index 000000000..981cdab80 Binary files /dev/null and b/src/docs/ocean/_media/right-sizing-rollbacks-window.png differ diff --git a/src/docs/ocean/_media/right-sizing-usage-graphs.png b/src/docs/ocean/_media/right-sizing-usage-graphs.png new file mode 100644 index 000000000..ecd1ac3ec Binary files /dev/null and b/src/docs/ocean/_media/right-sizing-usage-graphs.png differ diff --git a/src/docs/ocean/features/ocean-cluster-right-sizing-recom-tab.md b/src/docs/ocean/features/ocean-cluster-right-sizing-recom-tab.md index c44b094a4..537f6c07c 100644 --- a/src/docs/ocean/features/ocean-cluster-right-sizing-recom-tab.md +++ b/src/docs/ocean/features/ocean-cluster-right-sizing-recom-tab.md @@ -1,6 +1,6 @@ # Automatic 
Right-Sizing Recommendations and Rules -Cloud service provider relevance: EKS and AKS +Cloud service provider relevance: EKS, AKS, and GKE This topic shows you how to view right-sizing recommendations for workloads and containers and work with right-sizing rules. @@ -21,7 +21,8 @@ Your workload optimization activities impact the status of the workloads in the ## Workloads Optimization List - +right-sizing-example-table-ex + This list displays your right-sizing recommendations per workload and lets you drill down per container. * [Right Sizing rules](ocean/features/ocean-cluster-right-sizing-recom-tab?id=automation-rules-list) that are attached to specific workloads. @@ -33,9 +34,9 @@ This list displays your right-sizing recommendations per workload and lets you d * Gray (Rollback): Ocean rolled back to the original deployment request and suspended the workload's attachment to the rule. * Brown (Not Attached): The Workload is not optimized. * Workload type and names. -* vCPU and memory right sizing recommendations per deployment. Recommended increases are shown with a green up arrow, and recommended decreases are shown with a red Down arrow. -* HPA: If the workload is configured with HPA, **ON** is displayed under HPA. Hover over the entry for information about the specific HPA trigger (CPU/Memory/other). -* Potential monthly maximums savings if you adopt the recommendations. +* vCPU and memory right sizing recommendations per deployment. Recommended increases are shown with a green up arrow, and recommended decreases are shown with a red down arrow. +* HPA: If the workload is configured with HPA, **ON** appears under the HPA column. Hover over the entry for information about the specific HPA trigger (CPU/Memory/other). +* Potential maximum monthly savings if you adopt the recommendations. > **Notes**: > - Hover over the Limited and Not optimized statuses to view more details in a tooltip. 
@@ -53,8 +54,8 @@ To view a list of your potential savings and recommendations per container: * Click on the down arrow to the left of a workload to drill down to the containers. For each container, you can then view the following: - * vCPU Request: showing current and average utilization and a recommended increase or decrease for this resource (in vCPU units). If no changes are required, a Keep icon is displayed. - * Memory Request: This shows current and Average utilization and a recommended increase or decrease for this resource (in MiB units). If no changes are required, a Keep icon is displayed. + * vCPU Request: Shows current and average utilization, and a recommended increase or decrease for this resource (in vCPU units). If no changes are required, a Keep icon is displayed. + * Memory Request: Shows current and average utilization, and a recommended increase or decrease for this resource (in MiB units). If no changes are required, a Keep icon is displayed. * Right-Sizing Recommendations: Show the recommended changes in vCPU and memory. Click on the Copy icon to save these changes for later. ## Automation Rules List @@ -70,39 +71,49 @@ You can create right-sizing rules to trigger immediately after a specific set of ### Create or Edit a Right-Sizing Rule -To create/edit a right-sizing rule: +To create or edit a right-sizing rule: -1. Click the **Advanced Optimization** tab if not already displayed. -2. To create a new rule, click **+ Add new rule** above the Automation Rules list (or to edit an existing rule, click the pencil icon in the rule). +1. Click the **Advanced Optimization** tab, if it is not already displayed. +2. To create a new rule, click **+ Add new rule** above the rules list (or edit an existing rule). - + -3. 
In the Configure automation rule dialog box, enter/edit a unique rule name. +4. Select when to apply the recommendation: - * **Once available**: The recommendation is applied immediately after it becomes available. - * **Specific time**: You select when to apply the recommendation after it becomes available. + * **Once available**: Apply the recommendation immediately when it becomes available. + * **Specific time**: Apply the recommendation at a specific time after it becomes available. -![rule-when-to-apply-3](https://github.com/user-attachments/assets/5cb76163-9f33-477e-95d6-b99b36f0f200) +
+ 5. Turn on **Exclude preliminary recommendation** if you want to suppress recommendations as long as the workload has preliminary status (4 days). -6. Select one of the **Restart replicas** options: - * All manifests. - * Manifests with more than 1 replica only. - * No restart. -7. Click the **Set the resources percentage change** down arrow to apply the recommendation, and set the CPU and Memory percentage thresholds. This is the minimum percentage change from the current request for applying a recommendation. If the right-sizing recommendation exceeds the percentage threshold for either resource (CPU or Memory), it will be applied to both resources, and the resulting status will be **fully optimized**. We do this because the original purpose of the threshold is to prevent unnecessary pod deletion. However, if we need to delete a pod and relaunch a new one for one resource, we do the same for the other. -8. Click the **Set recommendation ranges for resources** down arrow and enter the upper and lower boundary values for CPU (millicpu) and Memory (MiB) requests to apply a recommendation. By default, the minimum values are 10 millicpu for CPU and 32 MiB for memory; no lower values will be accepted. +6. Select one of the **Pod modification methods**: + * Update workloads live. + * Restart all attached workloads. + * Restart all attached workloads with more than 1 replica only. + * Update workloads on recreation without triggering a restart. + + >**Note**: If you have Kubernetes 1.33 or above, Ocean can (in most cases) automatically apply the recommendations without having to restart pods. For feature limitations, see the [Kubernetes documentation](https://kubernetes.io/docs/tasks/configure-pod-container/resize-container-resources/#limitations). + +7. Set the recommendation baseline (right-sizing percentile) for the workload to calculate the vCPU and memory recommendations. +The lower the percentile, the stronger the recommendations. 
By default, both vCPU and memory use the 85th percentile. + +>**Important:** Changing the percentiles will impact any recommendations that were already applied (this may take a few minutes to update) and will also override the values set at cluster level for this workload. + +8. Click **Set the resources percentage change to apply the recommendation** (CPU and memory percentage thresholds). This is the minimum percentage change from the current request for applying a recommendation. If the right-sizing recommendation exceeds the percentage threshold for either resource (CPU or memory), it will be applied to both resources, and the resulting status will be **fully optimized**. We do this because the original purpose of the threshold is to prevent unnecessary pod deletion. However, if we need to delete a pod and relaunch a new one for one resource, we do the same for the other. +9. Click **Set recommendation ranges for resources**, and enter the upper and lower boundary values for CPU (millicpu) and memory (MiB) requests to apply a recommendation. By default, the minimum values are 10 millicpu for CPU and 32 MiB for memory; no lower values will be accepted. * If a recommendation is above the set boundaries, automatic right-sizing will apply the recommendation using the maximum value configured in the rule. * If a recommendation is below the set boundaries, automatic right-sizing will apply the recommendation using the minimum value configured in the rule. -9. Click the **Set overhead for resources** down arrow and set the CPU and memory percentage overheads. An overhead specifies the percentage of extra resources to add to the new request recommendation. -10. Ocean supports automatic right-sizing for HPA-associated workloads. To enable, click **Apply HPA on associated workload**. +10. Click **Set overhead for resources**, and set the CPU and memory percentage overhead. An overhead specifies the percentage of extra resources to add to the new request recommendation. 
+11. Ocean supports automatic right-sizing for workloads associated with HPA. Click **Apply HPA on associated workload** to enable. -11. Turn on **Auto-attach** if you want to automatically attach rules to workloads based on selected criteria. - * In the Auto-attach area, select required namespaces / labels. +12. Turn on **Auto-attach** if you want to automatically attach rules to workloads based on selected criteria. + * In the Auto-attach area, select required namespaces and/or labels. - + -12. After you save the rule, it appears in the area under the [Workloads Optimization list](https://docs.spot.io/ocean/features/ocean-cluster-right-sizing-recom-tab?id=workloads-optimization-list). +13. A saved rule appears in the area under the [Workloads Optimization list](https://docs.spot.io/ocean/features/ocean-cluster-right-sizing-recom-tab?id=workloads-optimization-list). > **Notes**: > - Default values for Overhead and Automation Threshold are **10%** and **5%** respectively. @@ -124,9 +135,9 @@ To manually attach a rule: 1. Select one or more workloads in the Workloads Optimization list. 2. From the Actions drop-down menu above the table, click **Attach Rule**. -![attach-rule-to-workload](https://github.com/user-attachments/assets/be315afa-0ef8-4d30-b1f3-422e8caf8633) + ![attach-rule-to-workload](https://github.com/user-attachments/assets/be315afa-0ef8-4d30-b1f3-422e8caf8633) -3. You can either attach an existing or new rule you create from scratch (a new rule will be attached to the workload(s) you selected earlier): +3. You can either attach an existing rule or create a new rule from scratch (a new rule will be attached to the workload(s) you selected earlier): * Click the **Select from existing rule** drop-down menu and then select a rule. 
* Click **Create new rule from scratch** (see [Create or Edit a Right-Sizing Rule](ocean/features/ocean-cluster-right-sizing-recom-tab?id=create-or-edit-a-right-sizing-rule)) @@ -144,10 +155,10 @@ To detach a rule from one or more workloads: ### Delete a Right-Sizing Rule -To delete a right sizing rule: +To delete a right-sizing rule: 1. To the right of the row for the rule in the list, click the wastebasket icon. -2. When the confirmation message appears, Click **Delete**, or **Cancel** (if you are unsure). +2. When the confirmation message appears, click **Delete**, or **Cancel** (if you are unsure). >**Important**: You cannot restore a deleted right-sizing rule. In addition, a rule may be deleted only if it is no longer attached to a workload. @@ -159,9 +170,9 @@ To acknowledge a workload rollback: 1. Click **Acknowledge Rollback** to view all the workloads with the rollback status. -![right-sozomg-rollback-dialog](https://github.com/user-attachments/assets/4bb206f5-73e3-4b26-b7fb-19e5e519505f) + -* The rollback drill-down list contains the following information: +* The rollback list includes the following information: * Workload Name. * Namespace. * CPU Update in vCPUs (before and after rollback). @@ -173,37 +184,29 @@ To acknowledge a workload rollback: The workloads are displayed in the [Workloads Optimization List](https://docs.spot.io/ocean/features/ocean-cluster-right-sizing-recom-tab?id=workloads-optimization-list) without any attached rules. Before attaching a rule to a rolled-back workload, first fix the issue. -### Set the vCPU/Memory Percentile - -You can select the right-sizing percentile settings to calculate the vCPU and memory recommendations. -The lower the percentile, the stronger the recommendations. - -By default: - -* vCPU: Right-sizing uses the 85th percentile. -* Memory: Right-sizing uses the maximum value. 
- ->**Important:** Changing the percentile setting will impact the recommendations that were already applied (this may take a few minutes to update). +### Set the vCPU/Memory Percentiles at Cluster Level -To change settings: +These are the global percentile settings at the cluster level and apply to all workloads in the cluster. -1. Click **Settings** above the [workloads optimization list](https://docs.spot.io/ocean/features/ocean-cluster-right-sizing-recom-tab?id=workloads-optimization-list). +Any percentile change you make for a specific workload in a right-sizing rule overrides the setting at cluster level. - +To change the settings: -2. Click the arrow on the right for **vCPU** or **Memory** as required (vCPU shown in the example). +1. Click **Cluster Settings** above the [workloads optimization list](https://docs.spot.io/ocean/features/ocean-cluster-right-sizing-recom-tab?id=workloads-optimization-list). + +2. Click **vCPU** or **Memory** as required (vCPU is shown in the example). - + 3. Change the current value(s) and save. ## Best Practices -These are the Right-Sizing Best Practices: +These are the right-sizing best practices: * Workload limits should not have the same values as requests. * If you set overheads for resources, start with a relatively high overhead (20%) and decrease it with time. -* If you set boundaries (recommendation ranges for resources), avoid applying the specific rule to all workloads. All services have different purposes. +* If you set boundaries (recommendation ranges for resources), avoid applying the same rule to all workloads. All services have different purposes. 
## Related Topics diff --git a/src/docs/ocean/features/ocean-cluster-right-sizing-savings-tab.md b/src/docs/ocean/features/ocean-cluster-right-sizing-savings-tab.md index 62367a34e..ea0b71339 100644 --- a/src/docs/ocean/features/ocean-cluster-right-sizing-savings-tab.md +++ b/src/docs/ocean/features/ocean-cluster-right-sizing-savings-tab.md @@ -1,6 +1,6 @@ # Automatic Right-Sizing Actual Savings -Cloud service provider relevance: EKS and AKS +Cloud service provider relevance: EKS, AKS, and GKE This topic shows you how to view your (actual) right-sizing savings from applying down-sizing recommendations to your workloads. diff --git a/src/docs/ocean/features/ocean-cluster-right-sizing-tab.md b/src/docs/ocean/features/ocean-cluster-right-sizing-tab.md index eeef69818..edf028a20 100644 --- a/src/docs/ocean/features/ocean-cluster-right-sizing-tab.md +++ b/src/docs/ocean/features/ocean-cluster-right-sizing-tab.md @@ -1,6 +1,6 @@ # Ocean Cluster Automatic Right Sizing -Cloud service provider relevance: EKS and AKS +Cloud service provider relevance: EKS, AKS, and GKE To help you improve the efficiency and performance of your cloud environments, Ocean’s rightsizing capabilities provide recommendations that target over-provisioning and underutilization. @@ -16,20 +16,23 @@ To opt-in and turn on the full capabilities of this powerful feature, [Contact S ## Prerequisites -Before you attempt to fine-tune your cluster resources according to Ocean's recommendation, you will need: +Before you attempt to fine-tune your cluster resources according to Ocean recommendations, you will need: * A Spot account. -* Ocean cluster managing your Kubernetes worker nodes. +* Ocean cluster managing your Kubernetes worker nodes. * [Ocean Controller Version 2.0.52 and above](https://docs.spot.io/ocean/tutorials/ocean-controller-v2/) installed and running. * Make sure to install the [Metrics Server](https://github.com/kubernetes-incubator/metrics-server#deployment). 
-* Vertical Pod Autoscaler project (VPA) Version 1.0.0 and above installed on your cluster. If the VPA is not already running on your cluster, run the following helm commands: -```sh +* Kubernetes 1.33 and above for the option to apply automatic recommendations without having to restart pods. See [How it works](link TBD). +* Vertical Pod Autoscaler project (VPA) 1.4.1. If you need to upgrade, see [Upgrade VPA](link TBD). If the VPA is not already running on your cluster, run the following helm commands: + + ```sh. + + helm repo add spot https://charts.spot.io + helm repo update + helm install spot/ocean-vpa + ``` -helm repo add spot https://charts.spot.io -helm repo update -helm install spot/ocean-vpa -``` >**Note**: To turn on automatic right-sizing, contact your [support](https://spot.io/support/) team via email or chat. ## How It Works @@ -44,6 +47,12 @@ The output produces a single point-in-time data point for each pod. Ocean then a Using the per-workload container aggregated data points, Ocean makes recommendations based on a mechanism that attempts to even out peaks and troughs in resource demand. The Right-Sizing engine runs every hour to generate new recommendations and update existing ones. +
+ +Ocean can automatically apply these recommendations to your workloads according to your requirements. + +If you have Kubernetes 1.33 or above (see Prerequisites), Ocean provides the option to change the CPU / memory allocation of container(s) within a running Pod while potentially avoiding application disruption. As such, Ocean automatically applies its recommendations without restarting the pods. This feature is subject to [Kubernetes limitations](https://kubernetes.io/docs/tasks/configure-pod-container/resize-container-resources/#limitations). + Recommendations for decreasing and increasing memory or CPU requests are based on the percentile defined for the cluster (the default is the 85th percentile). Ocean handles the right-sizing workload limits as follows: @@ -110,7 +119,7 @@ This panel contains two widgets: * vCPU usage in the last 2 weeks: Displays graphs for used, allocated, and recommended vCPU usage based on data from the last 2 weeks. * Memory usage in the last 2 weeks: Displays graphs for used, allocated, and recommended memory usage based on data from the last 2 weeks. - + Hover over a data point in the **vCPU usage in the last 2 weeks** widget to view usage details: >**Note**: The default **85th percentile vCPU usage** and **Maximum memory usage** options are used to calculate the right-sizing recommendations for all usage parameters. @@ -123,16 +132,21 @@ Hover over a data point in the **vCPU usage in the last 2 weeks** widget to view * Average vCPU usage * Suggested vCPU usage based on data from the last 2 weeks. Hover over a data point in the **Memory usage in the last 2 weeks** widget to view: + * Allocated memory usage in GiB based on data from the last 2 weeks. * Actual memory usage in GiB based on data from the last 2 weeks (you can change the default from the **Usage drop-down menu**). 
- * Maximum memory usage in GiB (**default**) + * 85th percentile vCPU usage (**default**) + * 95th percentile vCPU usage + * 90th percentile vCPU usage + * Maximum memory usage in GiB * Average memory usage in GiB * Suggested memory usage in GiB based on data from the last 2 weeks. ## Related Topics * [Right-Sizing Troubleshooting](https://docs.spot.io/ocean/features/troubleshoot-right-sizing) -* [Right-Sizing Rules and Reommendations](https://docs.spot.io/ocean/features/ocean-cluster-right-sizing-recom-tab) +* [Right-Sizing Rules and Recommendations](https://docs.spot.io/ocean/features/ocean-cluster-right-sizing-recom-tab) +* [Right-Sizing Savings Panel](https://docs.spot.io/ocean/features/ocean-cluster-right-sizing-savings-tab) diff --git a/src/docs/ocean/features/troubleshoot-right-sizing.md b/src/docs/ocean/features/troubleshoot-right-sizing.md index 29a2e9462..44eb4f3d0 100644 --- a/src/docs/ocean/features/troubleshoot-right-sizing.md +++ b/src/docs/ocean/features/troubleshoot-right-sizing.md @@ -1,6 +1,6 @@ # Automatic Right-Sizing-Troubleshooting -Cloud service provider relevance: EKS, AKS +Cloud service provider relevance: EKS, AKS, and GKE ## VPA not reporting message appears at the top of the right-sizing page diff --git a/src/docs/ocean/features/update-vertical-pod-autoscaler-project.md b/src/docs/ocean/features/update-vertical-pod-autoscaler-project.md new file mode 100644 index 000000000..4c8059052 --- /dev/null +++ b/src/docs/ocean/features/update-vertical-pod-autoscaler-project.md @@ -0,0 +1,27 @@ +# Update the Vertical Pod Autoscaler Project (VPA) + +To use Ocean's automatic right-sizing feature, you need Vertical Pod Autoscaler project (VPA) 1.4.1 or above installed on your cluster. + +If you need to upgrade VPA, follow these instructions: + +1. Update your local Helm chart repository cache. + + ```sh + helm repo update + ``` + +2. Run this command to update the `ocean-vpa change ` and `` fields according to your cluster. 
+ + ```sh + helm upgrade --install --wait spot/ocean-vpa \ + --namespace + ``` + +3. Update the `vpa crd` with the latest version. + + ```sh + kubectl apply -f https://raw.githubusercontent.com/spotinst/charts/refs/tags/main/charts/ocean-vpa/crds/vpa-crd.yaml + ``` + + +