index.json

[
{
	"uri": "/",
	"title": "Overview",
	"tags": [],
	"description": "",
	"content": "Cost-optimize for stateless workloads on EKS by leveraging Spot instance, Kubernetes built-in features and Karpenter Overall In this workshop, we will explore practical strategies for optimizing the cost of running workloads on Amazon EKS. By leveraging Spot Instances and the power of Karpenter, an open-source Kubernetes cluster autoscaler, you can significantly reduce your infrastructure costs without sacrificing performance.\nAdditionally, we will cover some native Kubernetes features that can increase availability, handle problem that Spot instance create - Spot instance interruption. Whether you are new to EKS or looking to fine-tune your cluster management, this session will provide you with valuable insights and hands-on experience to maximize your cloud savings.\nArchitecture overview This is the architecture diagram: Content Introduction Prerequisite Configure Karpenter Pod topology spread constraints Pod disruption budgets Demo interruption Clean up resources "
},
{
	"uri": "/2-prerequiste/2.1-awscli/",
	"title": "AWS CLI",
	"tags": [],
	"description": "",
	"content": "Before configure aws cli, you must have an IAM user that has these permissions:\nAWS service Access level EKS Full Access CloudFormation Full Access EC2 Full Access IAM Limited: List, Read, Write, Permission Management Install aws cli Step 1: Update Apt Repository For installing the AWS CLI on Ubuntu 24.04 using the pip installer, first update the Apt repository using the following command:\nsudo apt-get update Step 2: Install Pip Installer The Python provides the pip installer that can be installed via the following command:\nsudo apt-get install python3-pip Step 3: Install AWS CLI The following command will install the AWS CLI using the pip 3 installer. The “–break-system-packages” prevents the “externally managed environment” error and ensures the smooth installation:\npip3 install awscli --break-system-packages Step 4: Verify Installation Use the following command to determine which version of AWS CLI is installed on the system:\npip show awscli Configure aws cli: Copy IAM user’s access key and secret access key that we have created, run this command:\naws configure Then paste those contents\n"
},
{
	"uri": "/1-introduce/",
	"title": "Introduction",
	"tags": [],
	"description": "",
	"content": "Objective In today’s cloud-driven landscape, cost optimization is a critical aspect of managing scalable and efficient Kubernetes workloads. This workshop focuses on using Karpenter to optimize costs for Amazon Elastic Kubernetes Service (EKS) by leveraging Spot Instances. By the end of this workshop, you will be able to:\nInstall and configure Karpenter Integrate Kubernetes features that help solve interruption problem caused by EC2 Spot instances Prerequisite knowledges Spot instances AWS EC2 Spot Instances provide a cost-effective way to run workloads in the cloud by allowing users to take advantage of unused EC2 capacity at a significantly lower price. With Spot Instances, you can save up to 90% compared to On-Demand pricing. However, these instances can be interrupted by AWS when they are needed by On-Demand users. This makes Spot Instances ideal for stateless, fault-tolerant, and flexible applications such as big data processing, containerized workloads, CI/CD pipelines, and more. By using Spot Instances effectively, you can dramatically reduce your cloud computing costs while still maintaining high performance.\nKarpenter Karpenter is an open-source, flexible, and efficient Kubernetes cluster autoscaler designed to optimize the scaling of your workloads in the cloud. Unlike traditional autoscalers, Karpenter dynamically provisions just the right compute resources by observing real-time usage patterns and intelligently launching or terminating nodes as needed. It supports a wide range of compute options, including Spot Instances, which helps reduce cloud costs significantly.\nKarpenter also integrates seamlessly with Kubernetes, enabling faster scaling decisions and better utilization of resources. 
This makes it ideal for modern, dynamic cloud-native applications that require efficient and cost-effective scaling.\nKubernetes features Kubernetes natively supports two features that increase availability, and we will use them to handle Spot Instance interruptions:\nPod topology spread constraints. Pod disruption budgets. Details for these two features are described in later sections.\nArchitecture overview In this workshop, we will:\nDeploy an EKS cluster Create an SQS queue and related events Configure Karpenter Leverage Kubernetes features "
},
{
	"uri": "/2-prerequiste/2.2-othercli/",
	"title": "Other CLI",
	"tags": [],
	"description": "",
	"content": "Install kubectl Update the apt package index and install packages needed to use the Kubernetes apt repository:\nsudo apt-get update sudo apt-get install -y apt-transport-https ca-certificates curl gnupg Download the public signing key for the Kubernetes package repositories. The same signing key is used for all repositories so you can disregard the version in the URL:\ncurl -fsSL https://pkgs.k8s.io/core:/stable:/v1.30/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg sudo chmod 644 /etc/apt/keyrings/kubernetes-apt-keyring.gpg Add the appropriate Kubernetes apt repository. If you want to use Kubernetes version different than v1.30, replace v1.30 with the desired minor version in the command below:\necho \u0026#39;deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.30/deb/ /\u0026#39; | sudo tee /etc/apt/sources.list.d/kubernetes.list sudo chmod 644 /etc/apt/sources.list.d/kubernetes.list Update apt package index, then install kubectl:\nsudo apt-get update sudo apt-get install -y kubectl Install eksctl Run following commands to install eksctl:\n# for ARM systems, set ARCH to: `arm64`, `armv6` or `armv7` ARCH=amd64 PLATFORM=$(uname -s)_$ARCH curl -sLO \u0026#34;https://github.com/eksctl-io/eksctl/releases/latest/download/eksctl_$PLATFORM.tar.gz\u0026#34; # (Optional) Verify checksum curl -sL \u0026#34;https://github.com/eksctl-io/eksctl/releases/latest/download/eksctl_checksums.txt\u0026#34; | grep $PLATFORM | sha256sum --check tar -xzf eksctl_$PLATFORM.tar.gz -C /tmp \u0026amp;\u0026amp; rm eksctl_$PLATFORM.tar.gz sudo mv /tmp/eksctl /usr/local/bin Install Helm cli Members of the Helm community have contributed a Helm package for Apt. This package is generally up to date. 
Run the following commands to install Helm:\ncurl https://baltocdn.com/helm/signing.asc | gpg --dearmor | sudo tee /usr/share/keyrings/helm.gpg \u003e /dev/null sudo apt-get install apt-transport-https --yes echo \"deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/helm.gpg] https://baltocdn.com/helm/stable/debian/ all main\" | sudo tee /etc/apt/sources.list.d/helm-stable-debian.list sudo apt-get update sudo apt-get install helm "
},
{
	"uri": "/2-prerequiste/",
	"title": "Preparation",
	"tags": [],
	"description": "",
	"content": "Overview In this section, you will do following tasks: Install kubectl on your local machine Install aws cli Install eksctl on your local machine Install Helm cli on your local machine\nIn this workshop, these cli tools are installed on Ubuntu 24.04\nAWS CLI The AWS Command Line Interface (AWS CLI) is an open source tool from Amazon Web Services (AWS). You can use it to interact with AWS services using commands in your command line shell.\nWith minimal configuration, you can use the AWS CLI to commands that implement functionality equivalent to that provided by the browser-based AWS Management Console.\nkubectl The Kubernetes command-line tool, kubectl, allows you to run commands against Kubernetes clusters. You can use kubectl to deploy applications, inspect and manage cluster resources, and view logs.\neksctl Eksctl is a user-friendly command-line tool that makes it easy to create and manage EKS clusters on AWS.\nHelm Helm is a package manager for Kubernetes that simplifies the deployment and management of applications. By using Helm, you can package all the Kubernetes resources required to run an application into a single chart, making it easy to deploy and manage applications across different environments.\nContent AWS CLI Other CLI "
},
{
	"uri": "/3-karpenter/",
	"title": "Configure Karpenter",
	"tags": [],
	"description": "",
	"content": "Overview In this section, we will create an EKS cluster and install Karpenter on it using CLI.\nAt the time this workshop is written, the latest Kubernetes stable version is 1.30 and latest Karpenter version is 0.37. In the future, there will be different versions.\nEnvironment variables Firstly, we must export some environment variables for later using:\nexport KARPENTER_NAMESPACE=\u0026#34;kube-system\u0026#34; export KARPENTER_VERSION=\u0026#34;0.37.0\u0026#34; export K8S_VERSION=\u0026#34;1.30\u0026#34; export AWS_PARTITION=\u0026#34;aws\u0026#34; # if you are not using standard partitions, you may need to configure to aws-cn / aws-us-gov export CLUSTER_NAME=\u0026#34;${USER}-karpenter-demo\u0026#34; export AWS_DEFAULT_REGION=\u0026#34;us-west-2\u0026#34; export AWS_ACCOUNT_ID=\u0026#34;$(aws sts get-caller-identity --query Account --output text)\u0026#34; export TEMPOUT=\u0026#34;$(mktemp)\u0026#34; export AMD_AMI_ID=\u0026#34;$(aws ssm get-parameter --name /aws/service/eks/optimized-ami/${K8S_VERSION}/amazon-linux-2/recommended/image_id --query Parameter.Value --output text)\u0026#34; Karpenter\u0026rsquo;s prerequisites By default, Karpenter needs a SQS queue to listen for spot-interruption events and policy to access EC2 API. 
We will deploy an SQS queue, the required event rules, and the required role for the Karpenter controller using CloudFormation:\ncurl -fsSL https://raw.githubusercontent.com/aws/karpenter-provider-aws/v\"${KARPENTER_VERSION}\"/website/content/en/preview/getting-started/getting-started-with-karpenter/cloudformation.yaml \u003e \"${TEMPOUT}\" \\ \u0026\u0026 aws cloudformation deploy \\ --stack-name \"Karpenter-${CLUSTER_NAME}\" \\ --template-file \"${TEMPOUT}\" \\ --capabilities CAPABILITY_NAMED_IAM \\ --parameter-overrides \"ClusterName=${CLUSTER_NAME}\" Creating EKS cluster In the next step, we will create an EKS cluster, install the eks-pod-identity-agent add-on, create a Pod Identity association for the service account later used by the Karpenter controller, and create a managed node group, all by running this command:\neksctl create cluster -f - \u003c\u003cEOF --- apiVersion: eksctl.io/v1alpha5 kind: ClusterConfig metadata: name: ${CLUSTER_NAME} region: ${AWS_DEFAULT_REGION} version: \"${K8S_VERSION}\" tags: karpenter.sh/discovery: ${CLUSTER_NAME} iam: withOIDC: true podIdentityAssociations: - namespace: \"${KARPENTER_NAMESPACE}\" serviceAccountName: karpenter roleName: ${CLUSTER_NAME}-karpenter permissionPolicyARNs: - arn:${AWS_PARTITION}:iam::${AWS_ACCOUNT_ID}:policy/KarpenterControllerPolicy-${CLUSTER_NAME} iamIdentityMappings: - arn: \"arn:${AWS_PARTITION}:iam::${AWS_ACCOUNT_ID}:role/KarpenterNodeRole-${CLUSTER_NAME}\" username: system:node:{{EC2PrivateDNSName}} groups: - system:bootstrappers - system:nodes managedNodeGroups: - instanceType: m5.large amiFamily: AmazonLinux2 name: ${CLUSTER_NAME}-ng desiredCapacity: 2 minSize: 1 maxSize: 10 addons: - name: eks-pod-identity-agent EOF Install Karpenter Now, we can deploy Karpenter on our cluster by using Helm:\nhelm upgrade --install karpenter 
oci://public.ecr.aws/karpenter/karpenter --version \"${KARPENTER_VERSION}\" --namespace \"${KARPENTER_NAMESPACE}\" --create-namespace \\ --set \"settings.clusterName=${CLUSTER_NAME}\" \\ --set \"settings.interruptionQueue=${CLUSTER_NAME}\" \\ --set controller.resources.requests.cpu=1 \\ --set controller.resources.requests.memory=1Gi \\ --set controller.resources.limits.cpu=1 \\ --set controller.resources.limits.memory=1Gi \\ --wait Configure Karpenter Deploying Karpenter with Helm installs two custom resource definitions: NodePool and NodeClass (EC2NodeClass on AWS). For the Karpenter controller to operate correctly, you must create at least one NodePool and one NodeClass.\nA NodePool in Karpenter describes a collection of nodes that share similar configuration attributes, such as instance types, sizes, or labels. When Karpenter needs to scale up the cluster, it creates new nodes in a matching NodePool based on the current demand from pods needing resources. This flexibility allows for more efficient resource management and helps optimize costs by ensuring that only the required amount of compute resources is provisioned.\nA NodeClass is a resource referenced by a NodePool that configures AWS-specific settings, such as the AMI, the subnets in which nodes are created, and the security groups for nodes.\nIn this workshop, we use Spot Instances to save cost. When a NodePool allows both capacity types, Karpenter prioritizes Spot Instances over On-Demand Instances by default. 
Although Karpenter can quickly replace reclaimed nodes with new ones, if you want to decrease the probability of Spot Instance interruption, you can consult the Amazon EC2 Spot Instance Advisor and choose the instance types with the lowest interruption frequency.\nHere we create a default NodePool and NodeClass by running this command:\ncat \u003c\u003cEOF | envsubst | kubectl apply -f - apiVersion: karpenter.sh/v1beta1 kind: NodePool metadata: name: default spec: template: spec: requirements: - key: kubernetes.io/arch operator: In values: [\"amd64\"] - key: kubernetes.io/os operator: In values: [\"linux\"] - key: karpenter.sh/capacity-type operator: In values: [\"spot\", \"on-demand\"] - key: karpenter.k8s.aws/instance-category operator: In values: [\"c\", \"m\", \"r\"] - key: karpenter.k8s.aws/instance-generation operator: Gt values: [\"2\"] nodeClassRef: apiVersion: karpenter.k8s.aws/v1beta1 kind: EC2NodeClass name: default limits: cpu: 1000 disruption: consolidationPolicy: WhenUnderutilized expireAfter: 720h # 30 * 24h = 720h --- apiVersion: karpenter.k8s.aws/v1beta1 kind: EC2NodeClass metadata: name: default spec: amiFamily: AL2 # Amazon Linux 2 role: \"KarpenterNodeRole-${CLUSTER_NAME}\" # replace with your cluster name subnetSelectorTerms: - tags: karpenter.sh/discovery: \"${CLUSTER_NAME}\" # replace with your cluster name securityGroupSelectorTerms: - tags: karpenter.sh/discovery: \"${CLUSTER_NAME}\" # replace with your cluster name amiSelectorTerms: - id: \"${AMD_AMI_ID}\" # only the amd64 AMI is needed because the NodePool is restricted to amd64 EOF "
},
{
	"uri": "/4-topology_constraint/",
	"title": "Pod topology spread constraints",
	"tags": [],
	"description": "",
	"content": "What is Pod Topology Spread Constraints? Pod topology spread constraint is a built-in feature from Kubernetes, it brings us the ability to control how pods inside a deployment can spread across our cluster among failure-domains such as zones, nodes,.. This can help us increase our cluster’s availability.\nIn this section, I will show you how to spread pods in a deployment across availability zones like in this diagram:\nWhy use Pod Topology Spread Constraints? Without proper control over Pod placement, Kubernetes might schedule multiple Pods of the same application on the same node or within the same Availability Zone. This clustering can lead to a single point of failure, making your application vulnerable to disruptions if that node or zone becomes unavailable - a common situation with Spot instance\nHow to configure Pod topology spread constraints? In this workshop, I will demo how to leverage this feature to spread our pods across availability zones, because our cluster is created on us-west-2 region which has 3 availability zones, our deployment must be deployed on at least three nodes.\nCurrently, our cluster has only two nodes in the managed node group. 
We’ll define constraints for a deployment that ensure our Pods are distributed across all three zones.\nHere’s an example of how to define topology spread constraints in the Pod template of a Deployment:\ncat \u003c\u003c EOF | kubectl apply -f - apiVersion: apps/v1 kind: Deployment metadata: name: nginx-deployment spec: selector: matchLabels: app: nginx replicas: 8 template: metadata: labels: app: nginx spec: topologySpreadConstraints: - maxSkew: 1 topologyKey: topology.kubernetes.io/zone whenUnsatisfiable: DoNotSchedule labelSelector: matchLabels: app: nginx containers: - name: nginx image: nginx:1.14.2 resources: requests: memory: \"1024Mi\" cpu: \"500m\" EOF In this example:\nmaxSkew: 1 ensures that the number of matching Pods in any two zones differs by no more than one. topologyKey: topology.kubernetes.io/zone specifies that the Pods should be distributed across different Availability Zones. whenUnsatisfiable: DoNotSchedule tells Kubernetes not to schedule a Pod if the constraint cannot be satisfied, avoiding overloading a single zone. Because this field is set to DoNotSchedule, Pods that would violate the constraint stay Pending, which triggers Karpenter to create new nodes in another zone. labelSelector: we set this to the pod’s label, so that the constraint counts all the pods in this deployment. Even when pods are spread across Availability Zones, there is still a possibility, although a very low one, that the nodes in all zones are interrupted at the same time. 
If you want to ensure the application is always available, you might consider deploying a hybrid model, meaning spreading the pods across two capacity types, on-demand and spot, by using another value for topologyKey: karpenter.sh/capacity-type.\nHere’s an example that spreads pods across capacity types:\ncat \u003c\u003c EOF | kubectl apply -f - apiVersion: apps/v1 kind: Deployment metadata: name: nginx-deployment spec: selector: matchLabels: app: nginx replicas: 8 template: metadata: labels: app: nginx spec: topologySpreadConstraints: - maxSkew: 1 topologyKey: karpenter.sh/capacity-type whenUnsatisfiable: DoNotSchedule labelSelector: matchLabels: app: nginx containers: - name: nginx image: nginx:1.14.2 resources: requests: memory: \"1024Mi\" cpu: \"500m\" EOF "
},
{
	"uri": "/5-pod-db/",
	"title": "Pod disruption budget",
	"tags": [],
	"description": "",
	"content": "What is Pod disruption budget? Pod disruption budget is a built-in feature that Kubernetes provides, helping us run higher available applications. Pod Disruption Budgets (PDBs) allow you to control the impact of voluntary disruptions, such as node updates, scaling events, or cluster upgrades, by limiting how many Pods of a particular application can be down at any given time.\nWhy we need PDB? Without a PDB in place, Kubernetes might inadvertently terminate too many Pods of a critical application during node maintenance or scaling operations, causing downtime or service degradation. PDBs help you maintain a minimum level of availability during planned disruptions by enforcing limits on Pod evictions. This is especially important when using Spot Instances, where Pods may be disrupted more frequently due to the nature of spot pricing.\nWhen using Karpenter to dynamically scale your Amazon EKS cluster, Pod Disruption Budgets play a key role in maintaining application availability during scaling events. Karpenter respects PDBs when scaling down nodes, ensuring that your disruption limits are honored.\nPod Disruption Budgets provide the following benefits:\nMaintained Application Availability: Ensures that a specified number of Pods remain available during voluntary disruptions. Controlled Evictions: Limits the number of Pods that can be evicted simultaneously during cluster maintenance or scaling. Improved Reliability: Helps protect workloads from unexpected availability drops caused by too many concurrent Pod evictions. Configuring Pod Disruption Budgets In this section, we’ll learn how to define and configure a Pod Disruption Budget to protect your applications from excessive downtime during disruptions. 
A PDB can be configured by specifying either a minimum number of available Pods or a maximum number of unavailable Pods at any given time.\nHere’s an example of a PDB configuration:\napiVersion: policy/v1 kind: PodDisruptionBudget metadata: name: nginx spec: minAvailable: 3 selector: matchLabels: app: nginx In this example:\nminAvailable: 3 ensures that at least 3 Pods remain available during any voluntary disruption (e.g., node maintenance or scaling). The selector field matches Pods labeled with app: nginx, applying the PDB to those Pods. Alternatively, you could specify maxUnavailable if you prefer to define the maximum number of Pods that can be disrupted:\napiVersion: policy/v1 kind: PodDisruptionBudget metadata: name: example-pdb spec: maxUnavailable: 1 selector: matchLabels: app: example After saving the manifest to a file such as pdb.yaml, you can deploy it like any other Kubernetes resource:\nkubectl apply -f pdb.yaml "
},
{
	"uri": "/6-demo/",
	"title": "Demo interruption",
	"tags": [],
	"description": "",
	"content": "Overview In this section, we will simulate Spot instance interruption with FIS, then verify that our application is still serving request and see how Karpenter recover interruption node.\nPrerequisite In previous section, we had deployed a nginx deployment that spread accross multiple zones, but that is still not enough, we need to deployed a service with type LoadBalancer. You can use the following manifest to deploy that service:\napiVersion: v1 kind: Service metadata: name: my-service spec: selector: app: nginx ports: - protocol: TCP port: 80 targetPort: 9376 clusterIP: 10.0.171.239 type: LoadBalancer Demo interruption process Cluster\u0026rsquo;s current state Currently, our application is deployed on 3 Spot instances:\nThis tool I\u0026rsquo;m using to monitor EKS cluster is eks-node-viewer\nOur application now is basically Nginx\u0026rsquo;s default page: Node interruption Navigate to EC2\u0026rsquo;s console, select Spot requests under Instances section: Select a node, then click on Action -\u0026gt; Initiate interruption: For the first time, we need to create a default role for FIS, so click on Default role and select Initiate interruption: That node will be terminated: Very quickly, another node will be spinning up and unscheduled pods will be deployed: During that process, because our pods are spread accross nodes, so it is still available: "
},
{
	"uri": "/7-cleanup/",
	"title": "Clean up resources",
	"tags": [],
	"description": "",
	"content": "Clean up resources Remove Karpenter and delete EKS cluster by running this command: helm uninstall karpenter --namespace \u0026#34;${KARPENTER_NAMESPACE}\u0026#34; eksctl delete cluster --name \u0026#34;${CLUSTER_NAME}\u0026#34; Delete the SQS queue and required IAM roles with this command: aws cloudformation delete-stack --stack-name \u0026#34;Karpenter-${CLUSTER_NAME}\u0026#34; aws ec2 describe-launch-templates --filters \u0026#34;Name=tag:karpenter.k8s.aws/cluster,Values=${CLUSTER_NAME}\u0026#34; | jq -r \u0026#34;.LaunchTemplates[].LaunchTemplateName\u0026#34; | xargs -I{} aws ec2 delete-launch-template --launch-template-name {} "
},
{
	"uri": "/categories/",
	"title": "Categories",
	"tags": [],
	"description": "",
	"content": ""
},
{
	"uri": "/tags/",
	"title": "Tags",
	"tags": [],
	"description": "",
	"content": ""
}]