How to Install and Use Karpenter in EKS
Karpenter is an open-source Kubernetes node autoscaler created by AWS, designed to improve efficiency and cost savings by provisioning and de-provisioning nodes dynamically based on workloads. I have been using Karpenter since its first few versions, when it was quite limited in features, but after the v1.0 release I believe it has become the major player in this field. This article is intended for folks who have traditionally used Cluster Autoscaler and shows how to install Karpenter and perform basic operations with it.
But first, why Karpenter over Cluster Autoscaler?
Cluster Autoscaler (CA) works by scaling node groups based on pod scheduling failures, but it has a few limitations:
- Tied to ASGs: CA relies on AWS Auto Scaling Groups (ASGs), making scaling decisions slower.
- Fixed node types: Nodes in CA are predefined, limiting flexibility.
- Inefficient resource allocation: CA does not dynamically optimize instance selection, potentially leading to wasted resources.
Advantages of Karpenter:
- Direct EC2 instance provisioning: Karpenter does not require ASGs and provisions instances directly through EC2.
- Faster scaling: Unlike CA, Karpenter reacts immediately to pending pods, minimizing scheduling delays.
- Flexible instance selection: It chooses the most efficient EC2 instance type based on workload requirements.
- Automatic node termination: When nodes are idle, Karpenter deprovisions them automatically, reducing costs.
- Better spot instance utilization: Karpenter supports mixed instance types and Spot Instances more effectively.
How Karpenter Works
Karpenter listens for unscheduled pods and provisions the best-fitting compute capacity in real time; a sample workload that exercises this flow is shown after the list:
- Pod watch: It monitors the Kubernetes API for unscheduled pods.
- Instance selection: Karpenter selects the most cost-effective instance type based on constraints and requirements.
- Instance provisioning: It launches the instance directly using the EC2 API.
- Node registration: The new node joins the cluster, and Karpenter binds pods to it.
- Node de-provisioning: When nodes become unnecessary, Karpenter automatically removes them.
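A quick way to watch this loop in action is to deploy something that cannot fit on the existing nodes. The Deployment below is only an illustration: the name inflate, the pause image, and the 1-CPU request are arbitrary choices, but the resource requests are what force the pods to go Pending and trigger Karpenter (once it is installed and configured as described in the rest of this article).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inflate
spec:
  replicas: 5
  selector:
    matchLabels:
      app: inflate
  template:
    metadata:
      labels:
        app: inflate
    spec:
      containers:
        - name: inflate
          # a do-nothing container; only the CPU request matters here
          image: public.ecr.aws/eks-distro/kubernetes/pause:3.7
          resources:
            requests:
              cpu: "1"
Scale the replicas up and down and watch nodes appear and disappear accordingly.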
Next, let’s look at how to install and use Karpenter. The first decision is where to run it: in its own node group or on Fargate (hopefully AWS will soon offer a managed solution for Karpenter so we don’t have to worry about this at all). I like the Fargate option because it is simple and doesn’t require a node group just for Karpenter.
Install Karpenter on AWS Fargate
Karpenter can be installed on an AWS Fargate node to manage compute resources dynamically. Follow these steps to set it up:
Create a Fargate Profile for Karpenter
Ensure that your EKS cluster has a Fargate profile for the karpenter namespace:
aws eks create-fargate-profile --cluster-name my-cluster --fargate-profile-name karpenter-profile \
  --pod-execution-role-arn arn:aws:iam::ACCOUNT_ID:role/AmazonEKSFargatePodExecutionRole \
  --selectors namespace=karpenter
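Profile creation takes a minute or two; before continuing, you can confirm it has reached the ACTIVE state (the --query path below matches the standard AWS CLI output for this call):
aws eks describe-fargate-profile --cluster-name my-cluster \
  --fargate-profile-name karpenter-profile \
  --query fargateProfile.status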
Create a Custom values.yaml File
Create a file named karpenter-values.yaml with the following content:
serviceAccount:
  create: true
  name: karpenter
controller:
  clusterName: my-cluster
  aws:
    defaultInstanceProfile: KarpenterInstanceProfile
    interruptionQueue: karpenter-interruption-queue
nodeSelector:
  eks.amazonaws.com/fargate-profile: karpenter-profile
settings:
  consolidation:
    enabled: true
  ttlSecondsAfterEmpty: 300
Replace my-cluster with your EKS cluster name and karpenter-profile with your Fargate profile name.
Install Karpenter Using Helm with the Custom values.yaml
Add the Karpenter Helm repository:
helm repo add karpenter https://charts.karpenter.sh
helm repo update
Install Karpenter using the custom values.yaml file:
helm install karpenter karpenter/karpenter --namespace karpenter --create-namespace -f karpenter-values.yaml
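Once the release is installed, verify that the controller pods are running and were scheduled onto Fargate; the NODE column should show fargate-* nodes rather than your worker nodes (assuming the Fargate profile from the previous step matched the karpenter namespace):
kubectl get pods -n karpenter -o wide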
Customizing Bin Packing Behavior
There are many values that can be set to control how Karpenter works (see the chart’s default values for all of them); here I just configured the ones that enable consolidation and define how long to wait before deleting an idle node. Customize them per your needs.
- settings.consolidation.enabled: Default is false. When set to true, Karpenter bin packs workloads by terminating underutilized nodes and rescheduling pods.
- settings.ttlSecondsAfterEmpty: Default is 30. This defines how long an empty node should remain before being deprovisioned.
- settings.limits.resources.cpu: No default limit. This should be set based on your cluster’s resource constraints.
Once you have it installed, Karpenter will start doing its thing: watching Kubernetes events and taking action as needed. See the example Provisioner below.
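In my experience, Karpenter only launches nodes once at least one Provisioner exists (NodePool in newer releases), and depending on the version you run, consolidation and the empty-node TTL are configured on the Provisioner itself rather than through Helm values. Below is a minimal sketch using the pre-v1 Provisioner API; the name default, the CPU limit, and the karpenter.sh/discovery: my-cluster tags on subnets and security groups are assumptions you should adapt to your setup.
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: default
spec:
  requirements:
    # allow both Spot and On-Demand capacity
    - key: karpenter.sh/capacity-type
      operator: In
      values: ["spot", "on-demand"]
  limits:
    resources:
      cpu: "100"        # cap on total CPU this provisioner may launch
  consolidation:
    enabled: true       # bin pack by replacing underutilized nodes
  providerRef:
    name: default
---
apiVersion: karpenter.k8s.aws/v1alpha1
kind: AWSNodeTemplate
metadata:
  name: default
spec:
  subnetSelector:
    karpenter.sh/discovery: my-cluster        # assumes subnets carry this tag
  securityGroupSelector:
    karpenter.sh/discovery: my-cluster        # assumes security groups carry this tag
Apply it with kubectl apply -f provisioner.yaml; once it exists, the kubectl get provisioners command used in the troubleshooting section below will show it.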
Troubleshooting Karpenter
To diagnose scaling issues, check the following. Karpenter controller logs:
kubectl logs -n karpenter -l app.kubernetes.io/name=karpenter -f
Inspect the provisioner status (newer Karpenter versions replace provisioners with NodePools, so use kubectl get nodepools there instead):
kubectl get provisioners -o wide
Check pending pods:
kubectl get pods --field-selector=status.phase=Pending -A
Inspect event logs for scheduling failures:
kubectl describe pod <POD_NAME>
Check for insufficient capacity errors:
kubectl get events --sort-by=.metadata.creationTimestamp | grep -i karpenter
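It also helps to confirm which nodes Karpenter itself launched. On the Provisioner-based versions used in this article, those nodes carry the karpenter.sh/provisioner-name label (newer releases use karpenter.sh/nodepool), and the -L flag prints the capacity type and instance type Karpenter picked:
kubectl get nodes -l karpenter.sh/provisioner-name \
  -L karpenter.sh/capacity-type,node.kubernetes.io/instance-type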
That’s it. You have successfully installed and configured Karpenter in your Amazon EKS cluster. With Karpenter, your cluster will scale nodes dynamically based on workload demand, improving cost efficiency and performance.