# Karpenter Deep Dive: Node Provisioning That Actually Works

Cluster Autoscaler is slow: it scales by resizing pre-defined node groups, which typically takes minutes. Karpenter provisions right-sized nodes in about a minute, picks instance types to fit the pending pods, and consolidates aggressively.
## TL;DR

- Karpenter = fast, flexible node provisioning
- Provisions in ~60 seconds (vs 3-5 min for CA)
- Automatic instance type selection
- Built-in consolidation and spot handling
- Works with EKS, coming to other clouds
## Install Karpenter

```bash
# Set variables
export KARPENTER_VERSION=v0.33.0
export CLUSTER_NAME=production
export AWS_REGION=eu-west-2
export AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)

# Create IAM resources
aws cloudformation deploy \
  --stack-name "Karpenter-${CLUSTER_NAME}" \
  --template-file karpenter-cloudformation.yaml \
  --capabilities CAPABILITY_NAMED_IAM \
  --parameter-overrides "ClusterName=${CLUSTER_NAME}"

# Install Karpenter
helm upgrade --install karpenter oci://public.ecr.aws/karpenter/karpenter \
  --version "${KARPENTER_VERSION}" \
  --namespace karpenter --create-namespace \
  --set "settings.clusterName=${CLUSTER_NAME}" \
  --set "settings.clusterEndpoint=$(aws eks describe-cluster --name ${CLUSTER_NAME} --query "cluster.endpoint" --output text)" \
  --set serviceAccount.annotations."eks\.amazonaws\.com/role-arn"=arn:aws:iam::${AWS_ACCOUNT_ID}:role/KarpenterControllerRole-${CLUSTER_NAME}
```
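Assuming the chart installed cleanly, a quick sanity check against the cluster: the controller pods should be Running and the v1beta1 CRDs present.

```bash
# Controller pods should be Running
kubectl get pods -n karpenter

# CRDs installed by the chart
kubectl get crd nodepools.karpenter.sh nodeclaims.karpenter.sh ec2nodeclasses.karpenter.k8s.aws
```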
## NodePool Configuration

```yaml
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      requirements:
        # Instance categories
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ["c", "m", "r"]
        # Instance sizes
        - key: karpenter.k8s.aws/instance-size
          operator: In
          values: ["medium", "large", "xlarge", "2xlarge"]
        # Architectures
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64", "arm64"]
        # Capacity types (spot + on-demand)
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
        # Availability zones
        - key: topology.kubernetes.io/zone
          operator: In
          values: ["eu-west-2a", "eu-west-2b", "eu-west-2c"]
      nodeClassRef:
        name: default
  # Limits across all nodes launched by this pool
  limits:
    cpu: 1000
    memory: 2000Gi
  # Disruption settings
  disruption:
    consolidationPolicy: WhenUnderutilized
    # Note: in v1beta1, consolidateAfter can only be set with
    # consolidationPolicy: WhenEmpty
    budgets:
      - nodes: "10%"
```
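To see the pool in action, scale up a placeholder workload and watch Karpenter react. The `inflate` Deployment below is just the common demo pattern (a pause container with a CPU request), not part of the pool itself:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inflate
spec:
  replicas: 5
  selector:
    matchLabels:
      app: inflate
  template:
    metadata:
      labels:
        app: inflate
    spec:
      containers:
        - name: pause
          image: public.ecr.aws/eks-distro/kubernetes/pause:3.7
          resources:
            requests:
              cpu: "1"  # forces new capacity once the cluster is full
```

Once the pods go Pending, `kubectl get nodeclaims -w` should show a new claim within seconds.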
## EC2NodeClass

```yaml
apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  name: default
spec:
  # AMI selection
  amiFamily: AL2
  # Or pin a specific AMI:
  # amiSelectorTerms:
  #   - id: ami-0123456789abcdef0
  # Subnets discovered by tag
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: ${CLUSTER_NAME}
  # Security groups discovered by tag
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: ${CLUSTER_NAME}
  # Instance profile
  instanceProfile: KarpenterNodeInstanceProfile-${CLUSTER_NAME}
  # Block device mappings
  blockDeviceMappings:
    - deviceName: /dev/xvda
      ebs:
        volumeSize: 100Gi
        volumeType: gp3
        iops: 3000
        throughput: 125
        encrypted: true
  # User data. With amiFamily: AL2, Karpenter generates the
  # /etc/eks/bootstrap.sh call itself and merges your script in ahead
  # of it, so don't invoke bootstrap.sh here.
  userData: |
    #!/bin/bash
    echo "vm.max_map_count=262144" >> /etc/sysctl.conf
    sysctl -p
  # Tags for instances
  tags:
    Environment: production
    ManagedBy: karpenter
  # IMDS metadata options
  metadataOptions:
    httpEndpoint: enabled
    httpProtocolIPv6: disabled
    httpPutResponseHopLimit: 2
    httpTokens: required
```
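Subnet and security-group discovery silently matches nothing if the `karpenter.sh/discovery` tag is missing. A quick check, assuming the AWS CLI and the variables from the install step:

```bash
# Should list the private subnets Karpenter can launch into
aws ec2 describe-subnets \
  --filters "Name=tag:karpenter.sh/discovery,Values=${CLUSTER_NAME}" \
  --query "Subnets[].SubnetId" --output text

# Should list the node security group(s)
aws ec2 describe-security-groups \
  --filters "Name=tag:karpenter.sh/discovery,Values=${CLUSTER_NAME}" \
  --query "SecurityGroups[].GroupId" --output text
```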
## Workload-Specific NodePools

```yaml
# GPU workloads
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: gpu
spec:
  template:
    metadata:
      labels:
        node-type: gpu
    spec:
      requirements:
        - key: karpenter.k8s.aws/instance-family
          operator: In
          values: ["g4dn", "g5"]
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand"]  # GPUs usually on-demand
      taints:
        - key: nvidia.com/gpu
          value: "true"
          effect: NoSchedule
      nodeClassRef:
        name: gpu
---
# Spot-only for batch
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: batch
spec:
  template:
    metadata:
      labels:
        node-type: batch
    spec:
      requirements:
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ["c", "m", "r"]
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot"]
      taints:
        - key: workload-type
          value: batch
          effect: NoSchedule
      nodeClassRef:
        name: default
  disruption:
    consolidationPolicy: WhenEmpty
    consolidateAfter: 0s  # Aggressive consolidation for batch
```
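Taints only keep workloads out; pods that should land on the batch pool need a matching toleration, plus a node selector on the pool's label to pin them there. A sketch (the Job name and image are placeholders):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: nightly-report
spec:
  template:
    spec:
      restartPolicy: Never
      # Pin to the batch pool via its template label
      nodeSelector:
        node-type: batch
      # Tolerate the batch taint
      tolerations:
        - key: workload-type
          operator: Equal
          value: batch
          effect: NoSchedule
      containers:
        - name: report
          image: report-runner:latest  # placeholder image
          resources:
            requests:
              cpu: "2"
              memory: "4Gi"
```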
## Pod Scheduling

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server
spec:
  selector:
    matchLabels:
      app: api-server
  template:
    metadata:
      labels:
        app: api-server
    spec:
      # Spread across zones
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              app: api-server
      # Prefer arm64 (cheaper)
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              preference:
                matchExpressions:
                  - key: kubernetes.io/arch
                    operator: In
                    values: ["arm64"]
      # Resource requests drive instance selection
      containers:
        - name: api
          image: api-server:latest  # placeholder image
          resources:
            requests:
              cpu: "500m"
              memory: "512Mi"
            limits:
              memory: "1Gi"
```
## Consolidation

Karpenter consolidates nodes automatically: when a node's pods would fit on other (or cheaper) nodes, it drains and removes it.

```yaml
disruption:
  # Consolidate underutilized nodes
  consolidationPolicy: WhenUnderutilized
  # Or only reclaim empty nodes after a delay
  # (in v1beta1, consolidateAfter is only valid with WhenEmpty):
  # consolidationPolicy: WhenEmpty
  # consolidateAfter: 30s
  # Budgets limit how many nodes can be disrupted at once
  budgets:
    - nodes: "10%"
    # No voluntary disruption during business hours; each cron
    # trigger holds for the stated duration
    - nodes: "0"
      schedule: "0 9-17 * * 1-5"
      duration: 1h
```
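Individual pods can also opt out of voluntary disruption with the `karpenter.sh/do-not-disrupt` annotation; Karpenter will not consolidate a node while such a pod runs on it. For example, for a one-off migration (placeholder name and image):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: migration-runner
  annotations:
    # Blocks consolidation of the node hosting this pod
    karpenter.sh/do-not-disrupt: "true"
spec:
  restartPolicy: Never
  containers:
    - name: migrate
      image: db-migrate:latest  # placeholder image
```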
## Cost Optimization

```yaml
# Default pool: prioritize spot and arm64 (fragment of the NodePool spec)
spec:
  # NodePool weight: higher-weight pools are considered first
  weight: 100
  template:
    spec:
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
        - key: kubernetes.io/arch
          operator: In
          values: ["arm64", "amd64"]
---
# Separate on-demand pool for critical workloads
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: critical
spec:
  # Lower weight; the taint means only tolerating pods land here
  weight: 10
  template:
    spec:
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand"]
      taints:
        - key: critical
          value: "true"
          effect: NoSchedule
```
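Workloads meant for the critical pool must tolerate its taint, and can require on-demand capacity via the `karpenter.sh/capacity-type` node label for a hard guarantee. A sketch (Deployment name and image are placeholders):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: payments
spec:
  replicas: 2
  selector:
    matchLabels:
      app: payments
  template:
    metadata:
      labels:
        app: payments
    spec:
      # Allow scheduling onto the tainted critical pool
      tolerations:
        - key: critical
          operator: Equal
          value: "true"
          effect: NoSchedule
      # Hard requirement: never run on spot
      nodeSelector:
        karpenter.sh/capacity-type: on-demand
      containers:
        - name: payments
          image: payments-api:latest  # placeholder image
```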
## Monitoring

```yaml
# Prometheus rules
groups:
  - name: karpenter
    rules:
      - alert: KarpenterProvisioningFailed
        expr: increase(karpenter_provisioner_scheduling_duration_seconds_count{result="error"}[5m]) > 0
        labels:
          severity: warning
        annotations:
          summary: Karpenter provisioning failures
      - alert: KarpenterNodeNotReady
        expr: karpenter_nodes_created_total - karpenter_nodes_terminated_total - count(kube_node_status_condition{condition="Ready",status="true"}) > 0
        for: 5m
        labels:
          severity: warning
```

```bash
# Check Karpenter decisions
kubectl logs -n karpenter -l app.kubernetes.io/name=karpenter -f

# Node provisioning
kubectl get nodeclaims

# Current nodes, with capacity type and instance type
kubectl get nodes -L karpenter.sh/capacity-type,node.kubernetes.io/instance-type
```
## References
- Karpenter Docs: https://karpenter.sh
- Best Practices: https://karpenter.sh/docs/concepts/best-practices
- Instance Types: https://aws.amazon.com/ec2/instance-types