Karpenter Setup

In this section we will configure Karpenter to allow the creation of Inferentia and Trainium EC2 instances. Karpenter detects pending Pods that require an inf2 or trn1 instance and launches the required instance type so the Pod can be scheduled.
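For example, a Pod that requests a Neuron device (exposed by the AWS Neuron device plugin as the `aws.amazon.com/neuron` resource) stays pending until a node with that device exists. The sketch below uses a hypothetical placeholder image; the `instanceType: neuron` node selector matches a label applied by the NodePool we create later in this section:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: inference-example
spec:
  nodeSelector:
    instanceType: neuron # label applied by the NodePool below
  containers:
    - name: app
      image: my-inference-image:latest # hypothetical placeholder image
      resources:
        limits:
          aws.amazon.com/neuron: 1 # requires the Neuron device plugin
```

When this Pod cannot be scheduled on any existing node, Karpenter evaluates its requirements and provisions a matching instance.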

tip

You can learn more about Karpenter in the Karpenter module that's provided in this workshop.

Karpenter has been installed in our EKS cluster, and runs as a deployment:

~$ kubectl get deployment -n kube-system
NAME        READY   UP-TO-DATE   AVAILABLE   AGE
...
karpenter   2/2     2            2           11m

Karpenter requires a NodePool to provision nodes. This is the Karpenter NodePool that we will create:

~/environment/eks-workshop/modules/aiml/inferentia/nodepool/nodepool.yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: aiml
spec:
  template:
    metadata:
      labels:
        instanceType: "neuron"
        provisionerType: "karpenter"
    spec:
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values:
            - on-demand
        - key: karpenter.k8s.aws/instance-family
          operator: In
          values:
            - inf2
            - trn1
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: aiml
---
apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
  name: aiml
spec:
  amiFamily: AL2
  amiSelectorTerms:
    - alias: al2@latest
  blockDeviceMappings:
    - deviceName: /dev/xvda
      ebs:
        deleteOnTermination: true
        volumeSize: 100Gi
        volumeType: gp3
  role: ${KARPENTER_NODE_ROLE}
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: ${EKS_CLUSTER_NAME}
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: ${EKS_CLUSTER_NAME}
  tags:
    app.kubernetes.io/created-by: eks-workshop
The requirements section specifies which instances this NodePool is allowed to provision for us. You can see that we've configured it to only allow the creation of on-demand inf2 and trn1 instances.

Apply the NodePool and EC2NodeClass manifests:

~$ kubectl kustomize ~/environment/eks-workshop/modules/aiml/inferentia/nodepool \
| envsubst | kubectl apply -f-

The NodePool is now ready to provision nodes for our training and inference Pods.
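As a sketch of what comes next (the image and Job name are hypothetical placeholders), a training workload could target these nodes with the labels the NodePool applies and a Neuron device request:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: training-example
spec:
  template:
    spec:
      restartPolicy: Never
      nodeSelector:
        provisionerType: karpenter # label applied by the aiml NodePool
      containers:
        - name: train
          image: my-training-image:latest # hypothetical placeholder image
          resources:
            limits:
              aws.amazon.com/neuron: 1 # Neuron device, via the Neuron device plugin
```

Because the NodePool restricts the instance family to inf2 and trn1, any node Karpenter launches to satisfy this Job will carry a Neuron accelerator.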