Pod Affinity and Anti-Affinity
Pods can be constrained to run on specific nodes or under specific circumstances. For example, you may want only one application pod running per node, or want certain pods to be paired together on a node. Additionally, when using node affinity, pods can have preferred or mandatory restrictions.
For this lesson, we'll focus on inter-pod affinity and anti-affinity. We'll schedule the checkout-redis pods so that only one instance runs per node, and schedule the checkout pods so that only one instance runs on each node where a checkout-redis pod exists. This ensures that each checkout pod has a caching pod (checkout-redis) running locally on the same node for best performance.
First, let's confirm that the checkout and checkout-redis pods are running:
NAME READY STATUS RESTARTS AGE
checkout-698856df4d-vzkzw 1/1 Running 0 125m
checkout-redis-6cfd7d8787-kxs8r 1/1 Running 0 127m
We can see both applications have one pod running in the cluster. Now, let's find out where they are running:
checkout-698856df4d-vzkzw ip-10-42-11-142.us-west-2.compute.internal
checkout-redis-6cfd7d8787-kxs8r ip-10-42-10-225.us-west-2.compute.internal
Based on the results above, the checkout-698856df4d-vzkzw pod is running on the ip-10-42-11-142.us-west-2.compute.internal node and the checkout-redis-6cfd7d8787-kxs8r pod is running on the ip-10-42-10-225.us-west-2.compute.internal node.
In your environment, the pods may initially be running on the same node.
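To make the pod-to-node mapping easier to read as the lesson progresses, the two-column listing above can be grouped by node. The following is a minimal Python sketch, not part of the workshop tooling; the function name pods_by_node and the "<pending>" placeholder are assumptions made for illustration.

```python
from collections import defaultdict

# Two-column "pod node" listing, copied from the output above.
listing = """\
checkout-698856df4d-vzkzw ip-10-42-11-142.us-west-2.compute.internal
checkout-redis-6cfd7d8787-kxs8r ip-10-42-10-225.us-west-2.compute.internal
"""


def pods_by_node(text):
    """Group pod names by node; a pod with no node column is filed under '<pending>'."""
    nodes = defaultdict(list)
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        pod = parts[0]
        node = parts[1] if len(parts) > 1 else "<pending>"
        nodes[node].append(pod)
    return dict(nodes)


placement = pods_by_node(listing)
for node, pods in placement.items():
    print(node, pods)
```

A pod that has not been scheduled yet shows an empty node column, which is why the sketch files it under a placeholder key instead of failing.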
Let's set up podAffinity and podAntiAffinity policies in the checkout deployment to ensure that only one checkout pod runs per node, and that it only runs on nodes where a checkout-redis pod is already running. We'll use requiredDuringSchedulingIgnoredDuringExecution to make this a hard requirement rather than a preferred behavior.
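The difference between a required rule and a preferred rule can be illustrated with a toy model: required rules filter out nodes, while preferred rules only rank the nodes that remain. This is a simplified sketch of that idea, not the real scheduler; the schedule function, the node dictionaries, and the node names are all hypothetical.

```python
def schedule(nodes, required, preferred):
    """Pick a node: required rules filter candidates, preferred rules only rank them."""
    candidates = [n for n in nodes if all(rule(n) for rule in required)]
    if not candidates:
        return None  # no node passes the required rules, so the pod stays Pending

    def score(node):
        # Sum the weights of every preferred rule this node satisfies.
        return sum(weight for weight, rule in preferred if rule(node))

    return max(candidates, key=score)


def has_redis(node):
    return "redis" in node["components"]


# Hypothetical cluster with a single node that runs no redis pod.
without_redis = [{"name": "node-a", "components": set()}]

hard = schedule(without_redis, required=[has_redis], preferred=[])
soft = schedule(without_redis, required=[], preferred=[(100, has_redis)])

print(hard)          # None
print(soft["name"])  # node-a
```

With the rule marked as required, the pod is left unscheduled when no node qualifies. With the same rule marked as preferred, the pod still lands somewhere, which is not what we want for this lesson.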
The following kustomization adds an affinity section to the checkout deployment specifying both podAffinity and podAntiAffinity policies. The Kustomize patch is shown first, followed by the resulting Deployment/checkout manifest and the diff:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: checkout
  namespace: checkout
spec:
  template:
    spec:
      affinity:
        podAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: app.kubernetes.io/component
                    operator: In
                    values:
                      - redis
              topologyKey: kubernetes.io/hostname
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: app.kubernetes.io/component
                    operator: In
                    values:
                      - service
                  - key: app.kubernetes.io/instance
                    operator: In
                    values:
                      - checkout
              topologyKey: kubernetes.io/hostname
Deployment/checkout:
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app.kubernetes.io/created-by: eks-workshop
    app.kubernetes.io/type: app
  name: checkout
  namespace: checkout
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/component: service
      app.kubernetes.io/instance: checkout
      app.kubernetes.io/name: checkout
  template:
    metadata:
      annotations:
        prometheus.io/path: /metrics
        prometheus.io/port: "8080"
        prometheus.io/scrape: "true"
      labels:
        app.kubernetes.io/component: service
        app.kubernetes.io/created-by: eks-workshop
        app.kubernetes.io/instance: checkout
        app.kubernetes.io/name: checkout
    spec:
      affinity:
        podAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: app.kubernetes.io/component
                    operator: In
                    values:
                      - redis
              topologyKey: kubernetes.io/hostname
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: app.kubernetes.io/component
                    operator: In
                    values:
                      - service
                  - key: app.kubernetes.io/instance
                    operator: In
                    values:
                      - checkout
              topologyKey: kubernetes.io/hostname
      containers:
        - envFrom:
            - configMapRef:
                name: checkout
          image: public.ecr.aws/aws-containers/retail-store-sample-checkout:1.2.1
          imagePullPolicy: IfNotPresent
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 3
          name: checkout
          ports:
            - containerPort: 8080
              name: http
              protocol: TCP
          resources:
            limits:
              memory: 512Mi
            requests:
              cpu: 250m
              memory: 512Mi
          securityContext:
            capabilities:
              drop:
                - ALL
            readOnlyRootFilesystem: true
          volumeMounts:
            - mountPath: /tmp
              name: tmp-volume
      securityContext:
        fsGroup: 1000
      serviceAccountName: checkout
      volumes:
        - emptyDir:
            medium: Memory
          name: tmp-volume
Diff:
         app.kubernetes.io/created-by: eks-workshop
         app.kubernetes.io/instance: checkout
         app.kubernetes.io/name: checkout
     spec:
+      affinity:
+        podAffinity:
+          requiredDuringSchedulingIgnoredDuringExecution:
+            - labelSelector:
+                matchExpressions:
+                  - key: app.kubernetes.io/component
+                    operator: In
+                    values:
+                      - redis
+              topologyKey: kubernetes.io/hostname
+        podAntiAffinity:
+          requiredDuringSchedulingIgnoredDuringExecution:
+            - labelSelector:
+                matchExpressions:
+                  - key: app.kubernetes.io/component
+                    operator: In
+                    values:
+                      - service
+                  - key: app.kubernetes.io/instance
+                    operator: In
+                    values:
+                      - checkout
+              topologyKey: kubernetes.io/hostname
       containers:
         - envFrom:
             - configMapRef:
                 name: checkout
In the above manifest, the podAffinity section ensures:
- Checkout pods will only be scheduled on nodes where Redis pods are running.
- This is enforced by matching pods with label app.kubernetes.io/component: redis.
- The topologyKey: kubernetes.io/hostname ensures this rule applies at the node level.
The podAntiAffinity section ensures:
- Only one checkout pod runs per node.
- This is achieved by preventing pods with labels app.kubernetes.io/component: service and app.kubernetes.io/instance: checkout from running on the same node.
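The two rules described above can be modeled as a small predicate over the labels of the pods already on a node. The sketch below is a simplified Python model under stated assumptions: it only handles the In operator, it treats every pod as being in the same namespace, and the names matches and node_is_feasible are made up for illustration.

```python
def matches(labels, match_expressions):
    """True if the labels satisfy every expression; expressions are ANDed together."""
    for expr in match_expressions:
        if expr["operator"] != "In":
            raise NotImplementedError("only the In operator is modeled here")
        if labels.get(expr["key"]) not in expr["values"]:
            return False
    return True


# Selectors copied from the manifest above.
AFFINITY = [
    {"key": "app.kubernetes.io/component", "operator": "In", "values": ["redis"]},
]
ANTI_AFFINITY = [
    {"key": "app.kubernetes.io/component", "operator": "In", "values": ["service"]},
    {"key": "app.kubernetes.io/instance", "operator": "In", "values": ["checkout"]},
]


def node_is_feasible(pods_on_node):
    """A checkout pod may land here only if a redis pod is present and no checkout pod is."""
    has_redis = any(matches(p, AFFINITY) for p in pods_on_node)
    has_checkout = any(matches(p, ANTI_AFFINITY) for p in pods_on_node)
    return has_redis and not has_checkout


REDIS = {"app.kubernetes.io/component": "redis", "app.kubernetes.io/instance": "checkout"}
CHECKOUT = {"app.kubernetes.io/component": "service", "app.kubernetes.io/instance": "checkout"}

print(node_is_feasible([REDIS]))            # True
print(node_is_feasible([REDIS, CHECKOUT]))  # False
print(node_is_feasible([]))                 # False
```

Note that the checkout-redis pod also carries app.kubernetes.io/instance: checkout, but it does not trigger the anti-affinity rule because its component label is redis and both expressions must match.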
To make the change, run the following command to modify the checkout deployment in your cluster:
namespace/checkout unchanged
serviceaccount/checkout unchanged
configmap/checkout unchanged
service/checkout unchanged
service/checkout-redis unchanged
deployment.apps/checkout configured
deployment.apps/checkout-redis unchanged
The podAffinity section requires that a checkout-redis pod is already running on the node, because we assume the checkout pod needs checkout-redis to run correctly. The podAntiAffinity section requires that no checkout pod is already running on the node, by matching the app.kubernetes.io/component=service and app.kubernetes.io/instance=checkout labels.
Now, let's scale up the deployment to check that the configuration is working:
Now validate where each pod is running:
checkout-6c7c9cdf4f-p5p6q ip-10-42-10-120.us-west-2.compute.internal
checkout-6c7c9cdf4f-wwkm4
checkout-redis-6cfd7d8787-gw59j ip-10-42-10-120.us-west-2.compute.internal
In this example, the first checkout pod runs on the same node as the existing checkout-redis pod, as that node fulfills the podAffinity rule we set. The second pod is still Pending: the podAntiAffinity rule does not allow two checkout pods on the same node, and no other node has a checkout-redis pod running, so no node satisfies both rules.
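The outcome above can be reproduced with a toy simulation of the two required rules. This is only a sketch: the node names node-1 and node-2 are hypothetical, pods are reduced to the strings "redis" and "checkout", and place_checkout is an illustrative helper rather than anything the scheduler exposes.

```python
def place_checkout(nodes):
    """Return the first node that has a redis pod and no checkout pod, else None."""
    for name, pods in nodes.items():
        if "redis" in pods and "checkout" not in pods:
            return name
    return None


# Hypothetical two-node cluster with a single checkout-redis pod.
nodes = {"node-1": ["redis"], "node-2": []}

results = []
for _ in range(2):  # two checkout replicas
    node = place_checkout(nodes)
    if node is not None:
        nodes[node].append("checkout")
    results.append(node)

print(results)  # ['node-1', None]
```

The None entry corresponds to the Pending pod: node-1 is excluded by the anti-affinity rule and node-2 is excluded by the affinity rule.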
Next, we'll scale checkout-redis to two instances, but first let's modify the checkout-redis deployment so that its instances are spread across nodes, one per node. To do this, we only need to add a podAntiAffinity rule. The Kustomize patch is shown first, followed by the resulting Deployment/checkout-redis manifest and the diff:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: checkout-redis
  labels:
    app.kubernetes.io/created-by: eks-workshop
    app.kubernetes.io/team: database
spec:
  template:
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: app.kubernetes.io/component
                    operator: In
                    values:
                      - redis
              topologyKey: kubernetes.io/hostname
Deployment/checkout-redis:
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app.kubernetes.io/created-by: eks-workshop
    app.kubernetes.io/team: database
  name: checkout-redis
  namespace: checkout
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/component: redis
      app.kubernetes.io/instance: checkout
      app.kubernetes.io/name: checkout
  template:
    metadata:
      labels:
        app.kubernetes.io/component: redis
        app.kubernetes.io/created-by: eks-workshop
        app.kubernetes.io/instance: checkout
        app.kubernetes.io/name: checkout
        app.kubernetes.io/team: database
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: app.kubernetes.io/component
                    operator: In
                    values:
                      - redis
              topologyKey: kubernetes.io/hostname
      containers:
        - image: public.ecr.aws/docker/library/redis:6.0-alpine
          imagePullPolicy: IfNotPresent
          name: redis
          ports:
            - containerPort: 6379
              name: redis
              protocol: TCP
Diff:
         app.kubernetes.io/instance: checkout
         app.kubernetes.io/name: checkout
         app.kubernetes.io/team: database
     spec:
+      affinity:
+        podAntiAffinity:
+          requiredDuringSchedulingIgnoredDuringExecution:
+            - labelSelector:
+                matchExpressions:
+                  - key: app.kubernetes.io/component
+                    operator: In
+                    values:
+                      - redis
+              topologyKey: kubernetes.io/hostname
       containers:
         - image: public.ecr.aws/docker/library/redis:6.0-alpine
           imagePullPolicy: IfNotPresent
           name: redis
In the above manifest, the podAntiAffinity section ensures:
- Redis pods are distributed across different nodes.
- This is enforced by preventing multiple pods with label app.kubernetes.io/component: redis from running on the same node.
- The topologyKey: kubernetes.io/hostname ensures this rule applies at the node level.
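To see why the topologyKey determines the granularity of the rule, the sketch below groups nodes by the value of a label and allows at most one redis pod per group. It is a toy model with hypothetical nodes; the zone label topology.kubernetes.io/zone is included only as a contrast and is not used in this lesson.

```python
HOSTNAME = "kubernetes.io/hostname"
ZONE = "topology.kubernetes.io/zone"


def make_nodes():
    """Three hypothetical nodes; node-1 and node-2 share a zone."""
    return [
        {"name": "node-1", "labels": {HOSTNAME: "node-1", ZONE: "zone-a"}, "pods": []},
        {"name": "node-2", "labels": {HOSTNAME: "node-2", ZONE: "zone-a"}, "pods": []},
        {"name": "node-3", "labels": {HOSTNAME: "node-3", ZONE: "zone-b"}, "pods": []},
    ]


def place_redis(nodes, topology_key):
    """Place one redis pod in a topology domain that has no redis pod yet."""
    occupied = {n["labels"][topology_key] for n in nodes if "redis" in n["pods"]}
    for n in nodes:
        if n["labels"][topology_key] not in occupied:
            n["pods"].append("redis")
            return n["name"]
    return None  # every domain already holds a redis pod


by_host = make_nodes()
host_result = [place_redis(by_host, HOSTNAME) for _ in range(2)]

by_zone = make_nodes()
zone_result = [place_redis(by_zone, ZONE) for _ in range(2)]

print(host_result)  # ['node-1', 'node-2']
print(zone_result)  # ['node-1', 'node-3']
```

With kubernetes.io/hostname every node is its own domain, so two replicas land on two different nodes, which is the behavior this lesson relies on.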
Apply it with the following command:
namespace/checkout unchanged
serviceaccount/checkout unchanged
configmap/checkout unchanged
service/checkout unchanged
service/checkout-redis unchanged
deployment.apps/checkout unchanged
deployment.apps/checkout-redis configured
The podAntiAffinity section requires that no checkout-redis pods are already running on the node by matching the app.kubernetes.io/component=redis label.
Check the running pods to verify that there are now two of each running:
NAME READY STATUS RESTARTS AGE
checkout-5b68c8cddf-6ddwn 1/1 Running 0 4m14s
checkout-5b68c8cddf-rd7xf 1/1 Running 0 4m12s
checkout-redis-7979df659-cjfbf 1/1 Running 0 19s
checkout-redis-7979df659-pc6m9 1/1 Running 0 22s
We can also verify where the pods are running to ensure the podAffinity and podAntiAffinity policies are being followed:
checkout-5b68c8cddf-bn8bp ip-10-42-11-142.us-west-2.compute.internal
checkout-5b68c8cddf-clnps ip-10-42-12-31.us-west-2.compute.internal
checkout-redis-7979df659-57xcb ip-10-42-11-142.us-west-2.compute.internal
checkout-redis-7979df659-r7kkm ip-10-42-12-31.us-west-2.compute.internal
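The placement above can also be checked mechanically against the three properties we expect. The following Python sketch uses the pod and node names from the output; classifying pods by name prefix is a shortcut assumed here for brevity, whereas the real rules match on labels.

```python
# Pod-to-node placement copied from the output above.
placement = {
    "checkout-5b68c8cddf-bn8bp": "ip-10-42-11-142.us-west-2.compute.internal",
    "checkout-5b68c8cddf-clnps": "ip-10-42-12-31.us-west-2.compute.internal",
    "checkout-redis-7979df659-57xcb": "ip-10-42-11-142.us-west-2.compute.internal",
    "checkout-redis-7979df659-r7kkm": "ip-10-42-12-31.us-west-2.compute.internal",
}


def is_redis(pod):
    # Assumption: pods are told apart by name prefix instead of labels.
    return pod.startswith("checkout-redis-")


def check(placement):
    """Return which of the three expected scheduling properties hold."""
    checkout_nodes = [node for pod, node in placement.items() if not is_redis(pod)]
    redis_nodes = [node for pod, node in placement.items() if is_redis(pod)]
    return {
        "one_checkout_per_node": len(checkout_nodes) == len(set(checkout_nodes)),
        "one_redis_per_node": len(redis_nodes) == len(set(redis_nodes)),
        "checkout_colocated_with_redis": set(checkout_nodes) <= set(redis_nodes),
    }


report = check(placement)
print(report)
```

All three entries in the report are True for this output, which matches the podAffinity and podAntiAffinity rules we configured.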
The pod scheduling looks good, but we can verify further by scaling the checkout deployment again to see where a third pod will be placed:
If we check the running pods, we can see that the third checkout pod is in a Pending state: two of the nodes already have a checkout pod deployed, and the third node does not have a checkout-redis pod running.
NAME READY STATUS RESTARTS AGE
checkout-5b68c8cddf-bn8bp 1/1 Running 0 4m59s
checkout-5b68c8cddf-clnps 1/1 Running 0 6m9s
checkout-5b68c8cddf-lb69n 0/1 Pending 0 6s
checkout-redis-7979df659-57xcb 1/1 Running 0 35s
checkout-redis-7979df659-r7kkm 1/1 Running 0 2m10s
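The reason each node rejects the third checkout pod can be spelled out with a small model. This is an illustrative sketch only: the first two node names come from the earlier output, the name third-node is a placeholder for the remaining node, and rejection_reason is a hypothetical helper that reduces pods to the strings "redis" and "checkout".

```python
def rejection_reason(pods):
    """Explain why a node cannot take another checkout pod, or return None if it can."""
    if "checkout" in pods:
        return "podAntiAffinity: a checkout pod is already running here"
    if "redis" not in pods:
        return "podAffinity: no checkout-redis pod is running here"
    return None


nodes = {
    "ip-10-42-11-142.us-west-2.compute.internal": ["redis", "checkout"],
    "ip-10-42-12-31.us-west-2.compute.internal": ["redis", "checkout"],
    "third-node": [],  # placeholder name for the node without checkout-redis
}

reasons = {name: rejection_reason(pods) for name, pods in nodes.items()}
pending = all(reason is not None for reason in reasons.values())

for name, reason in reasons.items():
    print(name, "->", reason)
print("third checkout pod stays Pending:", pending)
```

In this model, scheduling a checkout-redis pod on the third node would clear its rejection reason and allow the Pending pod to be placed there.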
Let's finish this section by removing the Pending pod: