Missing Worker Nodes
Background
Corporation XYZ is launching a new e-commerce platform in the us-west-2 region using an EKS cluster running Kubernetes version 1.30. During a security review, several gaps were identified in the cluster's security posture, particularly around node group volume encryption and AMI customization.
The security team provided specific requirements including:
- Enabling encryption for node group volumes
- Setting up best practice network configurations
- Ensuring EKS Optimized AMIs are used
- Enabling Kubernetes auditing
Sam, an engineer with Kubernetes experience but new to EKS, created a new managed node group named new_nodegroup_1 to implement these requirements. However, no new nodes are joining the cluster despite the node group creation appearing successful. Initial checks of the EKS cluster status, node group configuration, and Kubernetes events haven't revealed any obvious issues.
Step 1: Verify Node Status
Let's first verify Sam's observation about the missing nodes:
No resources found
This confirms Sam's observation: no nodes from the new nodegroup (new_nodegroup_1) are present.
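The command behind this check is not shown here. A minimal sketch of one way to run it, assuming the `eks.amazonaws.com/nodegroup` label that EKS applies to managed node group nodes and a `kubectl` context already pointing at the cluster (the helper name `list_nodegroup_nodes` is hypothetical):

```shell
# Hypothetical helper: list the nodes that belong to one managed node group.
# Assumes kubectl is already configured for the target cluster.
list_nodegroup_nodes() {
  local nodegroup="$1"
  # Managed node group nodes carry the eks.amazonaws.com/nodegroup label.
  kubectl get nodes --selector="eks.amazonaws.com/nodegroup=${nodegroup}"
}
```

Calling `list_nodegroup_nodes new_nodegroup_1` should report that no resources were found for as long as the problem persists.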
Step 2: Check Managed Node Group Status
Since Managed Node Groups are responsible for creating nodes, let's examine the nodegroup details. Key aspects to check:
- Node group existence
- Status and health
- Desired size
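These three checks can be sketched with a single AWS CLI call. The helper name `describe_nodegroup_summary` is hypothetical, and the example assumes the AWS CLI is configured for the us-west-2 account that hosts the cluster:

```shell
# Hypothetical helper: summarize a managed node group's existence, status, and desired size.
# Arguments: cluster name, node group name.
describe_nodegroup_summary() {
  local cluster="$1" nodegroup="$2"
  # The call fails with ResourceNotFoundException if the node group does not exist.
  aws eks describe-nodegroup \
    --cluster-name "$cluster" \
    --nodegroup-name "$nodegroup" \
    --query 'nodegroup.{Name:nodegroupName,Status:status,Desired:scalingConfig.desiredSize}' \
    --output table
}
```

For this scenario the call would be `describe_nodegroup_summary eks-workshop new_nodegroup_1`, assuming the cluster is named `eks-workshop`.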
You can also view this information in the EKS console.

Step 3: Analyze Node Group Health Status
The nodegroup should eventually transition to a DEGRADED state. Let's examine the detailed status:
If the workshop environment was deployed less than 10 minutes ago, the nodegroup may still be in the ACTIVE state. In that case, use the output below for reference; the nodegroup should transition to DEGRADED within 10 minutes of deployment. You can also proceed to Step 4 to check the Auto Scaling group directly.
{
  "nodegroup": {
    "nodegroupName": "new_nodegroup_1", <<<---
    "nodegroupArn": "arn:aws:eks:us-west-2:1234567890:nodegroup/eks-workshop/new_nodegroup_1/abcd1234-1234-abcd-1234-1234abcd1234",
    "clusterName": "eks-workshop",
    ...
    "status": "DEGRADED", <<<---
    "capacityType": "ON_DEMAND",
    "scalingConfig": {
      "minSize": 0,
      "maxSize": 1,
      "desiredSize": 1 <<<---
    },
    ...
    "resources": {
      "autoScalingGroups": [
        {
          "name": "eks-new_nodegroup_1-abcd1234-1234-abcd-1234-1234abcd1234"
        }
      ]
    },
    "health": { <<<---
      "issues": [
        {
          "code": "AsgInstanceLaunchFailures",
          "message": "Instance became unhealthy while waiting for instance to be in InService state. Termination Reason: Client.InvalidKMSKey.InvalidState: The KMS key provided is in an incorrect state",
          "resourceIds": [
            "eks-new_nodegroup_1-abcd1234-1234-abcd-1234-1234abcd1234"
          ]
        }
      ]
    }
    ...
  }
}
The health status reveals a KMS key issue preventing instance launches. This aligns with Sam's attempt to implement volume encryption.
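To pull out only the health issues rather than reading the full response, a JMESPath `--query` filter can help. This is a minimal sketch; the helper name `nodegroup_health_issues` is hypothetical:

```shell
# Hypothetical helper: print one line per health issue (code, then message).
# Arguments: cluster name, node group name.
nodegroup_health_issues() {
  local cluster="$1" nodegroup="$2"
  aws eks describe-nodegroup \
    --cluster-name "$cluster" \
    --nodegroup-name "$nodegroup" \
    --query 'nodegroup.health.issues[].[code,message]' \
    --output text
}
```

An empty result would suggest that no issues have been recorded yet, which can happen while the nodegroup is still ACTIVE.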
Step 4: Investigate Auto Scaling Group Activities
Let's examine the ASG activities to understand the launch failures:
4.1. Identify Nodegroup's Auto Scaling Group Name
Run the command below to capture the nodegroup's Auto Scaling group name as NEW_NODEGROUP_1_ASG_NAME.
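The original command is not reproduced here. One way to capture the name, sketched under the assumption that the nodegroup has a single Auto Scaling group (the helper name `nodegroup_asg_name` is hypothetical):

```shell
# Hypothetical helper: print the name of the first Auto Scaling group backing a node group.
# Arguments: cluster name, node group name.
nodegroup_asg_name() {
  local cluster="$1" nodegroup="$2"
  aws eks describe-nodegroup \
    --cluster-name "$cluster" \
    --nodegroup-name "$nodegroup" \
    --query 'nodegroup.resources.autoScalingGroups[0].name' \
    --output text
}
```

The variable can then be set with `NEW_NODEGROUP_1_ASG_NAME=$(nodegroup_asg_name eks-workshop new_nodegroup_1)`.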
4.2. Check the Auto Scaling Activities
{
  "Activities": [
    {
      "ActivityId": "1234abcd-1234-abcd-1234-1234abcd1234",
      "AutoScalingGroupName": "eks-new_nodegroup_1-abcd1234-1234-abcd-1234-1234abcd1234",
      "Description": "Launching a new EC2 instance: i-1234abcd1234abcd1. Status Reason: Instance became unhealthy while waiting for instance to be in InService state. Termination Reason: Client.InvalidKMSKey.InvalidState: The KMS key provided is in an incorrect state",
      "Cause": "At 2024-10-04T18:06:36Z an instance was started in response to a difference between desired and actual capacity, increasing the capacity from 0 to 1.",
      ...
      "StatusCode": "Cancelled",
      --->>> "StatusMessage": "Instance became unhealthy while waiting for instance to be in InService state. Termination Reason: Client.InvalidKMSKey.InvalidState: The KMS key provided is in an incorrect state"
    },
    ...
  ]
}
You can also view this information in the EKS console. Click the Auto Scaling group name under the Details tab to view the Auto Scaling activities.
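From the command line, the same activities can be listed with the AWS CLI. A minimal sketch, assuming the Auto Scaling group name is already known; the helper name `asg_recent_activities` and the default of five items are illustrative choices:

```shell
# Hypothetical helper: show recent scaling activities for an Auto Scaling group.
# Arguments: Auto Scaling group name, optional maximum number of activities (default 5).
asg_recent_activities() {
  local asg_name="$1" max="${2:-5}"
  aws autoscaling describe-scaling-activities \
    --auto-scaling-group-name "$asg_name" \
    --max-items "$max" \
    --query 'Activities[].{Start:StartTime,Status:StatusCode,Message:StatusMessage}' \
    --output json
}
```

With the name captured in step 4.1, the call would be `asg_recent_activities "$NEW_NODEGROUP_1_ASG_NAME"`.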

Step 5: Examine Launch Template Configuration
Let's check the Launch Template for encryption settings:
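The check can be sketched as follows, assuming the nodegroup was created with a custom launch template, so that `launchTemplate` appears in the `describe-nodegroup` response. The helper name `nodegroup_ebs_encryption` is hypothetical:

```shell
# Hypothetical helper: print EBS encryption settings from a node group's launch template.
# Assumes the node group uses a custom launch template.
# Arguments: cluster name, node group name.
nodegroup_ebs_encryption() {
  local cluster="$1" nodegroup="$2" lt lt_id lt_version
  # Fetch the launch template ID and version as tab-separated text.
  lt="$(aws eks describe-nodegroup \
    --cluster-name "$cluster" \
    --nodegroup-name "$nodegroup" \
    --query 'nodegroup.launchTemplate.[id,version]' \
    --output text)"
  lt_id="${lt%%[[:space:]]*}"       # text before the first whitespace
  lt_version="${lt##*[[:space:]]}"  # text after the last whitespace
  # Show the Encrypted flag and KMS key for each EBS block device mapping.
  aws ec2 describe-launch-template-versions \
    --launch-template-id "$lt_id" \
    --versions "$lt_version" \
    --query 'LaunchTemplateVersions[].LaunchTemplateData.BlockDeviceMappings[].Ebs.{Encrypted:Encrypted,KmsKeyId:KmsKeyId}' \
    --output json
}
```

Running `nodegroup_ebs_encryption eks-workshop new_nodegroup_1` should surface the KMS key ID referenced by the volumes, which is the key to investigate given the `InvalidKMSKey.InvalidState` error.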