
Why Pods Get Evicted: Diagnosing DiskPressure in Kubernetes Nodes

This article walks through a real‑world Kubernetes incident in which a node’s disk usage exceeded the eviction threshold and pods entered the Evicted state. It covers the investigation steps, the root‑cause analysis, and practical remediation actions.

WeiLi Technology Team

Introduction

Previously we discussed NotReady conditions caused by memory shortage; this post shares a case where insufficient disk space on a host led to pod eviction.

Symptom

An alert indicated a large number of pods were in the Evicted state.

Investigation

All of the evicted pods had been scheduled on the same node, and their status showed the node had the DiskPressure condition. Host metrics revealed normal CPU, memory, and load, but disk usage was at 84%.
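To confirm that the evictions were concentrated on one node, the pod listing can be grouped by node. Here is a minimal sketch: in a live cluster you would pipe `kubectl get pods -A -o wide | grep Evicted` into the `awk` below; the sample listing (with hypothetical pod and node names) stands in for that output.

```shell
# Sample output standing in for `kubectl get pods -A -o wide | grep Evicted`.
# Pod names are hypothetical; the last field is the node name.
sample_listing='default  web-7d4b9   0/1  Evicted  0  3h  ip-10-153-13-121
default  web-8f2c1   0/1  Evicted  0  3h  ip-10-153-13-121
default  api-5a9d0   0/1  Evicted  0  2h  ip-10-153-13-121'

# Group by the node column (last field) and count evictions per node.
echo "$sample_listing" | awk '{count[$NF]++} END {for (n in count) print n, count[n]}'
```

If every evicted pod maps to the same node, the problem is node-local rather than cluster-wide.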

Viewing the kubelet configuration showed the default eviction thresholds:

<code>cat /etc/kubernetes/kubelet/kubelet-config.json
---
...
"evictionHard": {
  "memory.available": "100Mi",
  "nodefs.available": "10%",
  "nodefs.inodesFree": "5%"
}
...</code>

The kubelet process list confirmed the node was running the standard AWS EKS components.

<code>[root@ip-10-153-13-121 ~]# ps -ef| grep kube
root      3226     1  2 Nov03 ?        13:56:53 /usr/bin/kubelet ...
root      3683  3385  0 Nov03 ?        00:08:24 kube-proxy ...
...</code>

AWS defaults to evict pods when disk usage exceeds 85%, which matched the observed 84‑85% usage.

Root Cause

The node was provisioned with a small disk, and the kubelet’s default eviction condition (85% usage) triggered pod eviction.

Because the cluster has few nodes, evicted pods were rescheduled onto the same overloaded node, creating a loop.

Because the container runtime is containerd, Docker‑specific cleanup commands are ineffective, and large containerd snapshots occupied significant space.
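On a containerd node, cleanup goes through `crictl` and `ctr` rather than the Docker CLI. A guarded sketch (the commands only run where the tools are installed, i.e., on the affected node):

```shell
# Reclaim disk on a containerd node. Docker commands such as
# `docker system prune` do nothing here; use crictl/ctr instead.
if command -v crictl >/dev/null 2>&1; then
  crictl rmi --prune          # remove images not referenced by any container
else
  echo "crictl not installed here; run this on the affected node"
fi
if command -v ctr >/dev/null 2>&1; then
  ctr -n k8s.io snapshots ls  # inspect containerd snapshots occupying space
else
  echo "ctr not installed here; run this on the affected node"
fi
```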

Resolution

Cleaning the host disk revealed that only two business pods were running and their disk consumption was minimal. Large container snapshots and unused images were removed, freeing space.

Optimization & Solutions

Customize the managed nodes’ kubelet configuration to raise the disk‑usage eviction threshold from 85% to 95%.
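Since the kubelet expresses this threshold as free space rather than used space, evicting at 95% used corresponds to setting `nodefs.available` to 5%. A hedged example of the adjusted fragment, following the config file shown earlier:

```json
{
  "evictionHard": {
    "memory.available": "100Mi",
    "nodefs.available": "5%",
    "nodefs.inodesFree": "5%"
  }
}
```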

Update the Karpenter node template to provision nodes with a larger disk (e.g., 200 GB) instead of the default 20 GB.
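A sketch of the corresponding Karpenter node template change, using the `EC2NodeClass` resource (the resource name and API version depend on your Karpenter release, and the `metadata.name` is hypothetical):

```yaml
apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  name: default
spec:
  # Provision nodes with a 200 GB root volume instead of the small default.
  blockDeviceMappings:
    - deviceName: /dev/xvda
      ebs:
        volumeSize: 200Gi
        volumeType: gp3
```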

Adjust disk‑monitoring alert thresholds to be lower than the kubelet eviction threshold, allowing early detection of disk pressure.
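For example, if eviction triggers at 85% usage, alerting at 80% leaves room to react. A hypothetical Prometheus rule built on the standard node_exporter filesystem metrics:

```yaml
groups:
  - name: node-disk
    rules:
      - alert: NodeDiskUsageHigh
        # Fire at 80% usage, below the 85% kubelet eviction threshold,
        # so operators get a warning before pods are evicted.
        expr: |
          (1 - node_filesystem_avail_bytes{fstype!~"tmpfs|overlay"}
             / node_filesystem_size_bytes{fstype!~"tmpfs|overlay"}) * 100 > 80
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Node disk usage above 80% (kubelet eviction at 85%)"
```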

operations, Kubernetes, AWS, kubelet, Pod Eviction, DiskPressure, Karpenter
Written by

WeiLi Technology Team

Practicing data-driven principles and believing technology can change the world.
