
Mastering Kubernetes Pod Resource Requests, Limits, and QoS

This guide explains how to configure CPU and memory requests and limits for Kubernetes pods, implement QoS classes, use LimitRange and ResourceQuota, and monitor resource usage with Prometheus queries and Grafana dashboards to ensure stable cluster operations.


1. Overview

Pod CPU Request and Memory Request are critical parameters. If they are omitted, the scheduler reserves no capacity for the pod and may place it on any node, including one that is already heavily loaded; under resource pressure the kubelet then evicts pods to reclaim capacity. Critical pods (e.g., those handling data storage, login, or balance queries) must be protected from such eviction.

Enforce resource quotas so different pods can only consume allocated resources.

Allow over‑provisioning to improve cluster utilization.

Assign QoS classes to pods; low‑priority pods are evicted first when resources are scarce.
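For context, Kubernetes derives a pod's QoS class from its requests and limits: Guaranteed (every container sets requests equal to limits for both CPU and Memory), Burstable (at least one request or limit is set, but the pod does not qualify as Guaranteed), and BestEffort (nothing set). Under node pressure, BestEffort pods are evicted first and Guaranteed pods last. A minimal sketch of a pod that would be classified as Guaranteed (the name and image are illustrative, not from this article):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: qos-demo          # illustrative name
spec:
  containers:
  - name: app
    image: nginx:1.25     # illustrative image
    resources:
      requests:
        cpu: 500m
        memory: 256Mi
      limits:
        cpu: 500m         # equal to the request
        memory: 256Mi     # equal to the request
```

After creation, <code>kubectl get pod qos-demo -o jsonpath='{.status.qosClass}'</code> prints the assigned class.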

Kubernetes nodes provide compute resources (CPU, GPU, Memory). This article focuses on CPU and Memory, as most workloads do not need GPU.

CPU and Memory are specified per container via <code>resources.requests</code> and <code>resources.limits</code>. The scheduler uses the request values to find a node with sufficient capacity.
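A minimal sketch of what this looks like in a pod manifest (container name and image are illustrative): the scheduler would place this pod only on a node with at least 250m CPU and 256Mi memory unreserved, while the limits are enforced at runtime.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: resource-demo     # illustrative name
spec:
  containers:
  - name: app
    image: nginx:1.25     # illustrative image
    resources:
      requests:           # used by the scheduler for placement
        cpu: 250m
        memory: 256Mi
      limits:             # enforced by the kubelet/container runtime
        cpu: 500m
        memory: 512Mi
```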

2. Pod Resource Usage Guidelines

Pod CPU and Memory usage is dynamic and depends on load; it is expressed as a range (e.g., 0.1‑1 CPU, 500 Mi‑1 Gi memory). Two key concepts:

Requests – reserved resources required for normal operation.

Limits – maximum resources a pod may consume; for CPU this is a compressible ceiling, for Memory it is a hard limit.

If a pod exceeds its Memory limit, it is terminated by the kubelet. Therefore, Requests and Limits must be set carefully based on actual workload needs.

Example: a pod with a 1 Gi Memory request is scheduled on a node with 1.2 Gi free. After three days the pod's working set grows to 1.5 Gi, but the node can supply only 200 Mi beyond the request, so the pod is OOM-killed.

Pods without Limits (or with only one of CPU/Memory limits) appear flexible but are less stable than pods with all four parameters set.

When managing hundreds of pods, manually setting Requests and Limits for each is impractical. Kubernetes provides LimitRange (default values and validation) and ResourceQuota (tenant‑level caps) to automate this.
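A sketch of a LimitRange that injects defaults for containers that omit their own values; the name, namespace, and numbers here are illustrative, not from this article:

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits    # illustrative name
  namespace: staging      # illustrative namespace
spec:
  limits:
  - type: Container
    defaultRequest:       # applied as requests when a container sets none
      cpu: 250m
      memory: 256Mi
    default:              # applied as limits when a container sets none
      cpu: 500m
      memory: 512Mi
    max:                  # validation ceiling per container
      cpu: "1"
      memory: 1Gi
```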

CPU Rules

Unit: millicores (m), where 10 m = 0.01 core and 1 core = 1000 m.

Requests: estimated based on actual usage.

Limits: <code>Requests * 1.2</code> (i.e., Requests + 20%).

Memory Rules

Unit: Mi, where 1024 Mi = 1 Gi.

Requests: estimated based on actual usage.

Limits: <code>Requests * 1.2</code>.
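The 20% headroom rule above can be sketched as a small helper; the function name is ours, not a Kubernetes API:

```python
def suggest_limits(cpu_request_m: int, mem_request_mi: int) -> dict:
    """Apply the Limits = Requests * 1.2 rule from the guidelines above.

    cpu_request_m  -- CPU request in millicores (m)
    mem_request_mi -- memory request in Mi
    """
    return {
        "cpu": f"{round(cpu_request_m * 1.2)}m",
        "memory": f"{round(mem_request_mi * 1.2)}Mi",
    }

# A 250m / 256Mi request gets a 300m / 307Mi limit.
print(suggest_limits(250, 256))
```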

3. Namespace Resource Management

Overall Requests and Limits should not exceed 80% of cluster capacity, to leave headroom for rolling updates.

3.1 Multi‑Tenant Resource Strategy

Use <code>ResourceQuota</code> to limit resource consumption per project/team.
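The quota inspected with <code>kubectl describe resourcequotas</code> in section 4.3 could be created with a manifest like this (the name, namespace, and Hard values are taken from that output):

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: mem-cpu-demo
  namespace: cloudchain--staging
spec:
  hard:
    requests.cpu: 250m
    requests.memory: 250Mi
    limits.cpu: 500m
    limits.memory: 500Mi
```

Once applied, any pod created in the namespace must declare requests and limits, and the totals may not exceed these caps.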

[Figure: ResourceQuota diagram]

3.2 Resource Change Process

[Figure: Resource change workflow]

4. Resource Monitoring and Inspection

4.1 Resource Usage Monitoring

Namespace Requests usage rate

<code>sum (kube_resourcequota{type="used",resource="requests.cpu"}) by (resource,namespace) / sum (kube_resourcequota{type="hard",resource="requests.cpu"}) by (resource,namespace) * 100

sum (kube_resourcequota{type="used",resource="requests.memory"}) by (resource,namespace) / sum (kube_resourcequota{type="hard",resource="requests.memory"}) by (resource,namespace) * 100</code>

Namespace Limits usage rate

<code>sum (kube_resourcequota{type="used",resource="limits.cpu"}) by (resource,namespace) / sum (kube_resourcequota{type="hard",resource="limits.cpu"}) by (resource,namespace) * 100

sum (kube_resourcequota{type="used",resource="limits.memory"}) by (resource,namespace) / sum (kube_resourcequota{type="hard",resource="limits.memory"}) by (resource,namespace) * 100</code>

4.2 Viewing via Grafana

[Figure: Grafana dashboard]

CPU request rate

<code>sum (kube_resourcequota{type="used",resource="requests.cpu",namespace=~"$NameSpace"}) by (resource,namespace) / sum (kube_resourcequota{type="hard",resource="requests.cpu",namespace=~"$NameSpace"}) by (resource,namespace)</code>

Memory request rate

<code>sum (kube_resourcequota{type="used",resource="requests.memory",namespace=~"$NameSpace"}) by (resource,namespace) / sum (kube_resourcequota{type="hard",resource="requests.memory",namespace=~"$NameSpace"}) by (resource,namespace)</code>

CPU limit rate

<code>sum (kube_resourcequota{type="used",resource="limits.cpu"}) by (resource,namespace) / sum (kube_resourcequota{type="hard",resource="limits.cpu"}) by (resource,namespace)</code>

Memory limit rate

<code>sum (kube_resourcequota{type="used",resource="limits.memory"}) by (resource,namespace) / sum (kube_resourcequota{type="hard",resource="limits.memory"}) by (resource,namespace)</code>

4.3 In‑Cluster Resource Inspection

Check resource usage

<code>[root@k8s-dev-slave04 yaml]# kubectl describe resourcequotas -n cloudchain--staging

Name:            mem-cpu-demo
Namespace:       cloudchain--staging
Resource         Used   Hard
--------         ----   ----
limits.cpu       200m   500m
limits.memory    200Mi  500Mi
requests.cpu     150m   250m
requests.memory  150Mi  250Mi</code>

Check events for quota violations

<code>[root@kevin ~]# kubectl get event -n default

LAST SEEN   TYPE      REASON         OBJECT                          MESSAGE
46m         Warning   FailedCreate   replicaset/hpatest-57965d8c84   Error creating: pods "hpatest-57965d8c84-s78x6" is forbidden: exceeded quota: mem-cpu-demo, requested: limits.cpu=400m,limits.memory=400Mi, used: limits.cpu=200m,limits.memory=200Mi, limited: limits.cpu=500m,limits.memory=500Mi
... (additional similar events) ...</code>
Written by

Ops Development Stories

Maintained by a like‑minded team, covering both operations and development. Topics span Linux ops, DevOps toolchain, Kubernetes containerization, monitoring, log collection, network security, and Python or Go development. Team members: Qiao Ke, wanger, Dong Ge, Su Xin, Hua Zai, Zheng Ge, Teacher Xia.
