Understanding kubectl top: How Kubernetes Monitors Nodes and Pods
This article explains how the kubectl top command retrieves real‑time CPU and memory metrics for Kubernetes nodes and pods, details the underlying data flow and the metrics‑server and cAdvisor architecture, and addresses common issues and calculation differences compared to traditional system tools.
1. Introduction
kubectl top provides an easy way to view real‑time resource usage (CPU, memory) of nodes and pods. This article describes its data flow and implementation, uses it to illustrate Kubernetes monitoring, and explains common problems.
Why does kubectl top report errors?
How does kubectl top node calculate values compared to the node's native top?
How does kubectl top pod calculate values, and does it include the pause container?
Why do kubectl top pod and exec‑inside‑pod top differ?
Why do kubectl top pod and docker stats show different values?
Tested on Kubernetes 1.8 and 1.13.
2. Usage
kubectl top is a basic command but requires the appropriate metrics component to be deployed.
For versions < 1.8: deploy Heapster.
For versions ≥ 1.8: deploy metrics‑server.
kubectl top node shows node usage; kubectl top pod shows pod usage. Use --containers to display all containers in a pod.
3. Implementation Principles
3.1 Data Flow
The data used by kubectl top, the Kubernetes dashboard, and HPA comes from the same source.
When using Heapster, the apiserver proxies metric requests directly to the Heapster service.
When using metrics‑server, the apiserver accesses metrics via the /apis/metrics.k8s.io/ endpoint.
Comparing this with the verbose logs of a plain kubectl get pod makes the request flow visible.
3.2 Metric API
Heapster relies on proxy forwarding, which is unstable and not version‑controlled. Metrics‑server instead serves metrics through the /apis/metrics.k8s.io/ API, like a regular resource, providing a more reliable, authenticated interface.
Proxy forwarding is only for troubleshooting and lacks stability.
Heapster cannot leverage apiserver’s authentication and client integration.
Metrics should be a first‑class resource, exposed via a dedicated Metric API.
Since version 1.8, Kubernetes has deprecated Heapster in favor of the Metric API, implemented by metrics‑server.
3.3 kube‑aggregator
Metrics‑server exposes its API at /apis/metrics.k8s.io. kube‑aggregator extends the apiserver, allowing custom services to register their APIs. It provides dynamic registration, discovery, aggregation, and secure proxying.
Metrics‑server registers pod and node metrics through this mechanism.
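As an illustration, this registration is done through an APIService object roughly like the following (a sketch for the 1.8–1.13 era; the service name, namespace, and priorities vary by deployment):

```yaml
apiVersion: apiregistration.k8s.io/v1beta1
kind: APIService
metadata:
  name: v1beta1.metrics.k8s.io
spec:
  service:
    name: metrics-server        # Service fronting the metrics-server pods
    namespace: kube-system
  group: metrics.k8s.io
  version: v1beta1
  insecureSkipTLSVerify: true
  groupPriorityMinimum: 100
  versionPriority: 100
```

Once this object exists, the apiserver proxies requests under /apis/metrics.k8s.io/v1beta1 to the metrics-server service.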
3.4 Monitoring System
Kubernetes monitoring is divided into two categories:
Core metrics: CPU, memory, etc., collected from Kubelet/cAdvisor and provided by metrics‑server for Dashboard and HPA.
Custom metrics: Exposed via Prometheus Adapter as custom.metrics.k8s.io, allowing arbitrary Prometheus metrics to be used for HPA.
Core metrics are sufficient for most HPA scenarios; custom metrics are needed for advanced use cases like request QPS or error rates.
3.5 kubelet
Both Heapster and metrics‑server retrieve data from the kubelet API. The actual metric collection is performed by the cAdvisor module inside kubelet. Metrics can be accessed via:
Kubelet Summary API: 127.0.0.1:10255/stats/summary (node and pod summary stats).
cAdvisor metrics: 127.0.0.1:10255/metrics/cadvisor (container‑level data).
Example of container memory usage:
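A single sample from the cadvisor endpoint might look like this (label values are hypothetical; the value is in bytes):

```
container_memory_working_set_bytes{container_name="nginx",namespace="default",pod_name="nginx-6f8d9c7b-x2x4z"} 12779520
```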
3.6 cAdvisor
cAdvisor, an open‑source Go project from Google, collects container metrics (CPU, memory, network, filesystem) and provides an HTTP API. In Kubernetes it is integrated into kubelet by default.
cAdvisor’s core logic creates a manager that uses a memory storage and sysfs instance to fetch container and machine information.
3.7 cgroup
cgroup files are the ultimate source of monitoring data. Examples:
Memory usage: /sys/fs/cgroup/memory/docker/[containerId]/memory.usage_in_bytes
Memory limit (if set): /sys/fs/cgroup/memory/docker/[containerId]/memory.limit_in_bytes
Memory usage ratio = usage / limit.
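A minimal sketch of that calculation (paths assume cgroup v1 with the Docker runtime; the directory layout is as described above):

```python
CGROUP_MEM = "/sys/fs/cgroup/memory/docker"  # cgroup v1, Docker runtime

def read_cgroup_value(path):
    # Each of these cgroup files holds a single integer value in bytes.
    with open(path) as f:
        return int(f.read().strip())

def memory_usage_ratio(usage_bytes, limit_bytes):
    # Memory usage ratio = usage / limit.
    return usage_bytes / limit_bytes
```

For a given container, usage_bytes would come from reading CGROUP_MEM/&lt;containerId&gt;/memory.usage_in_bytes and limit_bytes from memory.limit_in_bytes.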
Typical cgroup contents include CPU, memory, disk, and network metrics.
Key memory metrics are shown in the following diagrams.
4. Issues
4.1 Why kubectl top reports errors
Common kubectl top errors include a missing metrics component, pods whose metrics have not yet been collected, or a closed kubelet port. Use kubectl top pod -v=10 for detailed logs. Typical causes:
No Heapster or metrics‑server deployed, or the pod is unhealthy.
Pod just created and metrics not yet available (default 1 minute).
Check if kubelet’s read‑only port (10255) is open; use the authenticated port (10250) if needed.
4.2 Memory calculation for kubectl top pod (pause container?)
Pause containers consume a few megabytes of memory but are excluded from the container list used by cAdvisor, so their memory is not counted in kubectl top pod. The reported memory is container_memory_working_set_bytes, calculated as:
container_memory_usage_bytes = container_memory_rss + container_memory_cache + kernel memory
container_memory_working_set_bytes = container_memory_usage_bytes - total_inactive_file
This value is the container's true memory usage and the basis for OOM decisions.
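The subtraction can be sketched as follows (a minimal illustration; cAdvisor floors the result at zero):

```python
def working_set_bytes(usage_bytes, total_inactive_file):
    # working_set = usage - total_inactive_file, floored at zero
    # so a large inactive file cache never yields a negative value.
    ws = usage_bytes - total_inactive_file
    return ws if ws > 0 else 0
```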
4.3 How kubectl top node calculates values
kubectl top node aggregates cgroup root statistics rather than summing all pod metrics, so it differs from the traditional top command on the host.
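The cgroup accounting identity rss + cache = (in)active_anon + (in)active_file can be sanity-checked against a memory.stat sample (all numbers here are made up for illustration):

```python
def parse_memory_stat(text):
    # memory.stat consists of "key value" lines, values in bytes.
    return {k: int(v) for k, v in (line.split() for line in text.splitlines())}

# Made-up sample consistent with the identity.
stat = parse_memory_stat(
    "rss 4096\ncache 8192\n"
    "active_anon 4096\ninactive_anon 0\n"
    "active_file 6144\ninactive_file 2048\n"
)
lhs = stat["rss"] + stat["cache"]
rhs = (stat["active_anon"] + stat["inactive_anon"]
       + stat["active_file"] + stat["inactive_file"])
```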
Memory relationship: rss + cache = (in)active_anon + (in)active_file
4.4 Difference between kubectl top pod and exec‑inside‑pod top
Running top inside a pod shows the host's total system resources, not the pod's allocated limits, because /proc is not cgroup‑aware. RSS in the process view includes anonymous and mapped pages, while cgroup RSS excludes shared memory and the file cache.
4.5 Difference between kubectl top pod and docker stats
docker stats uses container_memory_usage_bytes - container_memory_cache, which differs from both container_memory_usage_bytes and the container_memory_working_set_bytes used by kubectl top.
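A toy comparison of the two formulas (all numbers invented; cache and inactive_file usually differ, which is why the tools disagree):

```python
MiB = 2 ** 20

def docker_stats_memory(usage_bytes, cache_bytes):
    # docker stats: usage minus the whole page cache.
    return usage_bytes - cache_bytes

def kubectl_top_memory(usage_bytes, total_inactive_file):
    # kubectl top: the working set, usage minus only inactive file pages.
    return max(0, usage_bytes - total_inactive_file)

# With 100 MiB usage, a 40 MiB cache of which 25 MiB is inactive:
# docker stats reports 60 MiB, kubectl top reports 75 MiB.
```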
docker stats = container_memory_usage_bytes - container_memory_cache
5. Conclusion
In most cases you don’t need to monitor node or pod usage manually because cluster‑autoscaler and HPA handle scaling. Persist metrics with Prometheus for historical analysis and alerts. Note that storage support in kubectl top is still missing in versions up to 1.16, and documentation may still reference Heapster for versions where metrics‑server is required.
Efficient Ops
This public account is maintained by Xiaotianguo and friends, regularly publishing widely-read original technical articles. We focus on operations transformation and accompany you throughout your operations career, growing together happily.