How kubectl top Retrieves Real‑Time Metrics in Kubernetes: A Deep Dive
This article explains how the kubectl top command gathers real‑time CPU and memory usage for nodes and pods, details the underlying data flow and metric API implementation in Kubernetes, compares heapster and metrics‑server, and addresses common troubleshooting scenarios.
1. Introduction
kubectl top provides a convenient way to view real‑time CPU and memory usage of nodes and pods. This article explains its data flow and implementation, illustrates the Kubernetes monitoring architecture, and addresses common issues.
Why does kubectl top report errors?
How does kubectl top node calculate values compared to the node’s native top?
How does kubectl top pod calculate values, does it include the pause container?
Why does kubectl top pod differ from the top command inside the pod?
Why do values from kubectl top pod differ from docker stats?
Tested on Kubernetes 1.8 and 1.13.
2. Usage
kubectl top is a basic command but requires the appropriate metrics component to be deployed.
For versions < 1.8: deploy heapster.
For versions ≥ 1.8: deploy metrics‑server.
kubectl top node shows node usage; kubectl top pod shows pod usage. Adding --containers displays per-container metrics.
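A typical session might look like the following; the node name, pod namespace, and usage figures are illustrative, not taken from a real cluster:

```shell
# Node-level usage (requires heapster or metrics-server to be deployed)
kubectl top node
# NAME     CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
# node-1   250m         12%    1024Mi          55%

# Pod-level usage, optionally broken down per container
kubectl top pod -n kube-system
kubectl top pod -n kube-system --containers
```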
Metric meanings:
CPU unit: 100m = 0.1 core; Memory unit: 1Mi = 1024Ki.
Pod memory equals the sum of all business containers; the pause container is excluded. The value corresponds to the container_memory_working_set_bytes metric.
Node values are not the sum of all pod values and differ from the values shown by the host’s top command.
3. Implementation Details
3.1 Data Flow
kubectl top, the Kubernetes dashboard, and the HPA controller all consume the same metrics data.
When heapster is used, the apiserver proxies metric requests to the heapster service. With metrics-server, the apiserver accesses metrics via the /apis/metrics.k8s.io/ endpoint.
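On a cluster running metrics-server, the Metric API can be queried directly through the apiserver; this is the same data kubectl top consumes:

```shell
# Raw queries against the Metric API (metrics-server path)
kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes"
kubectl get --raw "/apis/metrics.k8s.io/v1beta1/namespaces/kube-system/pods"
```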
Comparing the apiserver request logs for kubectl get pod under the two setups illustrates the difference in request paths.
3.2 Metric API
Heapster relies on proxy forwarding, which is unstable and lacks proper authentication. metrics-server instead implements a proper Metric API that is registered as a native Kubernetes API resource.
Proxy forwarding is mainly for debugging and is not version‑stable.
Heapster cannot leverage apiserver’s authentication and client integration.
Metrics should be a first-class resource, accessible via metrics.k8s.io.
Since Kubernetes 1.8, heapster is being deprecated in favor of the Metric API implemented by metrics‑server.
3.3 kube‑aggregator
kube‑aggregator extends the apiserver, allowing custom services like metrics‑server to register their APIs. It handles dynamic registration, discovery, aggregation, and secure proxying.
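The registration is visible in the cluster as an APIService object; on a cluster with metrics-server deployed, it can be inspected like this:

```shell
# The Metric API is registered through the aggregation layer as an APIService
kubectl get apiservice v1beta1.metrics.k8s.io
kubectl api-versions | grep metrics
```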
3.4 Monitoring Architecture
Metrics are divided into two categories:
Core metrics: collected from Kubelet/cAdvisor and provided by metrics‑server for HPA and the dashboard.
Custom metrics: exposed via the Prometheus Adapter as custom.metrics.k8s.io, enabling arbitrary Prometheus metrics to be used for scaling.
Core metrics (CPU, memory) are sufficient for most HPA scenarios; custom metrics are needed for advanced use‑cases such as scaling on request QPS or error rates.
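On a cluster where a Prometheus Adapter is deployed (an assumption, not something the stock installation provides), the custom metrics appear under their own API group and can be discovered the same way as core metrics:

```shell
# List the custom metrics exposed by the adapter, if one is installed
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1"
```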
3.5 Kubelet
Kubelet exposes two endpoints:
Kubelet Summary metrics: 127.0.0.1:10255/metrics (node and pod aggregates).
cAdvisor metrics: 127.0.0.1:10255/metrics/cadvisor (container-level data).
Example of container memory usage:
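The original sample output is not preserved here; a comparable query against the Kubelet's cAdvisor endpoint might look like this, run on a node with the read-only port open (the container and pod names in the sample line are hypothetical):

```shell
# Container-level memory metrics in Prometheus exposition format
curl -s 127.0.0.1:10255/metrics/cadvisor | grep container_memory_usage_bytes
# Each matching line looks roughly like:
# container_memory_usage_bytes{container="app",pod="demo-pod",...} 1.048576e+08
```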
The cAdvisor module, integrated into Kubelet since v1.6, provides the actual monitoring logic. The evolution:
v1.6: cAdvisor integrated into Kubelet.
v1.7: Kubelet metrics API no longer includes raw cAdvisor metrics.
v1.12: Direct cAdvisor port removed; all metrics served through the Kubelet API.
3.6 cAdvisor
cAdvisor, an open‑source Go project from Google, collects container‑level CPU, memory, network, and filesystem statistics and exposes them via HTTP. It is the default monitoring component in Kubernetes.
Metrics are derived from cgroup files, e.g., memory usage from /sys/fs/cgroup/memory/docker/[containerId]/memory.usage_in_bytes.
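The same value cAdvisor reports can be read directly on the node; the container ID placeholder below must be replaced with a real ID from the host:

```shell
# Read a container's current memory usage straight from its cgroup (cgroup v1)
cat /sys/fs/cgroup/memory/docker/<containerId>/memory.usage_in_bytes
```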
3.7 cgroup
cgroup files provide the raw values used by cAdvisor, such as memory usage, limits, and usage ratios.
Typical memory‑related cgroup metrics include usage, limit, and usage ratio.
The most comprehensive information resides in memory.stat.
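As a sketch, the accounting identity rss + cache = (in)active_anon + (in)active_file can be checked against a memory.stat snapshot; the values below are illustrative, not from a real host:

```shell
# Illustrative memory.stat snapshot (values in bytes, hypothetical)
stat='cache 104857600
rss 52428800
inactive_anon 0
active_anon 52428800
inactive_file 31457280
active_file 73400320'

# Extract one field by key from the snapshot
field() { echo "$stat" | awk -v k="$1" '$1==k {print $2}'; }

# rss + cache ...
echo $(( $(field rss) + $(field cache) ))
# ... equals inactive_anon + active_anon + inactive_file + active_file
echo $(( $(field inactive_anon) + $(field active_anon) + $(field inactive_file) + $(field active_file) ))
```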
4. Common Issues
4.1 Why does kubectl top report errors?
Missing heapster or metrics‑server, or pod crashes – check pod logs.
Pod just created – metrics may not be available for up to one minute.
Verify that the Kubelet read‑only port (10255) is open, or configure certificates for the secure port (10250).
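The checks above can be run roughly as follows; the label selector assumes the standard metrics-server manifest, and the curl must be run on a node:

```shell
# Is the metrics component deployed and healthy?
kubectl -n kube-system get pods | grep -E 'heapster|metrics-server'
kubectl -n kube-system logs -l k8s-app=metrics-server

# Does the Kubelet read-only port respond? (run on a node)
curl -s 127.0.0.1:10255/metrics | head
```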
4.2 Does kubectl top pod memory include the pause container?
The pause container consumes a few megabytes, but cAdvisor excludes it from the pod's container list, so kubectl top pod memory does not count the pause container. The reported value is container_memory_working_set_bytes, calculated as container_memory_usage_bytes - total_inactive_file.
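The subtraction can be sketched with illustrative numbers (a 200 MiB usage figure and 50 MiB of inactive file cache, both hypothetical); cAdvisor additionally floors the result at zero:

```shell
# working_set = usage_in_bytes - total_inactive_file, floored at zero
working_set() {
  local ws=$(( $1 - $2 ))
  [ "$ws" -lt 0 ] && ws=0
  echo "$ws"
}

working_set 209715200 52428800   # 200Mi - 50Mi = 150Mi = 157286400 bytes
```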
4.3 How does kubectl top node differ from the host’s top?
kubectl top node aggregates cgroup root statistics, not the sum of all pod metrics, and therefore differs from the host’s top output.
In cgroup terms, rss + cache = (in)active_anon + (in)active_file.
4.4 Why does kubectl top pod differ from top inside the pod?
Top inside the pod shows host‑wide resources; kubectl top pod shows the pod’s cgroup‑based allocation, which may be limited by resource requests and limits.
4.5 Why do kubectl top pod values differ from docker stats?
docker stats uses container_memory_usage_bytes - container_memory_cache, which yields a smaller value than the working-set metric used by kubectl top.
5. Conclusion
In most cases you rely on cluster‑autoscaler and HPA rather than manually watching node or pod usage. Persisting cAdvisor data with Prometheus is recommended for historical analysis and alerting. Note that kubectl top help still mentions storage support, which is not available until v1.16, and older versions require heapster.
kubectl top help incorrectly lists heapster for versions ≥1.13.
After switching the dashboard to metrics‑server, an additional metrics‑server‑scraper pod may be needed.
Reference: http://www.xuyasong.com/?p=1781