Master Kubernetes Monitoring with kube-state-metrics and Prometheus
This guide walks you through deploying kube-state-metrics, configuring Prometheus scrape jobs, verifying metric collection, and adding Grafana dashboards to achieve a visible, manageable, and reliable Kubernetes monitoring solution for large‑scale clusters.
With the rapid rise of cloud‑native technologies, Kubernetes is the de‑facto platform for modern application deployment, but large clusters bring challenges in health monitoring, bottleneck detection, and network stability.
This article guides you through building a visible, manageable, and reliable container platform by installing and configuring kube-state-metrics and Prometheus.
1. Deploy kube-state-metrics
Pod monitoring relies on two metric sources: cAdvisor, which is built into the kubelet, and kube-state-metrics. Only kube-state-metrics needs to be deployed separately:
<code># Download Chart
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts --force-update
helm pull prometheus-community/kube-state-metrics --version 5.26.0
# Pull image
docker pull registry.k8s.io/kube-state-metrics/kube-state-metrics:v2.13.0</code>
Push the chart and image to a private Harbor repository:
<code># Push Chart
helm push kube-state-metrics-5.26.0.tgz oci://core.jiaxzeng.com/plugins
# Push image
docker tag registry.k8s.io/kube-state-metrics/kube-state-metrics:v2.13.0 core.jiaxzeng.com/library/monitor/kube-state-metrics:v2.13.0
docker push core.jiaxzeng.com/library/monitor/kube-state-metrics:v2.13.0</code>
Install the service with a custom values file:
<code>$ cat kube-state-metrics-values.yaml
fullnameOverride: kube-state-metrics
image:
  registry: core.jiaxzeng.com
  repository: library/monitor/kube-state-metrics
$ helm -n obs-system install kube-state-metrics -f kube-state-metrics-values.yaml oci://core.jiaxzeng.com/plugins/kube-state-metrics --version 5.26.0</code>
2. Collect container/pod metrics with Prometheus
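Before adding the scrape jobs below, Prometheus needs RBAC permissions to discover nodes and services and to read kubelet metrics. A minimal sketch, assuming Prometheus runs under a `prometheus` ServiceAccount in `kube-system` (adjust the names to your deployment):

```shell
# Assumption: the ServiceAccount name and namespace below are illustrative;
# match them to the account your Prometheus pod actually uses.
kubectl apply -f - <<'EOF'
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus-scrape
rules:
  # Discovery (role: node / role: service) plus kubelet metrics endpoints.
  - apiGroups: [""]
    resources: [nodes, nodes/metrics, services, endpoints, pods]
    verbs: [get, list, watch]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus-scrape
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus-scrape
subjects:
  - kind: ServiceAccount
    name: prometheus
    namespace: kube-system
EOF
```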
Add scrape jobs for cAdvisor and kube-state-metrics:
<code>scrape_configs:
  - job_name: "k8s/cadvisor"
    scheme: https
    tls_config:
      ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      insecure_skip_verify: true
    bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    kubernetes_sd_configs:
      - role: node
    metrics_path: /metrics/cadvisor
    relabel_configs:
      - regex: __meta_kubernetes_node_label_(.+)
        action: labelmap
  - job_name: "kube-state-metrics"
    kubernetes_sd_configs:
      - role: service
    relabel_configs:
      - action: keep
        source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_service_port_name]
        regex: obs-system;kube-state-metrics;http</code>
3. Verify data collection
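Before querying, make sure Prometheus has actually loaded the new jobs. A sketch, assuming Prometheus is the `prometheus` Service in `kube-system` behind the `/prometheus` path prefix (as in the verification queries below) and was started with `--web.enable-lifecycle`:

```shell
# Assumption: service name, namespace, path prefix, and the lifecycle endpoint
# being enabled all depend on your deployment; adjust accordingly.
PROM=$(kubectl -n kube-system get svc prometheus -ojsonpath='{.spec.clusterIP}:{.spec.ports[0].port}')
curl -s -u admin -X POST "http://$PROM/prometheus/-/reload"
```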
Query Prometheus to ensure metrics are being scraped:
<code>$ curl -s -u admin $(kubectl -n kube-system get svc prometheus -ojsonpath='{.spec.clusterIP}:{.spec.ports[0].port}')/prometheus/api/v1/query --data-urlencode 'query=up{job=~"k8s/cadvisor"}' | jq '.data.result[] | {job: .metric.job, instance: .metric.instance, status: .value[1]}'
$ curl -s -u admin $(kubectl -n kube-system get svc prometheus -ojsonpath='{.spec.clusterIP}:{.spec.ports[0].port}')/prometheus/api/v1/query --data-urlencode 'query=up{job=~"kube-state-metrics"}' | jq '.data.result[] | {job: .metric.job, instance: .metric.instance, status: .value[1]}'</code>
4. Add monitoring dashboards
Import Grafana dashboards for containers and pods to visualize the collected metrics.
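Imports can also be automated through Grafana's HTTP API, which accepts exported dashboard JSON wrapped in a small payload. A hedged sketch, where the URL, credentials, and `dashboard.json` file are illustrative assumptions:

```shell
# Hypothetical values: replace the URL and credentials with your Grafana instance.
# Wraps an exported dashboard JSON in the payload /api/dashboards/db expects.
jq -n --slurpfile d dashboard.json \
  '{dashboard: $d[0], overwrite: true, folderId: 0}' |
curl -s -u admin:admin -H 'Content-Type: application/json' \
  -X POST http://grafana.example.com/api/dashboards/db -d @-
```

Setting `overwrite: true` makes the import idempotent, so re-running the script updates the dashboard in place instead of failing on a name collision.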
Conclusion
Without monitoring, a Kubernetes cluster is like a car without a dashboard. A proper observability stack lets you see current state and anticipate future issues, giving your cluster the “eyes” it needs to stay stable under any conditions.
Linux Ops Smart Journey
The operations journey never stops—pursuing excellence endlessly.