Build a Robust Kubernetes Monitoring System with Prometheus and HAProxy
This guide walks you through setting up a comprehensive Kubernetes monitoring solution—covering component metrics collection, configuring HAProxy for network access, exposing metrics from kube-proxy, Calico, and kube-state-metrics, and integrating everything into Prometheus for reliable cluster health visibility.
Why Monitoring Kubernetes Matters
Kubernetes is the de facto container‑orchestration platform in modern cloud environments. As clusters grow in size and complexity, keeping the cluster and its workloads healthy requires a solid monitoring system.
Kubernetes Components to Monitor
master: etcd, apiserver, scheduler, controller-manager
worker: kubelet, kube-proxy
addons: coredns, calico
Beyond the system components, workload-level signals such as pod status changes and Deployment replica counts should be observed via kube-state-metrics.
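For instance, once kube-state-metrics is being scraped, a Prometheus alerting rule along these lines can flag Deployments running below their desired replica count (a sketch: the metric names are standard kube-state-metrics series, but the rule name, severity, and timing are illustrative):

```yaml
groups:
  - name: workload-health
    rules:
      - alert: DeploymentReplicasMismatch
        # fires when available replicas lag behind the declared spec for 5 minutes
        expr: kube_deployment_spec_replicas != kube_deployment_status_replicas_available
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "{{ $labels.namespace }}/{{ $labels.deployment }} has unavailable replicas"
```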
Preparation: Certificates and Metric Collection Commands
Obtain the cluster certificates:
<code>$ sudo awk -F": " '/certificate-authority-data/ {print $NF}' /etc/kubernetes/admin.conf | base64 -d > /tmp/ca.crt
$ sudo awk -F": " '/client-certificate-data/ {print $NF}' /etc/kubernetes/admin.conf | base64 -d > /tmp/admin.crt
$ sudo awk -F": " '/client-key-data/ {print $NF}' /etc/kubernetes/admin.conf | base64 -d > /tmp/admin.key</code>
Fetch metrics from master components:
<code># etcd
$ curl http://127.0.0.1:2381/metrics
# apiserver
$ curl -k --cacert /tmp/ca.crt --cert /tmp/admin.crt --key /tmp/admin.key https://<master‑IP>:6443/metrics
# controller‑manager
$ curl -k --cacert /tmp/ca.crt --cert /tmp/admin.crt --key /tmp/admin.key https://127.0.0.1:10257/metrics
# scheduler
$ curl -k --cacert /tmp/ca.crt --cert /tmp/admin.crt --key /tmp/admin.key https://127.0.0.1:10259/metrics</code>
Fetch metrics from worker components:
<code># kubelet
$ curl -k --cacert /tmp/ca.crt --cert /tmp/admin.crt --key /tmp/admin.key https://<node‑IP>:10250/metrics
# kube‑proxy
$ curl 127.0.0.1:10249/metrics</code>
Fetch addon metrics (coredns example):
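The pod IP used in the next command can be looked up first (assuming a default kubeadm deployment, where the coredns pods carry the k8s-app=kube-dns label):

```shell
$ kubectl -n kube-system get pods -l k8s-app=kube-dns -o wide
```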
<code># coredns
$ curl <coredns‑pod‑IP>:9153/metrics</code>
Tip: Prometheus Access
Prometheus runs in a container, so it may not be able to reach components that listen only on node-local addresses (127.0.0.1). Use HAProxy to expose these internal metrics endpoints to the Prometheus container.
HAProxy Proxy Network
Create a ConfigMap with HAProxy configuration:
<code>apiVersion: v1
kind: ConfigMap
metadata:
  name: metrics-proxy-master
  namespace: kube-system
data:
  haproxy.cfg: |
    global
      log stdout local2 info
    defaults
      mode tcp
      log global
      option tcplog
      maxconn 100
      timeout connect 5s
      timeout client 30s
      timeout server 30s
    # expose metrics
    listen etcd
      bind *:12381
      server server1 127.0.0.1:2381 check
    listen kube-controller-manager
      bind *:20257
      server server1 127.0.0.1:10257 check
    listen kube-scheduler
      bind *:20259
      server server1 127.0.0.1:10259 check
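    # Note: whitelist.lst below is only a data file; it takes effect only if it
    # is referenced from this configuration. One way to wire it in (a sketch,
    # not part of the original setup) is to add to each listen section:
    #   tcp-request connection reject unless { src -f /usr/local/etc/haproxy/whitelist.lst }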
  whitelist.lst: |
    172.139.20.0/24
    10.244.0.0/16</code>
Deploy HAProxy as a DaemonSet:
<code>apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: metrics-proxy-master
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: metrics-proxy
  template:
    metadata:
      labels:
        app: metrics-proxy
    spec:
      containers:
      - name: metrics-proxy
        image: haproxy:2.8-alpine
        volumeMounts:
        - name: conf
          mountPath: /usr/local/etc/haproxy
      hostNetwork: true
      nodeSelector:
        node-role.kubernetes.io/control-plane: ""
      tolerations:
      - key: "node.kubernetes.io/unschedulable"
        effect: NoSchedule
      - key: "node-role.kubernetes.io/control-plane"
        effect: NoSchedule
      volumes:
      - name: conf
        configMap:
          name: metrics-proxy-master</code>
Expose kube‑proxy Metrics
Edit the kube-proxy ConfigMap to bind metrics to all interfaces and restart the DaemonSet:
<code>$ kubectl -n kube-system edit cm kube-proxy
# set metricsBindAddress: "0.0.0.0:10249"
$ kubectl -n kube-system rollout restart ds/kube-proxy
daemonset.apps/kube-proxy restarted</code>
Expose Calico Metrics
Add environment variables and a port to the calico-node DaemonSet:
<code>- name: FELIX_PROMETHEUSMETRICSENABLED
  value: "True"
- name: FELIX_PROMETHEUSMETRICSPORT
  value: "9091"
...
ports:
- containerPort: 9091
  name: http-metrics
  protocol: TCP</code>
Deploy kube‑state‑metrics
Adjust the Helm values for kube-state-metrics and upgrade Prometheus:
<code># /etc/kubernetes/addons/prometheus-values.yaml
kube-state-metrics:
  fullnameOverride: kube-state-metrics
  image:
    registry: core.jiaxzeng.com
    repository: library/monitor/kube-state-metrics

$ helm -n kube-system upgrade prometheus -f /etc/kubernetes/addons/prometheus-values.yaml /etc/kubernetes/addons/prometheus</code>
Verify the service:
<code>$ curl $(kubectl -n kube-system get svc kube-state-metrics -o jsonpath='{.spec.clusterIP}:{.spec.ports[0].port}')/metrics | head</code>
Collect Kubernetes Metrics in Prometheus
Edit the Prometheus ConfigMap and add job definitions for each component (etcd, apiserver, controller‑manager, scheduler, kubelet, proxy, cadvisor, calico, coredns, kube‑state‑metrics). After saving, reload the configuration:
<code>$ curl -X POST $(kubectl -n kube-system get svc prometheus -o jsonpath='{.spec.clusterIP}:{.spec.ports[0].port}')/prometheus/-/reload</code>
Verification
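Besides the browser UI, target health can also be checked from the command line (a sketch; assumes jq is installed and uses the same /prometheus path prefix as the reload call above):

```shell
$ curl -s $(kubectl -n kube-system get svc prometheus -o jsonpath='{.spec.clusterIP}:{.spec.ports[0].port}')/prometheus/api/v1/targets \
    | jq -r '.data.activeTargets[] | "\(.labels.job): \(.health)"'
```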
Open the Prometheus targets page in a browser to confirm that all jobs are up and metrics are being scraped.
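The job definitions mentioned above all follow a similar pattern; a minimal sketch for two of them (the ports follow the HAProxy ConfigMap and kube-proxy settings configured earlier, and the placeholder addresses must be replaced with real node IPs):

```yaml
scrape_configs:
  # etcd metrics forwarded by HAProxy on the master node
  - job_name: etcd
    static_configs:
      - targets: ['<master-IP>:12381']
  # kube-proxy, now listening on 0.0.0.0:10249
  - job_name: kube-proxy
    static_configs:
      - targets: ['<node-IP>:10249']
```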
Conclusion
By implementing a comprehensive monitoring stack for a Kubernetes cluster, you gain clear insight into cluster health and can react quickly to any anomalies, making cluster management more reliable and efficient.
Linux Ops Smart Journey
The operations journey never stops—pursuing excellence endlessly.