
How to Build a Scalable Prometheus Monitoring System with Thanos on Kubernetes

This article explains why monitoring is essential, compares white‑box and black‑box approaches, details Prometheus features, metric naming, query language, high‑availability challenges, and shows how to extend Prometheus with Thanos, Pushgateway, Alertmanager, and Kubernetes deployments for a robust observability stack.


Monitoring is a foundational infrastructure component that ensures production service stability by enabling detection, localization, and resolution of issues.

White-box monitoring focuses on internal metrics such as request count, success rate, and latency, while black-box probes complement it by checking availability from the outside.

Prometheus is often chosen for business monitoring because it supports flexible PromQL queries, simple single‑binary deployment, Go integration, a built‑in Web UI, and a rich ecosystem including Alertmanager, Pushgateway, and many exporters.

Prometheus metric names must consist of ASCII letters, digits, underscores, and colons. Conventions include using base units (seconds, bytes), prefixing names with the application namespace, and suffixing them with the unit, e.g. <code>http_request_duration_seconds</code>.

Counter : a monotonically increasing metric, typically used for request counts or error totals.

Gauge : a metric that can go up and down, used for CPU usage, memory consumption, etc.

Histogram and Summary : represent sampled data over time, useful for request latency or response size.
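In application code these types are usually registered through the official client libraries; as a rough illustration of what an instrumented service ultimately exposes, the sketch below hand-renders counter and gauge samples in the Prometheus text exposition format (the helper function is ours, not a library API):

```go
package main

import "fmt"

// renderMetric formats one sample in the Prometheus text exposition
// format: name{label="value",...} value
func renderMetric(name string, labels map[string]string, value float64) string {
	out := name
	if len(labels) > 0 {
		out += "{"
		first := true
		for k, v := range labels {
			if !first {
				out += ","
			}
			out += fmt.Sprintf("%s=%q", k, v)
			first = false
		}
		out += "}"
	}
	return fmt.Sprintf("%s %g", out, value)
}

func main() {
	// A counter: monotonically increasing request total.
	fmt.Println(renderMetric("http_requests_total", map[string]string{"code": "200"}, 1027))
	// A gauge: current memory usage, free to go up or down.
	fmt.Println(renderMetric("process_resident_memory_bytes", nil, 2.1e7))
}
```

In practice the client library takes care of this formatting and of thread-safe increments; the sketch only shows the wire format Prometheus scrapes.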

Prometheus stores data as time‑series pairs (timestamp, value). A single‑dimensional series can be expanded with labels (e.g., host) to form multi‑dimensional matrices. Queries return instant vectors (single timestamp) or range vectors (multiple timestamps).

<code>http_requests{host="host1",service="web",code="200",env="test"}</code>

Range‑vector example:

<code>http_requests{host="host1",service="web",code="200",env="test"}[5m]</code>

Aggregations use PromQL functions such as <code>rate()</code> and <code>increase()</code>:

<code>rate(http_requests{host="host1",service="web",code="200",env="test"}[5m])</code>
<code>increase(http_requests{host="host1",service="web",code="200",env="test"}[5m])</code>
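Conceptually, <code>increase()</code> sums a counter's positive deltas over the window and treats any drop as a counter reset (a process restart); the real function additionally extrapolates to the window boundaries. A simplified sketch of that reset-aware logic:

```go
package main

import "fmt"

// increase returns the total growth of a counter across the given
// samples, treating any decrease as a counter reset. Prometheus's
// real increase() also extrapolates to the edges of the range
// window; this sketch omits that.
func increase(samples []float64) float64 {
	var total float64
	for i := 1; i < len(samples); i++ {
		if samples[i] >= samples[i-1] {
			total += samples[i] - samples[i-1] // normal growth
		} else {
			total += samples[i] // reset: counter restarted from zero
		}
	}
	return total
}

func main() {
	// Counter grew 0→10→20, reset, then reached 5: total increase 25.
	fmt.Println(increase([]float64{0, 10, 20, 5}))
}
```

<code>rate()</code> is the same quantity divided by the window length in seconds.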

High-cardinality labels (e.g., IP addresses or user IDs) should be avoided: each distinct label value creates a new time series, so memory usage grows with the product of label cardinalities. A label with 1,000 user IDs combined with 100 endpoints already yields 100,000 series.

Prometheus compresses data into blocks every two hours and writes them to local TSDB. To improve durability and scalability, remote read/write interfaces can forward data to external stores such as M3DB, InfluxDB, or OpenTSDB.
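A remote read/write section in the Prometheus configuration might look like the fragment below (the coordinator URL is a placeholder for whichever adapter fronts your M3DB, InfluxDB, or OpenTSDB cluster):

```yaml
# prometheus.yml (fragment) -- forward samples to an external store.
remote_write:
  - url: "http://m3coordinator.example:7201/api/v1/prom/remote/write"
    queue_config:
      max_samples_per_send: 500
remote_read:
  - url: "http://m3coordinator.example:7201/api/v1/prom/remote/read"
```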

For large deployments, static <code>scrape_configs</code> become unwieldy; service discovery (Kubernetes, Consul, file-based) dynamically updates target lists. Relabel rules can partition targets across multiple Prometheus instances.
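One common partitioning scheme uses <code>hashmod</code> relabeling, so each Prometheus instance keeps only its own share of the discovered targets. A sketch for one of three shards (shard numbers and job names are illustrative):

```yaml
# prometheus.yml (fragment) -- one of N sharded Prometheus instances.
scrape_configs:
  - job_name: kubernetes-pods
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # Hash each target address into one of 3 buckets.
      - source_labels: [__address__]
        modulus: 3
        target_label: __tmp_hash
        action: hashmod
      # Keep only the targets belonging to this instance's shard.
      - source_labels: [__tmp_hash]
        regex: "0"
        action: keep
```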

Federation aggregates data from leaf Prometheus servers but suffers from single‑point failures and data duplication.

Thanos provides a global query layer that federates multiple Prometheus instances, de‑duplicates data, and offers long‑term storage.

Thanos components include Sidecar (exposes local data), Querier (aggregates queries), Store Gateway, and Ruler (alert rule evaluation).
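A Querier deployment that discovers the Sidecars through the <code>thanos-store-api: "true"</code> label used below might be sketched as follows (the DNS service name and replica label are assumptions to adapt to your cluster):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: thanos-query
spec:
  replicas: 2
  selector:
    matchLabels:
      app: thanos-query
  template:
    metadata:
      labels:
        app: thanos-query
    spec:
      containers:
      - name: thanos
        image: improbable/thanos:v0.6.0
        args:
        - query
        - --http-address=0.0.0.0:10902
        # Discover Sidecar gRPC endpoints via DNS SRV lookups against
        # a headless service selecting thanos-store-api=true pods.
        - --store=dnssrv+_grpc._tcp.thanos-store-api.default.svc.cluster.local
        # De-duplicate series from HA Prometheus replicas.
        - --query.replica-label=prometheus_replica
        ports:
        - name: http
          containerPort: 10902
        - name: grpc
          containerPort: 10901
```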

Pushgateway caches metrics from short‑lived batch jobs, allowing Prometheus to scrape them later; however, it does not expire metrics automatically, which can lead to stale data.
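A batch job typically pushes its result with a plain HTTP request; because nothing expires on its own, cleaning up the push group is the caller's responsibility (host and job names below are placeholders):

```shell
# Push a metric for a finished batch job.
echo "backup_last_success_timestamp_seconds $(date +%s)" \
  | curl --data-binary @- http://pushgateway:9091/metrics/job/nightly_backup

# Pushgateway never expires this group on its own; delete it explicitly
# once the data is no longer wanted, or it will be scraped forever.
curl -X DELETE http://pushgateway:9091/metrics/job/nightly_backup
```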

Alertmanager receives alerts from Prometheus, deduplicates, groups, silences, and forwards them to notification channels such as email, WeChat, or DingTalk.
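The two halves of that pipeline are a Prometheus alerting rule and an Alertmanager route; a minimal sketch, with thresholds, receiver names, and the webhook URL as illustrative placeholders:

```yaml
# rules.yml -- evaluated by Prometheus
groups:
  - name: availability
    rules:
      - alert: HighErrorRate
        expr: rate(http_requests{code=~"5.."}[5m]) > 0.05
        for: 10m
        labels:
          severity: critical
---
# alertmanager.yml -- grouping and notification
route:
  group_by: [alertname, service]
  group_wait: 30s
  repeat_interval: 4h
  receiver: ops-webhook
receivers:
  - name: ops-webhook
    webhook_configs:
      - url: "http://dingtalk-bridge.example:8060/webhook"
```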

Below is a simplified Kubernetes deployment of Prometheus, Thanos Sidecar, Watch container, and related components:

<code>apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: prometheus
  labels:
    app: prometheus
spec:
  serviceName: "prometheus"
  replicas: 3
  selector:
    matchLabels:
      app: prometheus
  template:
    metadata:
      labels:
        app: prometheus
        thanos-store-api: "true"
    spec:
      serviceAccountName: prometheus
      containers:
      - name: prometheus
        image: prom/prometheus:v2.11.1
        args:
        - --config.file=/etc/prometheus-shared/prometheus.yml
        - --web.enable-lifecycle
        - --storage.tsdb.path=/data/prometheus
        - --storage.tsdb.retention=2w
        - --storage.tsdb.min-block-duration=2h
        - --storage.tsdb.max-block-duration=2h
        - --web.enable-admin-api
        ports:
        - name: http
          containerPort: 9090
        volumeMounts:
        - name: prometheus-config-shared
          mountPath: /etc/prometheus-shared
        - name: prometheus-data
          mountPath: /data/prometheus
        livenessProbe:
          httpGet:
            path: /-/healthy
            port: http
      - name: watch
        image: watch
        args: ["-v", "-t", "-p=/etc/prometheus-shared", "curl", "-X", "POST", "--fail", "-o", "-", "-sS", "http://localhost:9090/-/reload"]
        volumeMounts:
        - name: prometheus-config-shared
          mountPath: /etc/prometheus-shared
      - name: thanos
        image: improbable/thanos:v0.6.0
        command:
        - /bin/sh
        - -c
        - |
          export PROM_ID=$(echo "$POD_NAME" | rev | cut -d '-' -f1)
          exec /bin/thanos sidecar \
            --prometheus.url=http://localhost:9090 \
            --reloader.config-file=/etc/prometheus/prometheus.yml.tmpl \
            --reloader.config-envsubst-file=/etc/prometheus-shared/prometheus.yml
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        ports:
        - name: http-sidecar
          containerPort: 10902
        - name: grpc
          containerPort: 10901
        volumeMounts:
        - name: prometheus-config
          mountPath: /etc/prometheus
        - name: prometheus-config-shared
          mountPath: /etc/prometheus-shared</code>

RBAC configuration enables Prometheus to access Kubernetes resources:

<code>apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: prometheus
rules:
- apiGroups: [""]
  resources: ["services","pods","nodes","nodes/proxy","endpoints"]
  verbs: ["get","list","watch"]
- apiGroups: [""]
  resources: ["configmaps"]
  verbs: ["create","get","update","delete"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: prometheus
subjects:
- kind: ServiceAccount
  name: prometheus
  namespace: default
roleRef:
  kind: ClusterRole
  name: prometheus
  apiGroup: rbac.authorization.k8s.io</code>

Ingress resources expose Pushgateway, Prometheus, Thanos Query, Alertmanager, and Grafana:

<code>apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: prometheus-ingress
  annotations:
    kubernetes.io/ingress.class: "nginx"
spec:
  rules:
  - host: $(DOMAIN)
    http:
      paths:
      - backend:
          serviceName: thanos-query
          servicePort: 10902  # Querier HTTP port (10901 is gRPC)
        path: "/"
      - backend:
          serviceName: alertmanager
          servicePort: 9093
        path: "/alertmanager"
      - backend:
          serviceName: grafana
          servicePort: 3000
        path: "/grafana"</code>
Written by

Efficient Ops

This public account is maintained by Xiaotianguo and friends, regularly publishing widely-read original technical articles. We focus on operations transformation and accompany you throughout your operations career, growing together happily.
