
How to Build a Scalable Prometheus Monitoring System with Thanos on Kubernetes

This article explains why monitoring is essential for production stability, compares white‑box and black‑box approaches, details the advantages of Prometheus, walks through its architecture, metric types, query language, and high‑availability strategies with Thanos, and provides practical Kubernetes deployment manifests and configuration tips.


Monitoring is a fundamental part of infrastructure that ensures the stability of production services; it helps detect, locate, and resolve issues quickly.

White‑box monitoring observes internal metrics such as request count, success rate, and latency, while black‑box monitoring uses probes to verify external availability, complementing white‑box data.
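Black‑box probing is commonly implemented with the Blackbox Exporter. The sketch below is a hedged illustration: the probe module, target URL, and exporter address are placeholders, not values from this deployment.

```yaml
# Hypothetical scrape job that probes an HTTP endpoint via the Blackbox Exporter.
scrape_configs:
  - job_name: "blackbox-http"
    metrics_path: /probe
    params:
      module: [http_2xx]                 # probe module defined in blackbox.yml (assumed)
    static_configs:
      - targets: ["https://example.com/health"]   # endpoint to probe (placeholder)
    relabel_configs:
      # Rewrite the scrape so Prometheus queries the exporter, passing the
      # real target as the ?target= parameter.
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: blackbox-exporter:9115       # exporter address (assumed)
```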

Prometheus was chosen for its flexible PromQL query language, single‑binary deployment, Go‑based integration, built‑in Web UI, and rich ecosystem (Alertmanager, Pushgateway, Exporters).

Prometheus scrapes targets defined in <code>scrape_configs</code>, stores samples in a local TSDB, and evaluates alert rules via PromQL. Metric names must use ASCII characters and follow naming conventions (e.g., <code>process_cpu_seconds_total</code>, <code>http_request_duration_seconds</code>).
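A minimal configuration sketch, assuming placeholder job names and target addresses:

```yaml
# Minimal Prometheus configuration; job name and targets are placeholders.
global:
  scrape_interval: 15s        # how often targets are scraped
  evaluation_interval: 15s    # how often alert rules are evaluated
scrape_configs:
  - job_name: "web"
    static_configs:
      - targets: ["10.0.0.1:8080", "10.0.0.2:8080"]
rule_files:
  - /etc/prometheus/rules/*.yml
```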

Metrics are stored as time series of <code>(timestamp, value)</code> pairs. A single‑dimensional series forms a vector; adding labels creates multi‑dimensional series. Queries return instant vectors or range vectors, which can be aggregated with functions like <code>rate()</code>, <code>increase()</code>, or <code>histogram_quantile()</code>.

<code>http_requests{host="host1",service="web",code="200",env="test"}</code>
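A few hedged PromQL sketches over a hypothetical <code>http_requests_total</code> counter and <code>http_request_duration_seconds</code> histogram (metric names are illustrative, not from this deployment):

```promql
# Per-second request rate over the last 5 minutes
# (rate() turns a range vector into an instant vector)
rate(http_requests_total[5m])

# Total requests added over the last hour
increase(http_requests_total[1h])

# 95th-percentile latency estimated from histogram buckets
histogram_quantile(0.95, sum by (le) (rate(http_request_duration_seconds_bucket[5m])))
```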

High availability can be achieved with Prometheus federation, but federation scales poorly and does not provide a single global query view or long‑term storage. Thanos provides a global query layer that aggregates data from multiple Prometheus instances via sidecars and ships TSDB blocks to object storage for durable retention.

Thanos Querier receives requests, forwards them to sidecars, aggregates results, and executes PromQL queries, supporting deduplication and high‑availability.
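Deduplication relies on each HA Prometheus replica carrying a distinct replica label in its <code>external_labels</code>, matching the <code>--query.replica-label</code> flag passed to the Querier. A hedged sketch (label values are placeholders):

```yaml
# Each HA Prometheus replica sets a distinct "replica" external label;
# a Thanos Querier started with --query.replica-label=replica then merges
# the replicas' series into one deduplicated result.
global:
  external_labels:
    cluster: prod          # assumed cluster name
    replica: prometheus-0  # unique per replica, e.g. derived from the pod name
```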

Prometheus supports remote read/write to external storage (e.g., M3DB, InfluxDB, OpenTSDB) for durable, scalable retention.
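A hedged configuration sketch; the URLs below are placeholders for a remote storage system's Prometheus‑compatible endpoints (e.g., an M3DB or InfluxDB gateway), not real addresses:

```yaml
# Forward samples to external storage and read historical data back.
remote_write:
  - url: "http://remote-storage.example.com/api/v1/prom/write"
remote_read:
  - url: "http://remote-storage.example.com/api/v1/prom/read"
    read_recent: false   # serve recent data from the local TSDB instead
```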

Service discovery (Kubernetes, Consul, file‑based) replaces static target lists, allowing dynamic scaling of monitored instances.
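A sketch of Kubernetes pod discovery. The <code>prometheus.io/scrape</code> annotation convention is a widely used pattern, not a Prometheus default:

```yaml
# Discover pods dynamically; keep only those annotated prometheus.io/scrape=true.
scrape_configs:
  - job_name: "kubernetes-pods"
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: "true"
```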

Pushgateway collects metrics from short‑lived jobs and caches them for Prometheus to scrape, but it does not expire metrics automatically and can cause duplication in load‑balanced setups.
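A hedged sketch of pushing a metric from a batch job; the service hostname and metric name are placeholders. Because Pushgateway never expires metrics, stale groups must be deleted explicitly:

```shell
# Push one gauge sample under job "backup"; Pushgateway caches it until deleted.
echo "backup_last_success_timestamp_seconds $(date +%s)" \
  | curl --data-binary @- http://pushgateway.default.svc:9091/metrics/job/backup

# Remove the cached group once it is no longer needed.
curl -X DELETE http://pushgateway.default.svc:9091/metrics/job/backup
```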

Alertmanager receives alerts from Prometheus, deduplicates, groups, silences, and forwards them to notification channels (e.g., email, WeChat, DingTalk).
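A hedged example of a Prometheus alerting rule that would feed Alertmanager; the metric, threshold, and labels are placeholders:

```yaml
# Fire when the 5xx error ratio stays above 5% for 10 minutes.
groups:
  - name: availability
    rules:
      - alert: HighErrorRate
        expr: sum(rate(http_requests_total{code=~"5.."}[5m]))
              / sum(rate(http_requests_total[5m])) > 0.05
        for: 10m
        labels:
          severity: critical
        annotations:
          summary: "5xx error ratio above 5% for 10 minutes"
```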

Deploying Prometheus on Kubernetes involves a <code>StatefulSet</code> with containers for Prometheus, a watch sidecar that reloads configuration on changes, and a Thanos sidecar. Example manifest:

<code>apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: prometheus
  labels:
    app: prometheus
spec:
  serviceName: "prometheus"
  updateStrategy:
    type: RollingUpdate
  replicas: 3
  selector:
    matchLabels:
      app: prometheus
  template:
    metadata:
      labels:
        app: prometheus
        thanos-store-api: "true"
    spec:
      serviceAccountName: prometheus
      volumes:
      - name: prometheus-config
        configMap:
          name: prometheus-config
      - name: prometheus-data
        hostPath:
          path: /data/prometheus
      - name: prometheus-config-shared
        emptyDir: {}
      containers:
      - name: prometheus
        image: prom/prometheus:v2.11.1
        args:
          - --config.file=/etc/prometheus-shared/prometheus.yml
          - --web.enable-lifecycle
          - --storage.tsdb.path=/data/prometheus
          - --storage.tsdb.retention=2w
          - --storage.tsdb.min-block-duration=2h
          - --storage.tsdb.max-block-duration=2h
          - --web.enable-admin-api
        ports:
          - name: http
            containerPort: 9090
        volumeMounts:
          - name: prometheus-config-shared
            mountPath: /etc/prometheus-shared
          - name: prometheus-data
            mountPath: /data/prometheus
        livenessProbe:
          httpGet:
            path: /-/healthy
            port: http
      - name: watch
        image: watch
        args: ["-v", "-t", "-p=/etc/prometheus-shared", "curl", "-X", "POST", "--fail", "-o", "-", "-sS", "http://localhost:9090/-/reload"]
        volumeMounts:
        - name: prometheus-config-shared
          mountPath: /etc/prometheus-shared
      - name: thanos
        image: improbable/thanos:v0.6.0
        command: ["/bin/sh", "-c"]
        args:
          - |
            # Derive a replica ID from the pod ordinal for envsubst in the config template.
            export PROM_ID=$(echo "$POD_NAME" | rev | cut -d '-' -f1)
            exec /bin/thanos sidecar \
              --prometheus.url=http://localhost:9090 \
              --reloader.config-file=/etc/prometheus/prometheus.yml.tmpl \
              --reloader.config-envsubst-file=/etc/prometheus-shared/prometheus.yml
        env:
          - name: POD_NAME
            valueFrom:
              fieldRef:
                fieldPath: metadata.name
        ports:
          - name: http-sidecar
            containerPort: 10902
          - name: grpc
            containerPort: 10901
        volumeMounts:
          - name: prometheus-config
            mountPath: /etc/prometheus
          - name: prometheus-config-shared
            mountPath: /etc/prometheus-shared</code>

RBAC is required for Prometheus to access Kubernetes resources:

<code>apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: prometheus
rules:
- apiGroups: [""]
  resources: ["services", "pods", "nodes", "nodes/proxy", "endpoints"]
  verbs: ["get", "list", "watch"]
- apiGroups: [""]
  resources: ["configmaps"]
  verbs: ["create"]
- apiGroups: [""]
  resources: ["configmaps"]
  resourceNames: ["prometheus-config"]
  verbs: ["get", "update", "delete"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: prometheus
subjects:
- kind: ServiceAccount
  name: prometheus
  namespace: default
roleRef:
  kind: ClusterRole
  name: prometheus
  apiGroup: rbac.authorization.k8s.io</code>

Thanos Querier deployment example:

<code>apiVersion: apps/v1
kind: Deployment
metadata:
  name: thanos-query
  labels:
    app: thanos-query
spec:
  replicas: 2
  selector:
    matchLabels:
      app: thanos-query
  template:
    metadata:
      labels:
        app: thanos-query
    spec:
      containers:
      - name: thanos-query
        image: improbable/thanos:v0.6.0
        args:
          - query
          - --log.level=debug
          - --query.timeout=2m
          - --query.max-concurrent=20
          - --query.replica-label=replica
          - --query.auto-downsampling
          - --store=dnssrv+thanos-store-gateway.default.svc
          - --store.sd-dns-interval=30s
        ports:
          - name: http
            containerPort: 10902
          - name: grpc
            containerPort: 10901
        livenessProbe:
          httpGet:
            path: /-/healthy
            port: http</code>

Pushgateway deployment:

<code>apiVersion: apps/v1
kind: Deployment
metadata:
  name: pushgateway
  labels:
    app: pushgateway
spec:
  replicas: 15
  selector:
    matchLabels:
      app: pushgateway
  template:
    metadata:
      labels:
        app: pushgateway
    spec:
      containers:
      - name: pushgateway
        image: prom/pushgateway:v1.0.0
        ports:
          - name: http
            containerPort: 9091
        resources:
          limits:
            memory: 1Gi
          requests:
            memory: 512Mi</code>

Alertmanager deployment:

<code>apiVersion: apps/v1
kind: Deployment
metadata:
  name: alertmanager
spec:
  replicas: 3
  selector:
    matchLabels:
      app: alertmanager
  template:
    metadata:
      labels:
        app: alertmanager
    spec:
      containers:
      - name: alertmanager
        image: prom/alertmanager:latest
        args:
          - --web.route-prefix=/alertmanager
          - --config.file=/etc/alertmanager/config.yml
          - --storage.path=/alertmanager
          - --cluster.listen-address=0.0.0.0:8001
          - --cluster.peer=alertmanager-peers.default:8001
        ports:
          - name: alertmanager
            containerPort: 9093
        volumeMounts:
          - name: alertmanager-config
            mountPath: /etc/alertmanager
          - name: alertmanager
            mountPath: /alertmanager
      volumes:
      - name: alertmanager-config
        configMap:
          name: alertmanager-config
      - name: alertmanager
        emptyDir: {}</code>

Ingress resources can expose Pushgateway, Prometheus, Thanos Query, Alertmanager, and Grafana externally via an Nginx ingress controller.
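A hedged Ingress sketch, assuming a recent Kubernetes with the <code>networking.k8s.io/v1</code> API and an Nginx ingress class; the hostnames are placeholders. Note the Alertmanager path matches the <code>--web.route-prefix=/alertmanager</code> flag set in its deployment:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: monitoring
spec:
  ingressClassName: nginx
  rules:
    - host: thanos.example.com          # placeholder hostname
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: thanos-query
                port:
                  number: 10902
    - host: alertmanager.example.com    # placeholder hostname
      http:
        paths:
          - path: /alertmanager         # matches --web.route-prefix
            pathType: Prefix
            backend:
              service:
                name: alertmanager
                port:
                  number: 9093
```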

Accessing the Prometheus UI (Status → Targets) confirms that the monitored nodes are up and healthy.

Tags: monitoring, Observability, kubernetes, DevOps, Prometheus, Thanos
Written by

Efficient Ops

This public account is maintained by Xiaotianguo and friends and regularly publishes original technical articles. We focus on the evolution of operations practice and aim to accompany you throughout your operations career.
