
How to Build a Scalable Prometheus Monitoring System with Thanos on Kubernetes

This article explains why monitoring is essential for production stability, compares white‑box and black‑box approaches, and provides a step‑by‑step guide to deploying Prometheus, configuring scrape targets, using Pushgateway and Alertmanager, and scaling the solution with Thanos in a Kubernetes environment.

Efficient Ops

Monitoring is a fundamental part of infrastructure that ensures the stability of production services; it helps detect, locate, and resolve issues through alerts and, in some cases, automated self‑healing.

White‑box vs. Black‑box Monitoring

White‑box monitoring observes internal metrics such as request count, success/failure rates, and average latency, while black‑box monitoring uses probes to verify external availability, catching problems like DNS failures that white‑box metrics miss.

Why Choose Prometheus

Prometheus was selected for its flexible PromQL query language, its deployment as a single Go binary, its built‑in Web UI, and its rich ecosystem (Alertmanager, Pushgateway, Exporters).

Prometheus Architecture

Prometheus discovers targets via configuration, scrapes metrics from HTTP endpoints, stores them in a local TSDB, evaluates alert rules with PromQL, and sends alerts to Alertmanager, which can forward them to email or chat.

Metric Naming Conventions

Use base units (e.g., <code>seconds</code>, not <code>milliseconds</code>).

Prefix metric names with the application namespace, e.g., <code>process_cpu_seconds_total</code>, <code>http_request_duration_seconds</code>.

Use suffixes to describe units, e.g., <code>node_memory_usage_bytes</code>, <code>foobar_build_info</code>.
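The conventions above can be checked mechanically. The following is a minimal sketch of such a check, not an official Prometheus tool: the function name `lint_metric_name` and the small `NON_BASE_UNITS` table are hypothetical, and only the character rule for metric names comes from the Prometheus data model.

```python
import re
from typing import List

# Character rule for metric names from the Prometheus data model.
METRIC_NAME_RE = re.compile(r"^[a-zA-Z_:][a-zA-Z0-9_:]*$")

# Hypothetical, incomplete table: non-base unit -> base unit to prefer.
NON_BASE_UNITS = {
    "milliseconds": "seconds",
    "microseconds": "seconds",
    "minutes": "seconds",
    "kilobytes": "bytes",
    "megabytes": "bytes",
}


def lint_metric_name(name: str) -> List[str]:
    """Return a list of convention problems found in a metric name."""
    problems = []
    if not METRIC_NAME_RE.match(name):
        problems.append("invalid characters in metric name")
    parts = name.split("_")
    # A single word has no namespace prefix such as "http_" or "process_".
    if len(parts) < 2:
        problems.append("missing application namespace prefix")
    for part in parts:
        if part in NON_BASE_UNITS:
            problems.append(
                f"use base unit {NON_BASE_UNITS[part]!r} instead of {part!r}"
            )
    return problems
```

For example, `lint_metric_name("http_request_duration_milliseconds")` reports that `seconds` should be used, while `http_request_duration_seconds` passes cleanly.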

Metric Types

Counter: monotonically increasing (e.g., request count).

Gauge: can go up or down (e.g., CPU usage).

Histogram and Summary: capture distributions for latency or size.
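To make the difference between the types concrete, here is a small self-contained sketch of their semantics. These classes are illustrative only and are not the API of any official client library; the class and method names are assumptions chosen for readability.

```python
class Counter:
    """Only ever increases; a restart resets it to zero."""

    def __init__(self):
        self.value = 0.0

    def inc(self, amount=1.0):
        if amount < 0:
            raise ValueError("counters cannot decrease")
        self.value += amount


class Gauge:
    """A value that can be set, raised, or lowered at any time."""

    def __init__(self):
        self.value = 0.0

    def set(self, value):
        self.value = value

    def inc(self, amount=1.0):
        self.value += amount

    def dec(self, amount=1.0):
        self.value -= amount


class Histogram:
    """Counts observations into cumulative buckets, plus a sum and a count."""

    def __init__(self, upper_bounds):
        # The +Inf bucket always exists and ends up equal to the total count.
        self.upper_bounds = sorted(upper_bounds) + [float("inf")]
        self.bucket_counts = [0] * len(self.upper_bounds)
        self.sum = 0.0
        self.count = 0

    def observe(self, value):
        self.sum += value
        self.count += 1
        # Buckets are cumulative: an observation lands in every bucket
        # whose upper bound is greater than or equal to the value.
        for i, upper in enumerate(self.upper_bounds):
            if value <= upper:
                self.bucket_counts[i] += 1
```

Observing 0.05, 0.3, and 2.0 in a histogram with bounds 0.1, 0.5, and 1.0 yields cumulative bucket counts of 1, 2, 2, and 3.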

Time‑Series Basics

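A time series is a sequence of (timestamp, value) samples that share a metric name and a label set. Labels (e.g., <code>host="host1"</code>) add dimensions: each distinct combination of label values identifies a separate series. An instant vector contains one sample per series at a single evaluation time, while a range vector contains, for each series, all samples within a time window.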
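The data model can be sketched in a few lines. This is a simplified illustration rather than how the Prometheus TSDB is implemented: the helper names are hypothetical, samples are assumed to be sorted by timestamp, and the 300-second lookback mirrors the default value of Prometheus's query lookback setting.

```python
from typing import Dict, FrozenSet, List, Tuple

Sample = Tuple[float, float]          # (timestamp in seconds, value)
Labels = FrozenSet[Tuple[str, str]]   # a label set identifies one series


def labels(**pairs: str) -> Labels:
    """Build a hashable label set, e.g. labels(host="host1")."""
    return frozenset(pairs.items())


def instant_vector(series: Dict[Labels, List[Sample]], at: float,
                   lookback: float = 300.0) -> Dict[Labels, Sample]:
    """One sample per series: the latest one at or before `at`."""
    result = {}
    for key, samples in series.items():
        candidates = [s for s in samples if at - lookback <= s[0] <= at]
        if candidates:
            result[key] = candidates[-1]
    return result


def range_vector(series: Dict[Labels, List[Sample]], at: float,
                 window: float) -> Dict[Labels, List[Sample]]:
    """All samples per series that fall inside the window ending at `at`."""
    result = {}
    for key, samples in series.items():
        in_window = [s for s in samples if at - window <= s[0] <= at]
        if in_window:
            result[key] = in_window
    return result
```

An instant vector therefore maps each label set to a single sample, whereas a range vector maps each label set to a list of samples, which is what functions such as `rate` consume.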

PromQL Examples

<code>http_requests{host="host1",service="web",code="200",env="test"}</code>

Instant vector result (without the host matcher, one sample per matching series):

<code>http_requests{host="host1",service="web",code="200",env="test"} 10</code>
<code>http_requests{host="host2",service="web",code="200",env="test"} 0</code>
<code>http_requests{host="host3",service="web",code="200",env="test"} 12</code>

Range vector query:

<code>http_requests{host="host1",service="web",code="200",env="test"}[5m]</code>

Calculate rate:

<code>rate(http_requests{host="host1",service="web",code="200",env="test"}[5m])</code>

Calculate increase:

<code>increase(http_requests{host="host1",service="web",code="200",env="test"}[5m])</code>
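The idea behind these two functions can be sketched as follows. This is a deliberately simplified model and not Prometheus's implementation: it handles counter resets but omits the extrapolation to the window boundaries that Prometheus applies, and it divides by the time actually covered by the samples rather than by the window length.

```python
from typing import List, Optional, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, counter value)


def increase(samples: List[Sample]) -> float:
    """Total growth of a counter over the samples of a range vector."""
    total = 0.0
    for (_, previous), (_, current) in zip(samples, samples[1:]):
        if current >= previous:
            total += current - previous
        else:
            # The counter went down, so the process restarted and the
            # counter began again from zero.
            total += current
    return total


def rate(samples: List[Sample]) -> Optional[float]:
    """Average per-second growth; needs at least two samples."""
    if len(samples) < 2:
        return None
    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        return None
    return increase(samples) / elapsed
```

With samples `(0, 0)`, `(100, 50)`, `(200, 20)`, `(300, 50)` the counter reset between the second and third sample is absorbed, giving an increase of 100 and a rate of roughly 0.33 per second.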

90th percentile of a histogram:

<code>histogram_quantile(0.9, rate(employee_age_bucket_bucket[10m]))</code>
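A quantile is estimated from cumulative bucket counts by finding the bucket that contains the target rank and interpolating linearly inside it. The sketch below follows that idea for classic histograms; it is an approximation of the behavior for illustration, the input format (a list of `(upper_bound, cumulative_count)` pairs) is an assumption, and edge cases that Prometheus handles, such as negative bucket bounds, are left out.

```python
import math
from typing import List, Tuple


def histogram_quantile(q: float, buckets: List[Tuple[float, float]]) -> float:
    """Estimate the q-quantile (0 < q <= 1) from cumulative buckets.

    `buckets` holds (upper_bound, cumulative_count) pairs and must
    include the +Inf bucket.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    buckets = sorted(buckets)
    if len(buckets) < 2 or not math.isinf(buckets[-1][0]):
        return math.nan
    total = buckets[-1][1]
    if total == 0:
        return math.nan

    rank = q * total
    # Index of the first bucket whose cumulative count reaches the rank.
    index = next(i for i, (_, count) in enumerate(buckets) if count >= rank)

    if index == len(buckets) - 1:
        # The rank falls into +Inf: report the highest finite bound.
        return buckets[-2][0]

    upper, count = buckets[index]
    lower = buckets[index - 1][0] if index > 0 else 0.0
    previous_count = buckets[index - 1][1] if index > 0 else 0.0
    # Assume observations are spread evenly inside the bucket.
    fraction = (rank - previous_count) / (count - previous_count)
    return lower + (upper - lower) * fraction
```

Because of the even-spread assumption, the accuracy of the estimate depends on how closely the bucket boundaries match the real distribution.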

Cardinality and Storage

Recent samples are held in memory and written to disk as a block every two hours. Memory usage grows with the number of active series, and that number can grow as the product of the number of distinct values of each label, so avoid high‑cardinality labels such as user IP or request ID. Adjust <code>storage.tsdb.min-block-duration</code> and <code>scrape_interval</code> to control memory pressure.
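A quick calculation shows why a single unbounded label is dangerous. The helper below is a hypothetical back-of-the-envelope estimate of the worst case, in which every combination of label values actually occurs; the label names and counts in the usage example are made up.

```python
from typing import Dict


def max_series(label_cardinalities: Dict[str, int]) -> int:
    """Upper bound on series for one metric: product of label value counts."""
    total = 1
    for distinct_values in label_cardinalities.values():
        total *= distinct_values
    return total


# Hypothetical metric with three bounded labels.
bounded = {"host": 100, "code": 5, "method": 4}

# The same metric after adding a label with 10,000 distinct client IPs.
unbounded = dict(bounded, client_ip=10_000)
```

Here `max_series(bounded)` is 2,000, while `max_series(unbounded)` is 20,000,000 for the same metric.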

Service Discovery and Scrape Configs

Static configs work for a few targets, but dynamic environments benefit from service discovery (Kubernetes, Consul, file‑based). Prometheus can watch files for changes and update targets automatically.
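For file‑based discovery, any process that can write a JSON file can act as a discovery source. The sketch below writes a target file in the list-of-target-groups format that file‑based discovery reads; the function name, file path, and target addresses are hypothetical. The file is written to a temporary name and then renamed so that a watcher does not read a half-written file.

```python
import json
import os
import tempfile
from typing import Dict, List


def write_file_sd(path: str, target_groups: List[Dict]) -> None:
    """Atomically write target groups for file-based service discovery.

    Each group looks like:
        {"targets": ["host:port", ...], "labels": {"name": "value", ...}}
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The ".tmp" suffix keeps the partial file from matching a "*.json" glob.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        json.dump(target_groups, handle, indent=2)
    # Renaming within one directory replaces the old file in one step.
    os.replace(tmp_path, path)
```

A call such as `write_file_sd("targets.json", [{"targets": ["10.0.0.1:9100"], "labels": {"env": "test"}}])` produces a file that a matching file‑based discovery entry in the scrape configuration would pick up on its next refresh or file change.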

Pushgateway

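For short‑lived batch jobs that may exit before they are scraped, the job pushes its metrics to Pushgateway and Prometheus scrapes them from there. Pushgateway never expires pushed metrics, so stale values keep being scraped until they are deleted explicitly. If multiple Pushgateways sit behind a load balancer, pushes for the same job can land on different instances and show up as duplicate series, so labels need careful management; setting <code>honor_labels: true</code> in the scrape config keeps the job and instance labels pushed by the job instead of overwriting them with those of the Pushgateway target.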
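A push is an HTTP request whose URL path carries the grouping labels and whose body carries the metrics in the text exposition format. The sketch below only builds such a request with the standard library and does not send it; the gateway address, job name, and metric are hypothetical, and label values are assumed not to contain slashes.

```python
import urllib.request
from typing import Dict
from urllib.parse import quote


def build_push_request(gateway: str, job: str, grouping: Dict[str, str],
                       metrics: Dict[str, float]) -> urllib.request.Request:
    """Build a PUT request that replaces all metrics of one group."""
    path = "/metrics/job/" + quote(job, safe="")
    # Additional grouping labels are appended as /<name>/<value> pairs.
    for name, value in sorted(grouping.items()):
        path += "/" + quote(name, safe="") + "/" + quote(value, safe="")
    # One "name value" line per metric; the body must end with a newline.
    body = "".join(f"{name} {value}\n" for name, value in metrics.items())
    return urllib.request.Request(
        gateway.rstrip("/") + path,
        data=body.encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "text/plain; version=0.0.4"},
    )


# Hypothetical batch job reporting how long it ran.
request = build_push_request(
    "http://pushgateway:9091",
    job="nightly_backup",
    grouping={"instance": "db1"},
    metrics={"backup_duration_seconds": 42.5},
)
```

Passing `request` to `urllib.request.urlopen` would perform the push. Because nothing expires on its own, a job that should disappear from the results needs a matching DELETE request against the same path.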

Alertmanager

Alertmanager receives alerts from Prometheus, deduplicates, groups, silences, and forwards them to notification channels such as email, WeChat, or DingTalk.
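Deduplication and grouping both rely on alert labels: alerts with an identical label set are treated as the same alert, and alerts that share the values of the configured grouping labels are bundled into one notification. The following sketch models just that logic; the function name and the alert dictionaries are hypothetical, and silencing, inhibition, and timing are not covered.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def group_alerts(alerts: List[Dict], group_by: List[str]) -> Dict[Tuple, List[Dict]]:
    """Deduplicate alerts by label set, then bundle them by `group_by` labels."""
    groups = defaultdict(dict)
    for alert in alerts:
        alert_labels = alert["labels"]
        group_key = tuple((name, alert_labels.get(name, "")) for name in group_by)
        # Identical label sets collapse into a single entry, which is how
        # the same alert sent by two Prometheus replicas is deduplicated.
        fingerprint = frozenset(alert_labels.items())
        groups[group_key][fingerprint] = alert
    return {key: list(members.values()) for key, members in groups.items()}


# Two replicas report the same alert for host "a"; host "b" fires as well.
incoming = [
    {"labels": {"alertname": "HighLatency", "instance": "a"}},
    {"labels": {"alertname": "HighLatency", "instance": "a"}},
    {"labels": {"alertname": "HighLatency", "instance": "b"}},
]
grouped = group_alerts(incoming, group_by=["alertname"])
```

In this example `grouped` contains a single group for `HighLatency` with two distinct alerts, so one notification would cover both instances.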

Scaling with Thanos

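Thanos adds a global query view, high availability, and long‑term storage to Prometheus. The Querier fans each query out to multiple Sidecars, merges the results into a federated view across Prometheus instances, and de‑duplicates series that come from replicated instances. As an alternative for long‑term storage, Prometheus remote write can send data to external stores like M3DB, InfluxDB, or OpenTSDB.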
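The de‑duplication step can be pictured as follows: series that differ only in the replica label are treated as one series, and gaps in one replica are filled from another. This sketch is a naive illustration of the concept and not Thanos's actual algorithm, which is more careful about which replica it reads from at any time; the function name and sample data are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

Sample = Tuple[float, float]  # (timestamp, value)


def deduplicate(series: List[Tuple[Dict[str, str], List[Sample]]],
                replica_label: str = "replica") -> Dict[FrozenSet, List[Sample]]:
    """Merge series that are identical apart from the replica label."""
    merged: Dict[FrozenSet, Dict[float, float]] = {}
    for series_labels, samples in series:
        # Dropping the replica label makes the replicas share one key.
        key = frozenset(
            (name, value) for name, value in series_labels.items()
            if name != replica_label
        )
        by_timestamp = merged.setdefault(key, {})
        for timestamp, value in samples:
            # Keep the first value seen for a timestamp; later replicas
            # only fill in timestamps that are still missing.
            by_timestamp.setdefault(timestamp, value)
    return {key: sorted(points.items()) for key, points in merged.items()}


# Replica "0" missed the scrape at t=30; replica "1" has it.
replicated = [
    ({"__name__": "up", "job": "node", "replica": "0"},
     [(0, 1), (15, 1), (45, 1)]),
    ({"__name__": "up", "job": "node", "replica": "1"},
     [(0, 1), (15, 1), (30, 1), (45, 1)]),
]
deduplicated = deduplicate(replicated)
```

The result is a single `up{job="node"}` series with samples at 0, 15, 30, and 45, which is the effect the replica label setting on the Querier is meant to achieve.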

Deploying Prometheus on Kubernetes

<code>apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: prometheus
spec:
  serviceName: "prometheus"
  replicas: 3
  selector:
    matchLabels:
      app: prometheus
  template:
    metadata:
      labels:
        app: prometheus
        thanos-store-api: "true"
    spec:
      serviceAccountName: prometheus
      containers:
      - name: prometheus
        image: prom/prometheus:v2.11.1
        args:
        - --config.file=/etc/prometheus-shared/prometheus.yml
        - --web.enable-lifecycle
        - --storage.tsdb.path=/data/prometheus
        - --storage.tsdb.retention.time=2w
        - --storage.tsdb.min-block-duration=2h
        - --storage.tsdb.max-block-duration=2h
        - --web.enable-admin-api
        ports:
        - name: http
          containerPort: 9090
        volumeMounts:
        - name: prometheus-config-shared
          mountPath: /etc/prometheus-shared
        - name: prometheus-data
          mountPath: /data/prometheus
        livenessProbe:
          httpGet:
            path: /-/healthy
            port: http
      - name: watch
        image: watch
        args: ["-v", "-t", "-p=/etc/prometheus-shared", "curl", "-X", "POST", "--fail", "-o", "-", "-sS", "http://localhost:9090/-/reload"]
        volumeMounts:
        - name: prometheus-config-shared
          mountPath: /etc/prometheus-shared
      - name: thanos
        image: improbable/thanos:v0.6.0
        command: ["/bin/sh", "-c"]
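        args:
        # A literal block scalar keeps the line breaks, so the trailing
        # backslashes reach the shell as line continuations.
        # PROM_ID is the pod ordinal (the last "-"-separated field of the pod name);
        # rev | cut does not reverse the digits back, so it is only correct for ordinals 0-9.
        - |
          PROM_ID=`echo $POD_NAME | rev | cut -d '-' -f1` /bin/thanos sidecar \
            --prometheus.url=http://localhost:9090 \
            --reloader.config-file=/etc/prometheus/prometheus.yml.tmpl \
            --reloader.config-envsubst-file=/etc/prometheus-shared/prometheus.yml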
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        ports:
        - name: http-sidecar
          containerPort: 10902
        - name: grpc
          containerPort: 10901
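        volumeMounts:
        - name: prometheus-config
          mountPath: /etc/prometheus
        - name: prometheus-config-shared
          mountPath: /etc/prometheus-shared
      volumes:
      # Assumption: the config template lives in a ConfigMap named
      # prometheus-config (the name referenced in the RBAC rules).
      - name: prometheus-config
        configMap:
          name: prometheus-config
      # Scratch volume where the sidecar writes the rendered prometheus.yml.
      - name: prometheus-config-shared
        emptyDir: {}
  volumeClaimTemplates:
  # Assumption: one PersistentVolumeClaim per replica; the size is a placeholder.
  - metadata:
      name: prometheus-data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 10Gi</code>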

RBAC is required for Prometheus to read Kubernetes resources:

<code>apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: prometheus
rules:
- apiGroups: [""]
  resources: ["services", "pods", "nodes", "nodes/proxy", "endpoints"]
  verbs: ["get", "list", "watch"]
- apiGroups: [""]
  resources: ["configmaps"]
  verbs: ["create"]
- apiGroups: [""]
  resources: ["configmaps"]
  resourceNames: ["prometheus-config"]
  verbs: ["get", "update", "delete"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: prometheus
subjects:
- kind: ServiceAccount
  name: prometheus
  namespace: default
roleRef:
  kind: ClusterRole
  name: prometheus
  apiGroup: rbac.authorization.k8s.io</code>

Deploying Thanos Components

<code>apiVersion: apps/v1
kind: Deployment
metadata:
  name: thanos-query
spec:
  replicas: 2
  selector:
    matchLabels:
      app: thanos-query
  template:
    metadata:
      labels:
        app: thanos-query
    spec:
      containers:
      - name: thanos-query
        image: improbable/thanos:v0.6.0
        args:
        - query
        - --log.level=debug
        - --query.timeout=2m
        - --query.max-concurrent=20
        - --query.replica-label=replica
        - --query.auto-downsampling
        - --store=dnssrv+thanos-store-gateway.default.svc
        - --store.sd-dns-interval=30s
        ports:
        - name: http
          containerPort: 10902
        - name: grpc
          containerPort: 10901
        livenessProbe:
          httpGet:
            path: /-/healthy
            port: http</code>

Similar Deployments exist for Thanos Store, Thanos Ruler, Pushgateway, and Alertmanager, each exposing the necessary ports and mounting configuration via ConfigMaps.

Finally, an Ingress routes traffic to Prometheus, Thanos Query, Alertmanager, and Grafana, completing the monitoring stack.

Written by

Efficient Ops

This public account is maintained by Xiaotianguo and friends, regularly publishing widely-read original technical articles. We focus on operations transformation and accompany you throughout your operations career, growing together happily.
