Why Prometheus’s TSDB Makes Monitoring Scalable: A Deep Dive
This article explains how Prometheus’s time‑series database handles massive monitoring data, illustrates practical query examples, and shows why its storage engine and pre‑computation features enable efficient, high‑performance observability for large‑scale services.
Background
Many beginners feel overwhelmed by Prometheus because it introduces numerous concepts: Instance, Job, Metric, Metric Name, Metric Label, Metric Value, Metric Type (Counter, Gauge, Histogram, Summary), DataType (Instant Vector, Range Vector, Scalar, String), Operator, and Function. Underneath, though, the model is simple: just as Alibaba is fundamentally a data company, Prometheus is fundamentally a data‑driven monitoring system.
Daily Monitoring
To monitor each API request of a web server (e.g., WebServerA), dimensions include service name (job), instance IP (instance), API name (handler), method, response code, and request count.
Example SQL‑like queries on the <code>http_requests_total</code> metric:

<code>SELECT * FROM http_requests_total WHERE code="200" AND method="put" AND created_at BETWEEN 1495435700 AND 1495435710;

SELECT * FROM http_requests_total WHERE handler="prometheus" AND method="post" AND created_at BETWEEN 1495435700 AND 1495435710;

SELECT * FROM http_requests_total WHERE handler="query%" AND instance="10.59.8.110" AND created_at BETWEEN 1495435700 AND 1495435710;</code>

When monitoring 100 services, each with 10 instances, 20 APIs, and 4 methods, collecting data every 30 seconds and retaining it for 60 days, the total reaches roughly 13.8 billion data points, which is impractical for a traditional relational database. Prometheus therefore uses a Time‑Series Database (TSDB) as its storage engine.
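The 13.8 billion figure follows directly from the numbers above; a quick back‑of‑the‑envelope check:

```python
# Cardinality math for the scenario in the text.
services = 100
instances_per_service = 10
apis = 20
methods = 4

# Every distinct label combination is its own time series.
series = services * instances_per_service * apis * methods  # 80,000 series

# One sample per series every 30 seconds, retained for 60 days.
samples_per_day = (24 * 60 * 60) // 30  # 2,880 samples/day
total_points = series * samples_per_day * 60

print(series)        # 80000
print(total_points)  # 13824000000, i.e. ~13.8 billion
```

At this scale the bottleneck is not any single query but the sheer write and storage volume, which is exactly what a TSDB is built for.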
Storage Engine
TSDB perfectly matches the characteristics of monitoring data:
Massive data volume
Predominantly write‑heavy workload
Writes are mostly sequential, ordered by timestamp
Rarely updates old data; writes occur shortly after collection
Deletion is performed in block ranges, not individual points
Data size typically exceeds memory, limiting cache effectiveness
Read operations are ordered (ascending or descending) scans
High‑concurrency reads are common
TSDB stores each record in two parts: <code>labels</code> (the dimensions) and <code>samples</code> (timestamp/value pairs). The label set uniquely identifies a time series (its <code>series_id</code>).
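The mapping from a label set to a series identifier can be sketched in a few lines of Python; the function name and hashing scheme here are illustrative, not Prometheus's actual implementation:

```python
import hashlib
import json

def series_id(labels: dict) -> str:
    """Derive a stable identifier from a label set.

    Illustrative only: the key idea is that the *sorted* label set
    uniquely keys a time series, independent of label order.
    """
    canonical = json.dumps(sorted(labels.items()))
    return hashlib.sha1(canonical.encode()).hexdigest()[:12]

# The same labels always map to the same series, regardless of order.
a = series_id({"handler": "query", "method": "get"})
b = series_id({"method": "get", "handler": "query"})
assert a == b
```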
<code>{
"labels": [{"latency": "500"}],
"samples": [{"timestamp": 1473305798, "value": 0.9}]
}
</code>

A simplified series diagram:
<code>series
│ server{latency="500"}
│ server{latency="300"}
│ server{}
│ ...
│ <--- time --->
</code>

TSDB builds three auxiliary indexes to accelerate queries:
Series Index
Stores all label key‑value pairs in lexical order and maps them to their time‑series identifiers.
Label Index
For each label key, the index lists all of its values and points to the series that carry each value.
Time Index
Maps a time range to the file blocks that contain the data for that interval.
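The Series and Label indexes together behave like an inverted index: a label matcher becomes a set lookup, and combining matchers becomes a set intersection. A toy sketch (illustrative, not Prometheus's on‑disk format):

```python
from collections import defaultdict

# Inverted index mapping (label key, label value) -> set of series IDs.
label_index = defaultdict(set)

series = {
    1: {"handler": "query", "method": "get"},
    2: {"handler": "query", "method": "post"},
    3: {"handler": "prometheus", "method": "get"},
}

# Index every label pair of every series.
for sid, labels in series.items():
    for key, value in labels.items():
        label_index[(key, value)].add(sid)

# A query like {handler="query", method="get"} is a set intersection.
matches = label_index[("handler", "query")] & label_index[("method", "get")]
print(matches)  # {1}
```

The Time Index then narrows the scan to only the file blocks covering the requested interval, so a query touches a small slice of the data rather than the whole store.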
Data Computation
The storage engine lets Prometheus perform vector and matrix operations across metric series using built‑in operators and functions, effectively turning the monitoring system into a combined data warehouse and compute platform.
One Calculation, Many Queries
Because intensive calculations consume resources, Prometheus encourages the use of recording rules to pre‑compute frequently needed or expensive expressions. The results are stored as new time series, allowing a single computation to serve multiple queries, such as dashboard refreshes and alert evaluations.
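Recording rules are defined in a rules file loaded by the Prometheus server. A minimal sketch (the group name, rule name, and metric follow common conventions but are illustrative here):

```yaml
groups:
  - name: example
    rules:
      # Pre-compute the per-job 5-minute request rate once;
      # dashboards and alerts then read the cheap result series
      # instead of re-evaluating the expensive expression.
      - record: job:http_requests_total:rate5m
        expr: sum(rate(http_requests_total[5m])) by (job)
```

The `record` name follows the conventional `level:metric:operations` pattern, so consumers can tell at a glance what aggregation the pre‑computed series represents.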
Efficient Ops
This public account is maintained by Xiaotianguo and friends and regularly publishes original technical articles. We focus on operations transformation and aim to accompany you throughout your operations career, growing together.