
Introduction to the Prometheus Data Collection Process

This article explains the complete Prometheus data collection workflow, covering key concepts such as targets, samples, and meta labels, detailing the relabeling steps, configuration options, example use‑cases, and the final scrape and storage phases for effective monitoring.


Prometheus processes data from collection to storage by applying a series of transformations to targets and samples; understanding this workflow helps you use configurable parameters more effectively.

1. Introduction to Concepts Used

target: The collection target; Prometheus Server scrapes monitoring data from these devices.

sample: A data sample scraped by Prometheus Server from a target.

meta label: The original labels of a target before relabeling. They can be viewed on the Prometheus /targets page or via a GET /api/v1/targets request.
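Both sets of labels are visible in the targets API: discoveredLabels holds the pre-relabel meta labels and labels holds the final labels. The fragment below illustrates the shape of one target entry in a GET /api/v1/targets response (the field names are real; the values are example placeholders):

```json
{
  "discoveredLabels": {
    "__address__": "localhost:9090",
    "__metrics_path__": "/metrics",
    "__scheme__": "http",
    "job": "prometheus"
  },
  "labels": {
    "instance": "localhost:9090",
    "job": "prometheus"
  }
}
```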

2. Data Collection Workflow

2.1 relabel (target label modification/filtering)

Relabel is a Prometheus feature that operates on targets before data collection, allowing label modification or target filtering. Important points:

Prometheus adds an instance label derived from the __address__ label.

Labels starting with __ are not stored on disk after relabel.

Meta labels remain in memory until the target is removed.

Before relabel, a target carries labels such as "__address__", "__metrics_path__", "__scheme__", and "job"; after relabel, the visible labels become instance and job.
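The simplest way to observe this is a static scrape job. With the sketch below (address is an example value), the target's meta labels before relabel include __address__, __metrics_path__, __scheme__, and job; after relabel only instance and job remain visible:

```yaml
scrape_configs:
  - job_name: prometheus              # becomes the "job" label
    static_configs:
      - targets: ["localhost:9090"]   # becomes "__address__", then the "instance" label
```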

2.2 relabel Configuration

Basic relabel configuration items:

source_labels: [<labelname>, ...] – the meta labels to process; their values are concatenated (joined by the configured separator, ";" by default) before the regex is applied.

target_label: <labelname> – the label to write the result to (used with action "replace").

regex: <regex> – regular expression to extract content from source_labels (default "(.*)").

modulus: <uint64> – hash modulus of the source label value.

replacement: <string> – the replacement value written to target_label; capture groups from regex are referenced as $1, $2, ... (default "$1").

action: <relabel_action> – operation type (default "replace").
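The modulus option is only meaningful with action: hashmod, which is commonly used to shard targets across several Prometheus servers. A sketch of that pattern (on the other shards, the regex "0" would be "1", "2", ...):

```yaml
relabel_configs:
  # Hash the target address and store hash % 4 in a temporary label.
  - source_labels: ["__address__"]
    modulus: 4
    target_label: "__tmp_hash"
    action: hashmod
  # Keep only targets whose hash bucket is 0, so each of 4 servers
  # scrapes a quarter of the targets.
  - source_labels: ["__tmp_hash"]
    regex: "0"
    action: keep
```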

Examples:

2.2.1 replace – modify label

Example 1: Add a "host" label extracted from the "__address__" meta label.

scrape_configs:
  - job_name: prometheus
    relabel_configs:
      - source_labels: ["__address__"]
        target_label: "host"
        regex: "(.*):(.*)"
        replacement: $1
        action: replace

Result: each target gains a "host" label whose value is the address with the port stripped (e.g. "localhost" from "localhost:9090").

Example 2 – preserve "__metrics_path__"

Replace the "__metrics_path__" meta label with a custom "metrics_path" label while keeping its value.

relabel_configs:
  - source_labels: ["__metrics_path__"]
    target_label: "metrics_path"

2.2.2 keep/drop – filter targets

Example 3 – keep only targets whose "host" label equals "localhost".

- source_labels: ["host"]
  regex: "localhost"
  action: keep

Result: only one target remains on the targets page.
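The inverse filter uses action: drop, which removes the matching targets and keeps the rest. A sketch mirroring Example 3:

```yaml
- source_labels: ["host"]
  regex: "localhost"
  action: drop
```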

3. Scrape Sample Retrieval

Prometheus scrapes metrics from targets via HTTP; the path is configured by metrics_path (default "/metrics"). The scrape timeout is set by scrape_timeout (default 10s) and can be adjusted based on network conditions. Label validity is also checked during this phase.
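Both parameters are set per scrape job. A sketch with example values (the job name and target address are placeholders):

```yaml
scrape_configs:
  - job_name: app
    metrics_path: /metrics    # default; change if the exporter serves another path
    scrape_interval: 15s      # how often targets are scraped
    scrape_timeout: 10s       # default; raise on slow networks, but keep <= scrape_interval
    static_configs:
      - targets: ["app-host:8080"]
```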

3.1 honor_labels – conflict resolution

If a scraped time series contains a label that conflicts with a label attached by Prometheus (e.g., job, instance), the honor_labels flag determines the outcome: true keeps the scraped label value; false renames the conflicting scraped label with an exported_ prefix.
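For example, when scraping the Pushgateway or federating from another Prometheus, the scraped series already carry meaningful job and instance labels, so honor_labels: true is the usual choice (the target address below is an example):

```yaml
scrape_configs:
  - job_name: pushgateway
    honor_labels: true        # keep job/instance from the scraped data
    static_configs:
      - targets: ["pushgateway:9091"]
```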

3.2 metric_relabel – metric label rewriting

Similar to relabel, metric_relabel operates on sample labels. It does not apply to automatically generated series such as up, scrape_duration_seconds, etc., and is typically used to filter out low-value or high-cost series.
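A common pattern is dropping an expensive metric by name before storage. In metric_relabel_configs, the special __name__ label holds the metric name; the metric and job names below are hypothetical examples:

```yaml
scrape_configs:
  - job_name: app
    static_configs:
      - targets: ["app-host:8080"]
    metric_relabel_configs:
      # Drop every series of this high-cardinality histogram metric.
      - source_labels: ["__name__"]
        regex: "http_request_duration_seconds_bucket"
        action: drop
```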

3.3 Save

After all processing steps, the collected data is persisted; details of storage will be covered in a future article.

Feel free to leave comments or questions about the content above.

Written by

Aikesheng Open Source Community

The Aikesheng Open Source Community provides stable, enterprise-grade MySQL open-source tools and services, releases a premium open-source component each year on "1024" (China's Programmers' Day, October 24), and continuously operates and maintains them.
