Master Blackbox Exporter: Install, Configure, and Alert with Prometheus
This guide walks through the concepts of white‑box vs black‑box monitoring, explains Prometheus Blackbox Exporter capabilities, shows step‑by‑step installation, Kubernetes configuration, probe definitions for HTTP, TCP, ICMP and SSL, and provides ready‑to‑use alert rules and Grafana dashboard integration.
Overview
In monitoring we usually distinguish white‑box monitoring, which looks at a system's internal metrics, from black‑box monitoring, which observes externally visible symptoms such as a failed endpoint.
Black‑box monitoring concentrates on what a user can observe (e.g., a business interface that stops responding), and its goal is to alert on failures that are happening right now.
White‑box monitoring concentrates on internal indicators (e.g., Redis INFO reporting that a replica is down) and is used to diagnose the root cause of failures that black‑box probes detect.
Blackbox Exporter
Blackbox Exporter is the official Prometheus solution for black‑box monitoring, allowing probes via HTTP, HTTPS, DNS, TCP and ICMP.
1. HTTP probe
Define request headers
Validate HTTP status, response headers and body
2. TCP probe
Check whether a port is listening
Define and validate an application‑layer exchange over the connection
3. ICMP probe
Host reachability (ping)
4. POST probe
Endpoint connectivity checked with an HTTP POST request
5. SSL certificate expiration
Blackbox Exporter can also retrieve SSL certificate expiry information.
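The capabilities listed above map onto module definitions in the exporter's configuration file. The following is a minimal sketch, not part of the original setup: the module names `http_json_health` and `ssh_banner`, the header value, and the regular expressions are assumptions chosen for illustration. For HTTPS targets, certificate expiry is reported through the probe_ssl_earliest_cert_expiry metric without extra module settings.

```yaml
# Sketch only: module names and match patterns are hypothetical.
modules:
  http_json_health:
    prober: http
    timeout: 5s
    http:
      method: GET
      headers:
        Accept: application/json          # custom request header
      valid_status_codes: [200]           # validate the status code
      fail_if_not_ssl: true               # fail when the target is not served over TLS
      fail_if_header_not_matches:         # validate a response header
        - header: Content-Type
          regexp: "application/json"
      fail_if_body_not_matches_regexp:    # validate the response body
        - '"status":\s*"ok"'
  ssh_banner:
    prober: tcp
    timeout: 5s
    tcp:
      query_response:                     # application-layer check after connecting
        - expect: "^SSH-2.0-"
```

A module is selected per scrape job through the `module` URL parameter, so one exporter instance can serve all of these checks.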
Install Blackbox Exporter
(1) Create a YAML manifest (blackbox-deployment.yaml):
<code>apiVersion: v1
kind: Service
metadata:
  name: blackbox
  namespace: monitoring
  labels:
    app: blackbox
spec:
  selector:
    app: blackbox
  ports:
    - port: 9115
      targetPort: 9115
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: blackbox-config
  namespace: monitoring
data:
  blackbox.yaml: |-
    modules:
      # HTTP GET probe that expects a 200 response
      http_2xx:
        prober: http
        timeout: 10s
        http:
          # HTTP/2 responses are reported as "HTTP/2.0", so that is the value to match
          valid_http_versions: ["HTTP/1.1", "HTTP/2.0"]
          valid_status_codes: [200]
          method: GET
          preferred_ip_protocol: "ip4"
      # HTTP POST probe that expects a 200 response
      http_post_2xx:
        prober: http
        timeout: 10s
        http:
          valid_http_versions: ["HTTP/1.1", "HTTP/2.0"]
          valid_status_codes: [200]
          method: POST
          preferred_ip_protocol: "ip4"
      # Plain TCP connection check
      tcp_connect:
        prober: tcp
        timeout: 10s
      # ICMP echo (ping)
      ping:
        prober: icmp
        timeout: 5s
        icmp:
          preferred_ip_protocol: "ip4"
      # DNS query over TCP for the in-cluster API service name
      dns:
        prober: dns
        dns:
          transport_protocol: "tcp"
          preferred_ip_protocol: "ip4"
          query_name: "kubernetes.default.svc.cluster.local"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: blackbox
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: blackbox
  template:
    metadata:
      labels:
        app: blackbox
    spec:
      containers:
        - name: blackbox
          image: prom/blackbox-exporter:v0.18.0
          args:
            - "--config.file=/etc/blackbox_exporter/blackbox.yaml"
            - "--log.level=error"
          ports:
            - containerPort: 9115
          volumeMounts:
            - name: config
              mountPath: /etc/blackbox_exporter
      volumes:
        - name: config
          configMap:
            name: blackbox-config</code>
(2) Apply the manifest:
<code>kubectl apply -f blackbox-deployment.yaml</code>
Configure Monitoring
Because the cluster uses Prometheus Operator, additional scrape configurations are added via a secret.
(1) Create prometheus-additional.yaml with jobs for HTTP, DNS, ICMP, etc.
<code>- job_name: "ingress-endpoint-status"
  metrics_path: /probe
  params:
    module: [http_2xx]  # Expect HTTP 200
  static_configs:
    - targets:
        - http://172.17.100.134/healthz
      labels:
        group: nginx-ingress
  relabel_configs:
    # Pass the target to the exporter as the ?target= URL parameter
    - source_labels: [__address__]
      target_label: __param_target
    # Keep the probed target as the instance label
    - source_labels: [__param_target]
      target_label: instance
    # Send the scrape itself to the Blackbox Exporter service
    - target_label: __address__
      replacement: blackbox.monitoring:9115
- job_name: "kubernetes-service-dns"
  metrics_path: /probe
  params:
    module: [dns]
  static_configs:
    - targets:
        - kube-dns.kube-system:53
  relabel_configs:
    - source_labels: [__address__]
      target_label: __param_target
    - source_labels: [__param_target]
      target_label: instance
    - target_label: __address__
      replacement: blackbox.monitoring:9115
- job_name: "node-icmp-status"
  # ... (other jobs for ICMP, TCP, etc.)</code>
(2) Create the secret containing the additional configuration:
<code>kubectl -n monitoring create secret generic additional-config --from-file=prometheus-additional.yaml</code>
(3) Edit prometheus-prometheus.yaml to reference the secret:
<code>additionalScrapeConfigs:
  name: additional-config
  key: prometheus-additional.yaml</code>
(4) Re‑apply the Prometheus custom resource and reload the server:
<code>kubectl apply -f prometheus-prometheus.yaml
curl -X POST "http://<PROMETHEUS_IP>:9090/-/reload"</code>
ICMP Monitoring
Ping targets are defined in a job named "node-icmp-status". After reloading, the targets appear in the Prometheus UI.
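The body of this job is not shown in the article, so the following is only a sketch of what such a job might look like. It reuses the `ping` module and the relabeling pattern from the jobs above. The target addresses are documentation placeholders and the `group` label is an assumption.

```yaml
# Hypothetical sketch: replace the placeholder addresses with real node IPs.
- job_name: "node-icmp-status"
  metrics_path: /probe
  params:
    module: [ping]               # ICMP module from the exporter ConfigMap
  static_configs:
    - targets:
        - 192.0.2.10             # placeholder address
        - 192.0.2.11             # placeholder address
      labels:
        group: node-icmp         # assumed label, for illustration only
  relabel_configs:
    - source_labels: [__address__]
      target_label: __param_target
    - source_labels: [__param_target]
      target_label: instance
    - target_label: __address__
      replacement: blackbox.monitoring:9115
```

ICMP probes need permission to open raw sockets, so if every target in this job reports probe_success as 0, the container's privileges are worth checking first.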
HTTP Monitoring
GET probes are defined for URLs such as https://www.coolops.cn and https://www.baidu.com. After reload, the status is visible in Prometheus.
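A job for these URLs could look roughly like the sketch below. It assumes the `http_2xx` module from the exporter ConfigMap. The job name `external-http-status` and the `group` label are hypothetical, since the article does not show them.

```yaml
# Hypothetical sketch: job name and label are assumptions.
- job_name: "external-http-status"
  metrics_path: /probe
  params:
    module: [http_2xx]           # GET probe expecting HTTP 200
  static_configs:
    - targets:
        - https://www.coolops.cn
        - https://www.baidu.com
      labels:
        group: web               # assumed label, for illustration only
  relabel_configs:
    - source_labels: [__address__]
      target_label: __param_target
    - source_labels: [__param_target]
      target_label: instance
    - target_label: __address__
      replacement: blackbox.monitoring:9115
```

Because both targets are HTTPS, the same job also yields probe_ssl_earliest_cert_expiry for each URL.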
TCP Monitoring
TCP probes check ports of middleware services (e.g., 172.17.100.135:80, 172.17.100.74:3306). Results are shown after reloading.
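As a sketch under the same assumptions, a TCP job for the two endpoints mentioned above might be written as follows. It uses the `tcp_connect` module from the exporter ConfigMap. The job name `middleware-tcp-status` and the `group` label are hypothetical.

```yaml
# Hypothetical sketch: job name and label are assumptions.
- job_name: "middleware-tcp-status"
  metrics_path: /probe
  params:
    module: [tcp_connect]        # succeeds when the TCP connection is established
  static_configs:
    - targets:
        - 172.17.100.135:80
        - 172.17.100.74:3306
      labels:
        group: middleware        # assumed label, for illustration only
  relabel_configs:
    - source_labels: [__address__]
      target_label: __param_target
    - source_labels: [__param_target]
      target_label: instance
    - target_label: __address__
      replacement: blackbox.monitoring:9115
```

A successful connect only shows that the port accepts connections. It does not confirm that the service behind it is answering correctly.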
Alert Rules
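Business health: monitor probe_success (0 = failure, 1 = success).
SSL certificate expiration: probe_ssl_earliest_cert_expiry holds the expiry time as a Unix timestamp, so subtracting time() and dividing by 86400 gives the number of days remaining. It can be used to alert when fewer than 30 days remain.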
<code>groups:
  - name: blackbox_network_stats
    rules:
      - alert: blackbox_network_stats
        expr: probe_success == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Interface/host/port connectivity failure"
          description: "Interface/host/port {{ $labels.instance }} connectivity abnormal"
  - name: check_ssl_status
    rules:
      - alert: "SSL certificate expiration warning"
        expr: (probe_ssl_earliest_cert_expiry - time())/86400 < 30
        for: 1h
        labels:
          severity: warn
        annotations:
          description: "Domain {{ $labels.instance }} certificate expires in {{ printf \"%.1f\" $value }} days"
          summary: "SSL certificate expiration warning"</code>
Grafana Dashboard
Import dashboard ID 12559 to visualize the blackbox metrics.
Ops Development Stories
Maintained by a like‑minded team, covering both operations and development. Topics span Linux ops, DevOps toolchain, Kubernetes containerization, monitoring, log collection, network security, and Python or Go development. Team members: Qiao Ke, wanger, Dong Ge, Su Xin, Hua Zai, Zheng Ge, Teacher Xia.