Step-by-Step Installation and Configuration of Node Exporter, Alertmanager, Prometheus, and Grafana for Monitoring and Alerting
This guide walks through downloading and extracting Node Exporter, Alertmanager, Prometheus, and Grafana on a Linux server, running each one as a systemd service, configuring email alerting and alert rules, and verifying the monitoring and alerting pipeline at each step.
Installation
Download the appropriate archive from the official website, extract it, add a systemd service, and start the service.
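The archive names used in the commands below follow a regular pattern, so the download step can be scripted. The following is a minimal sketch, assuming the tarballs are fetched from the projects' GitHub release pages (the article only says "the official website"); `release_url` is a hypothetical helper name, not part of any tool.

```shell
#!/bin/sh
# Sketch: build the download URL for a Prometheus-project release tarball.
# Assumption: archives come from github.com/prometheus/<name>/releases.
release_url() {
  # $1 = component name, $2 = version, $3 = platform (defaults to linux-amd64)
  name="$1"
  version="$2"
  platform="${3:-linux-amd64}"
  printf 'https://github.com/prometheus/%s/releases/download/v%s/%s-%s.%s.tar.gz\n' \
    "$name" "$version" "$name" "$version" "$platform"
}

# Print the URLs for the three versions used in this guide.
release_url node_exporter 0.17.0
release_url alertmanager 0.17.0
release_url prometheus 2.9.2
```

Each printed URL can be passed to `wget` or `curl -LO` before running the `tar zxf` step for that component.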
Node_exporter
Installation commands
# Extract the archive into /usr/local
tar zxf node_exporter-0.17.0.linux-amd64.tar.gz -C /usr/local

# Create the unit file with the contents shown below
vim /etc/systemd/system/node_exporter.service

[Unit]
Description=node_exporter
After=network.target

[Service]
Restart=on-failure
ExecStart=/usr/local/node_exporter-0.17.0.linux-amd64/node_exporter

[Install]
WantedBy=multi-user.target

# Start the service, check its status, and enable it at boot
systemctl start node_exporter
systemctl status node_exporter
systemctl enable node_exporter

Verification
Alertmanager
Installation commands
# Extract the archive into /usr/local
tar zxf alertmanager-0.17.0.linux-amd64.tar.gz -C /usr/local

# Create the unit file with the contents shown below
vim /etc/systemd/system/alertmanager.service

[Unit]
Description=Alertmanager
After=network-online.target

[Service]
Restart=on-failure
ExecStart=/usr/local/alertmanager-0.17.0.linux-amd64/alertmanager --config.file=/usr/local/alertmanager-0.17.0.linux-amd64/alertmanager.yml

[Install]
WantedBy=multi-user.target

# Start the service, check its status, and enable it at boot
systemctl start alertmanager
systemctl status alertmanager
systemctl enable alertmanager

# Confirm that Alertmanager is listening on its default port
netstat -anlpt | grep 9093

Verification
Prometheus
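Installation commands
# Extract the archive into /usr/local
tar zxf prometheus-2.9.2.linux-amd64.tar.gz -C /usr/local

# Create the unit file with the contents shown below
vim /etc/systemd/system/prometheus.service

[Unit]
Description=Prometheus Server
Documentation=https://prometheus.io/docs/introduction/overview/
After=network-online.target

[Service]
Restart=on-failure
# --web.external-url is the address used in links back to Prometheus (for example in alert emails).
# 0.0.0.0 is kept from the original; a hostname that readers of the alert can reach is usually a better choice.
ExecStart=/usr/local/prometheus-2.9.2.linux-amd64/prometheus --config.file=/usr/local/prometheus-2.9.2.linux-amd64/prometheus.yml --storage.tsdb.path=/var/lib/prometheus --web.external-url=http://0.0.0.0:9090

[Install]
WantedBy=multi-user.target

# Start, check, and enable the service, following the same pattern as the other components
systemctl start prometheus
systemctl status prometheus
systemctl enable prometheus
netstat -anlpt | grep 9090

Verification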
Grafana
Installation
Download: https://mirrors.tuna.tsinghua.edu.cn/grafana/yum/el7/grafana-5.4.2-1.x86_64.rpm
rpm -ivh grafana-5.4.2-1.x86_64.rpm
systemctl start grafana-server
systemctl status grafana-server
systemctl enable grafana-server
netstat -anlpt | grep 3000
Verification
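Each verification step in this section comes down to confirming that a service answers on its port. As a complement to `netstat`, the sketch below maps every component to an HTTP endpoint and probes it with `curl`. The function names are hypothetical, the ports are the defaults used in this guide, and the health paths are assumptions based on the endpoints these projects expose, so they are worth checking against the installed versions.

```shell
#!/bin/sh
# Sketch: map each service in this guide to an HTTP endpoint that answers
# once the service is up. Ports are the defaults used in the article.
health_url() {
  case "$1" in
    node_exporter) echo "http://localhost:9100/metrics" ;;
    alertmanager)  echo "http://localhost:9093/-/healthy" ;;
    prometheus)    echo "http://localhost:9090/-/healthy" ;;
    grafana)       echo "http://localhost:3000/api/health" ;;
    *) echo "unknown service: $1" >&2; return 1 ;;
  esac
}

# Probe one service; prints "<name> ok" or "<name> FAILED (<url>)".
# Requires curl; -f turns HTTP error responses into a non-zero exit status.
check_service() {
  url=$(health_url "$1") || return 1
  if curl -fsS -o /dev/null --max-time 5 "$url"; then
    echo "$1 ok"
  else
    echo "$1 FAILED ($url)"
    return 1
  fi
}
```

On the monitored host, running `for s in node_exporter alertmanager prometheus grafana; do check_service "$s"; done` gives a one-line result per component.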
Configuration
Alertmanager
Configuration file
global:
  resolve_timeout: 5m
  smtp_smarthost: 'smtp.qq.com:465'
  smtp_from: '[email protected]'
  smtp_auth_username: '[email protected]'
  smtp_auth_password: 'xxxkbpfmygbecg'
  smtp_require_tls: false
route:
  group_by: ['alertname']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 1h
  receiver: 'toemail'
receivers:
  - name: 'toemail'
    email_configs:
      - to: '[email protected]'
        send_resolved: true
  - name: 'web.hook'
    webhook_configs:
      - url: 'http://127.0.0.1:5001/'
inhibit_rules:
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'warning'
    equal: ['alertname', 'dev', 'instance']

Prometheus
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - localhost:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  - "rules/host_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ['localhost:9090']
  - job_name: 'my target'
    static_configs:
      - targets: ['localhost:9100']

Verification
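The Prometheus configuration loads `rules/host_rules.yml`, but the article does not show that file, and without it there is nothing to alert on. The sketch below writes one possible version of it. The rule name, the `up == 0` expression, the `for: 1m` duration, and the label and annotation text are all assumptions chosen to match the node_exporter failure test later in this guide, not the author's actual rules. `RULES_DIR` is a placeholder variable.

```shell
#!/bin/sh
# Sketch: create the rule file that prometheus.yml refers to.
# RULES_DIR is a placeholder; with the article's layout it would be
# /usr/local/prometheus-2.9.2.linux-amd64/rules, because relative paths in
# rule_files are resolved against the directory of prometheus.yml.
RULES_DIR="${RULES_DIR:-./rules}"
mkdir -p "$RULES_DIR"

# The quoted 'EOF' keeps the shell from expanding $labels in the templates.
cat > "$RULES_DIR/host_rules.yml" <<'EOF'
groups:
  - name: host_rules
    rules:
      # Hypothetical rule: fire when any scrape target stops answering.
      - alert: InstanceDown
        expr: up == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Instance {{ $labels.instance }} is down"
          description: "Job {{ $labels.job }} could not scrape {{ $labels.instance }} for over 1 minute."
EOF

# Validate the file if promtool (shipped in the Prometheus tarball) is on PATH.
if command -v promtool >/dev/null 2>&1; then
  promtool check rules "$RULES_DIR/host_rules.yml"
else
  echo "promtool not found; skipping validation"
fi
```

After placing the file next to `prometheus.yml`, restarting the service with `systemctl restart prometheus` should make the rule appear on the Alerts page of the web UI.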
View targets
View alert configuration
View monitoring data (https://grafana.com/dashboards/9276)
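Viewing the dashboard requires a Prometheus data source in Grafana, a step the article does not show. One way to add it without the web UI is a provisioning file, sketched below. The file name, the data source name, and the `PROVISIONING_DIR` variable are assumptions; on an RPM install Grafana is expected to read `/etc/grafana/provisioning/datasources`, which is worth confirming in `grafana.ini`.

```shell
#!/bin/sh
# Sketch: write a Grafana data source provisioning file for Prometheus.
# PROVISIONING_DIR is a placeholder; point it at Grafana's
# provisioning/datasources directory on the real host.
PROVISIONING_DIR="${PROVISIONING_DIR:-./provisioning/datasources}"
mkdir -p "$PROVISIONING_DIR"

cat > "$PROVISIONING_DIR/prometheus.yaml" <<'EOF'
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://localhost:9090
    isDefault: true
EOF

echo "wrote $PROVISIONING_DIR/prometheus.yaml"
```

Once the file is in place, `systemctl restart grafana-server` loads it, and dashboard 9276 can then be imported against this data source.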
Alerting
Simulate node_exporter failure
systemctl stop node_exporter
Check email inbox
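The email does not arrive immediately, because the alert passes through several timed stages. The sketch below adds up the intervals from the two configuration files as a rough estimate. It assumes the alert rule uses `for: 1m`, which the article does not state, and it ignores evaluation rounding and SMTP delivery time.

```shell
#!/bin/sh
# Rough estimate of the delay between stopping node_exporter and the
# notification leaving Alertmanager, using the values from the configs above.
scrape_interval=15      # prometheus.yml: global.scrape_interval
evaluation_interval=15  # prometheus.yml: global.evaluation_interval
group_wait=10           # alertmanager.yml: route.group_wait
for_duration=60         # ASSUMPTION: the rule uses `for: 1m` (not shown in the article)

# Sum of the stages an alert passes through, all in seconds.
alert_delay() {
  echo $(( $1 + $2 + $3 + $4 ))
}

alert_delay "$scrape_interval" "$evaluation_interval" "$for_duration" "$group_wait"
# prints 100
```

Under these assumptions, waiting around two minutes before checking the inbox is reasonable. Because `send_resolved` is true, starting node_exporter again should later produce a second, "resolved" email.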
That completes a simple monitoring and alerting setup. Thanks to the authors of the online documentation this guide draws on. Reference: https://jianshu.com/p/e59cfd15612e
DevOps Cloud Academy
Exploring industry DevOps practices and technical expertise.