Common Open‑Source Monitoring Systems and Zabbix Monitoring Process
The article introduces common open‑source monitoring tools such as Zabbix and Nagios, explains why distributed systems need proactive health checks, compares features, and provides a detailed Zabbix monitoring workflow including data collection, storage, visualization, alerting, and specific metrics for servers, networks, JVM and MySQL.
Discussions of an architect's core skills usually focus on distributed architecture design, but deploying and monitoring distributed clusters is equally important, in particular deciding which core metrics to monitor in real time and which of them should trigger email alerts.
In high‑concurrency distributed environments, heavily used services and interfaces must be monitored closely so that problems such as site slowdowns or cascading server failures (for example, those set off by a cache avalanche) are caught early; proactive monitoring of core metrics helps avoid these problems.
Common Open‑Source Monitoring Systems
1. Zabbix
Zabbix is an enterprise‑grade open‑source monitoring platform with a web‑based interface. It provides distributed system and network monitoring and is among the most widely used monitoring tools at Chinese internet companies.
It is easy to get started with, feature‑rich, and free and open source.
Zabbix is easy to manage and configure and can generate attractive graphs, and its auto‑discovery greatly reduces daily maintenance work. Its rich data collection methods and API allow flexible data gathering, while its distributed architecture supports monitoring many devices.
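As an illustration of the API mentioned above, here is a minimal sketch of how a request body for the Zabbix JSON‑RPC API might be built. It only constructs the payload and sends nothing; the token value is a placeholder, and the use of an "auth" field is an assumption that depends on the Zabbix version (newer releases prefer an Authorization header).

```python
import json
from itertools import count

# JSON-RPC request ids, incremented on every call.
_request_ids = count(1)


def build_zabbix_request(method, params, auth_token=None):
    """Build a JSON-RPC 2.0 request body for the Zabbix API.

    The result would typically be POSTed to the server's api_jsonrpc.php
    endpoint with a Content-Type of application/json-rpc.
    """
    payload = {
        "jsonrpc": "2.0",
        "method": method,
        "params": params,
        "id": next(_request_ids),
    }
    # Assumption: token passed in the body, as older Zabbix versions expect.
    if auth_token is not None:
        payload["auth"] = auth_token
    return json.dumps(payload)


# Ask for the id and name of every host; the token is a placeholder.
body = build_zabbix_request(
    "host.get",
    {"output": ["hostid", "host"]},
    auth_token="HYPOTHETICAL_TOKEN",
)
print(body)
```

Keeping payload construction separate from the HTTP call makes it easy to inspect or log a request before it is sent.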
2. Nagios
Nagios is an open‑source enterprise‑grade monitoring system that can monitor basic system parameters such as CPU, disk, and network, as well as services like SMTP, POP3, HTTP, and NNTP. By installing plugins and writing monitoring scripts, users can achieve application monitoring and build hierarchical monitoring architectures for large numbers of hosts.
Nagios's main strengths are its powerful management console and its plugin model: the core product contains no monitoring code of its own, and all monitoring and alerting functions are provided by plugins.
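To make the plugin model concrete, the following is a minimal sketch of a Nagios‑style check written in Python. It follows the usual plugin convention of one status line plus an exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN); the function names and the 80/90 percent thresholds are illustrative assumptions, not part of Nagios itself.

```python
import shutil

# Conventional Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_disk(used_percent, warn=80.0, crit=90.0):
    """Map a disk usage percentage to a (exit_code, status_line) pair."""
    # Performance data after the "|" uses label=value;warn;crit.
    perfdata = f"used={used_percent:.1f}%;{warn};{crit}"
    if used_percent >= crit:
        return CRITICAL, f"DISK CRITICAL - {used_percent:.1f}% used | {perfdata}"
    if used_percent >= warn:
        return WARNING, f"DISK WARNING - {used_percent:.1f}% used | {perfdata}"
    return OK, f"DISK OK - {used_percent:.1f}% used | {perfdata}"


def disk_used_percent(path="/"):
    """Return the used share of the filesystem containing path, in percent."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total * 100


def main(path="/"):
    """Run the check, print the status line, and return the exit code."""
    try:
        code, message = check_disk(disk_used_percent(path))
    except OSError as exc:
        code, message = UNKNOWN, f"DISK UNKNOWN - {exc}"
    print(message)
    return code

# In a real plugin script the last line would be: raise SystemExit(main())
```

Nagios only looks at the exit code and the printed line, which is why a plugin can be written in any language.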
3. Open‑Source Monitoring Tools Comparison
4. Recommendation
It is recommended to start with Zabbix, the free open‑source monitoring solution. The following sections use Zabbix as an example to discuss the monitoring workflow and core monitoring indicators.
Zabbix Monitoring Process
Zabbix’s monitoring process can be simply described as:
Data collection → Data storage → Data analysis → Data display → Monitoring alarm
Data collection: Zabbix gathers data via SNMP, Agent, ICMP, SSH, IPMI, etc.
Data storage: Zabbix stores data in MySQL (or other databases).
Data display: the web UI, a mobile app, or a custom web front end (for example, written in Java or PHP) can be used.
Monitoring alarm: email, WeChat, and SMS alerts, with escalation mechanisms.
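The stages listed above can be sketched end to end in a few lines. This is a toy model, not how Zabbix is implemented: an in‑memory SQLite table stands in for the MySQL history store, the table layout is hypothetical, the samples are made up, and the "analysis" is simply the average of the most recent values.

```python
import sqlite3


def store(conn, host, item_key, clock, value):
    """Data storage: append one collected sample to the history table."""
    conn.execute(
        "INSERT INTO history (host, item_key, clock, value) VALUES (?, ?, ?, ?)",
        (host, item_key, clock, value),
    )


def analyze(conn, host, item_key, last_n=3):
    """Data analysis: average of the most recent last_n samples, or None."""
    rows = conn.execute(
        "SELECT value FROM history WHERE host = ? AND item_key = ? "
        "ORDER BY clock DESC LIMIT ?",
        (host, item_key, last_n),
    ).fetchall()
    if not rows:
        return None
    return sum(row[0] for row in rows) / len(rows)


def alert_if_needed(average, threshold, notify):
    """Monitoring alarm: call notify when the average exceeds the threshold."""
    if average is not None and average > threshold:
        notify(f"average {average:.1f} exceeds {threshold}")
        return True
    return False


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE history (host TEXT, item_key TEXT, clock INTEGER, value REAL)"
)

# Data collection: made-up (clock, CPU utilisation %) samples for one host.
samples = [(1, 40.0), (2, 85.0), (3, 95.0), (4, 96.0)]
for clock, value in samples:
    store(conn, "web01", "system.cpu.util", clock, value)

messages = []  # stands in for an email, WeChat, or SMS channel
avg = analyze(conn, "web01", "system.cpu.util")
fired = alert_if_needed(avg, 90.0, messages.append)
print(avg, fired, messages)
```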
Zabbix’s configuration workflow:
Host groups → Hosts → Templates → Applications → Items → Graphs → Screens → Triggers → Events → Actions → Media types (alert escalation: 1. remote command 2. email) → User groups → Users → Media (alert email)
When a trigger reaches its threshold, an event is generated. An Action then processes the event, and it can perform two kinds of operations:
Sending messages to users.
Executing commands to attempt automatic fault recovery.
In production, Items, Triggers, and Graphs are usually managed through templates, allowing changes to be applied to all hosts that use the template.
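The trigger, event, and action chain described above can be modelled roughly as follows. The class names mirror Zabbix concepts, but the code is a simplified sketch under stated assumptions: a trigger is a single fixed threshold, an event is raised only on the change from OK to problem, and recovery events are ignored.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Event:
    host: str
    trigger_name: str
    value: float


@dataclass
class Trigger:
    name: str
    threshold: float
    in_problem: bool = False  # remembers the last evaluated state

    def evaluate(self, host: str, value: float) -> Optional[Event]:
        """Return an Event only when the state changes from OK to problem."""
        breached = value > self.threshold
        event = None
        if breached and not self.in_problem:
            event = Event(host, self.name, value)
        self.in_problem = breached
        return event


@dataclass
class Action:
    operations: List[Callable[[Event], None]] = field(default_factory=list)

    def process(self, event: Event) -> None:
        """Run every configured operation for the event, in order."""
        for operation in self.operations:
            operation(event)


log: List[str] = []


def send_message(event: Event) -> None:
    # Stand-in for notifying users through a media type such as email.
    log.append(f"notify: {event.trigger_name} on {event.host} ({event.value})")


def run_recovery_command(event: Event) -> None:
    # Stand-in for a remote command that attempts automatic recovery.
    log.append(f"command: restart service on {event.host}")


trigger = Trigger(name="CPU utilisation too high", threshold=90.0)
action = Action(operations=[send_message, run_recovery_command])

for value in [70.0, 95.0, 97.0, 60.0, 92.0]:
    event = trigger.evaluate("web01", value)
    if event is not None:
        action.process(event)

print(log)
```

Raising an event only on the state change keeps a sustained breach (95 followed by 97 here) from sending a second alert.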
Zabbix Monitoring Features
1. Monitoring Metrics
Host performance monitoring
Network device performance monitoring
Database performance monitoring
Multiple alert channels
Detailed reporting and charting
Zabbix agents for Linux, Windows, FreeBSD, etc.
SNMP (and optionally SSH) for network devices
2. Monitorable Objects
Devices: servers, routers, switches
Software: OS, network, applications
Host performance indicators
Fault monitoring: down hosts, unavailable services, unreachable hosts
3. Basic Monitoring Data
Main categories include:
CPU
Load
Memory
Disk
IO
Network related
Kernel parameters
ss statistics
Port collection
Process health of core services
Resource consumption of key business processes
NTP offset
DNS resolution
A thorough understanding of these basic metrics goes hand in hand with a solid grasp of how Linux works and of its more advanced commands.
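Two of the categories above, load and memory, can be read directly from the Linux /proc filesystem. The sketch below parses the text of /proc/loadavg and /proc/meminfo; it runs on sample strings so that it works anywhere, and it assumes a kernel recent enough to report MemAvailable.

```python
def parse_loadavg(text):
    """Parse /proc/loadavg content into the 1, 5, and 15 minute load averages."""
    fields = text.split()
    return {
        "load1": float(fields[0]),
        "load5": float(fields[1]),
        "load15": float(fields[2]),
    }


def parse_meminfo(text):
    """Parse /proc/meminfo content into a dict of integer values (mostly kB)."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            values[name.strip()] = int(parts[0])
    return values


def memory_used_percent(meminfo):
    """Share of memory not available to new workloads, in percent."""
    total = meminfo["MemTotal"]
    available = meminfo["MemAvailable"]
    return (total - available) / total * 100


# Sample contents; on a real host, read the two files instead.
sample_loadavg = "0.42 0.36 0.30 1/123 4567\n"
sample_meminfo = (
    "MemTotal:        8000000 kB\n"
    "MemFree:         1000000 kB\n"
    "MemAvailable:    2000000 kB\n"
)

load = parse_loadavg(sample_loadavg)
mem = parse_meminfo(sample_meminfo)
print(load["load1"], memory_used_percent(mem))
```

On a live system the same functions can be fed with open("/proc/loadavg").read() and open("/proc/meminfo").read().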
4. JVM Monitoring
For companies whose main development language is Java, JVM monitoring is indispensable. Important JVM metrics include GC, class loading, memory, processes, and threads, all of which can be obtained via MXBeans.
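MXBeans are read from Java or over JMX. As a swapped‑in alternative that fits a script‑based collector, the sketch below parses the output of the JDK's jstat -gcutil command instead. The sample text is made up, the exact column set varies between JDK versions (which is why the parser keys on the header row), and the 80 percent old‑generation threshold is an arbitrary assumption.

```python
def parse_jstat_gcutil(output):
    """Parse `jstat -gcutil <pid>` output into a dict keyed by column name.

    Uses the header row for names and the last row for values, so extra
    columns in newer JDKs are picked up automatically. Values that are
    not numeric (jstat prints "-" for some) become None.
    """
    lines = [line for line in output.strip().splitlines() if line.strip()]
    header = lines[0].split()
    values = lines[-1].split()
    stats = {}
    for name, raw in zip(header, values):
        try:
            stats[name] = float(raw)
        except ValueError:
            stats[name] = None
    return stats


def old_gen_pressure(stats, threshold_pct=80.0):
    """True when old generation occupancy (column O) exceeds the threshold."""
    occupancy = stats.get("O")
    return occupancy is not None and occupancy > threshold_pct


# Made-up sample in the JDK 8 column layout.
sample = """
  S0     S1     E      O      M     CCS    YGC     YGCT    FGC    FGCT     GCT
  0.00  96.88  45.12  32.50  94.21  89.50     12    0.250     2    0.180    0.430
"""

stats = parse_jstat_gcutil(sample)
print(stats["O"], stats["FGC"], old_gen_pressure(stats))
```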
5. Four Key MySQL Performance Indicators
Query throughput
Query execution performance
Connection status
Buffer pool usage
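One common way to derive these four indicators is from two snapshots of SHOW GLOBAL STATUS taken a known interval apart. The sketch below assumes the snapshots are already available as dictionaries of integers; the counter names are real MySQL status variables, while the sample numbers and the choice of slow queries as the execution‑performance signal are illustrative assumptions.

```python
def mysql_key_metrics(prev, curr, interval_seconds, max_connections):
    """Derive headline metrics from two SHOW GLOBAL STATUS snapshots."""

    def delta(name):
        # Counters are cumulative, so rates come from the difference.
        return curr[name] - prev[name]

    pages_total = curr["Innodb_buffer_pool_pages_total"]
    pages_free = curr["Innodb_buffer_pool_pages_free"]
    read_requests = delta("Innodb_buffer_pool_read_requests")
    disk_reads = delta("Innodb_buffer_pool_reads")

    return {
        # Query throughput: statements per second sent by clients.
        "qps": delta("Questions") / interval_seconds,
        # Query execution performance: slow queries per second.
        "slow_queries_per_s": delta("Slow_queries") / interval_seconds,
        # Connection status: share of the connection limit in use.
        "connection_usage_pct": curr["Threads_connected"] / max_connections * 100,
        # Buffer pool usage: pages in use and logical reads served from memory.
        "buffer_pool_used_pct": (pages_total - pages_free) / pages_total * 100,
        "buffer_pool_hit_pct": (
            (1 - disk_reads / read_requests) * 100 if read_requests else None
        ),
    }


# Made-up snapshots taken 60 seconds apart.
previous = {
    "Questions": 1000,
    "Slow_queries": 5,
    "Innodb_buffer_pool_read_requests": 50000,
    "Innodb_buffer_pool_reads": 400,
}
current = {
    "Questions": 7000,
    "Slow_queries": 11,
    "Threads_connected": 50,
    "Innodb_buffer_pool_pages_total": 8192,
    "Innodb_buffer_pool_pages_free": 2048,
    "Innodb_buffer_pool_read_requests": 60000,
    "Innodb_buffer_pool_reads": 500,
}

metrics = mysql_key_metrics(previous, current, interval_seconds=60, max_connections=200)
print(metrics)
```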
6. Business Application Monitoring
Business‑critical interfaces should also be monitored, for example by tracking their response time.
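A minimal sketch of response‑time monitoring is shown below: one helper times a call, and another summarises collected latencies with a nearest‑rank percentile. The latency samples are made up, and the choice of the 95th percentile is an assumption; it is used here because a single slow request barely moves an average but shows up clearly in a high percentile.

```python
import math
import time


def timed_call(func, *args, **kwargs):
    """Call func and return (result, elapsed time in milliseconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed_ms = (time.perf_counter() - start) * 1000
    return result, elapsed_ms


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = math.ceil(pct / 100 * len(ordered))
    return ordered[max(rank, 1) - 1]


# Made-up response times of one interface, in milliseconds.
latencies_ms = [120, 95, 110, 130, 105, 98, 900, 115, 102, 125]

p50 = percentile(latencies_ms, 50)
p95 = percentile(latencies_ms, 95)
print(p50, p95)

# Timing a real call; sum() stands in for a request to the interface.
result, elapsed_ms = timed_call(sum, [1, 2, 3])
print(result, elapsed_ms >= 0)
```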
Mike Chen's Internet Architecture