Online Monitoring: Principles, Scope, Types, Implementation and Value Assessment

This article explores the essential concepts of online monitoring, including effective monitoring items, objectives, scope, system and business monitoring types, stakeholder considerations, implementation steps, tool choices, alert strategies, and how to evaluate the overall value of monitoring initiatives.

360 Quality & Efficiency

01 Introduction

Recent discussions with senior engineers highlighted how important online monitoring is for production systems. They also raised three questions: which monitoring is actually effective, what purpose it serves, and what the operational workflow should be once an alert fires.

02 Business System Analysis

Monitoring starts with a deep understanding of the business system. Early-stage or fast-iterating systems should focus on core scenarios such as user activity and transaction volume. Monitoring points should also match the technology stack, so that maintaining them does not become an excessive overhead.

03 Monitoring Framework

Based on experience, a structured monitoring framework can be built from four parts: monitoring goals, scope, types, and stakeholders.

3.1 Monitoring Goals

The primary goal is to perceive online issues quickly, locate their source accurately, and minimize their impact at the lowest cost.

3.2 Monitoring Scope

Define the scope by mapping the system architecture, technology stack, and dependencies, then prioritize the gaps: identify existing monitoring points, discard obsolete ones, add missing critical ones, and make sure alerts reach the right people.
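The gap analysis described above can be sketched as plain set arithmetic. This is a minimal illustration, not a prescribed method; the function `scope_gaps` and all monitoring point names are hypothetical.

```python
def scope_gaps(existing: set[str], required: set[str]) -> dict[str, list[str]]:
    """Compare the monitoring points in place with the points the system needs."""
    return {
        "keep": sorted(existing & required),     # already covered
        "discard": sorted(existing - required),  # obsolete points
        "add": sorted(required - existing),      # missing critical points
    }


# Hypothetical monitoring point names, for illustration only.
existing = {"cpu_usage", "disk_usage", "legacy_ftp_queue"}
required = {"cpu_usage", "disk_usage", "order_api_latency", "db_slow_queries"}

gaps = scope_gaps(existing, required)
for action, points in gaps.items():
    print(action, points)
```

In practice the `required` set would come out of the architecture and dependency mapping, and the `add` list would still need to be prioritized by hand.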

3.3 Monitoring Types

Monitoring is divided into system monitoring and business monitoring.

3.3.1 System Monitoring

System monitoring focuses on the environment and resources that keep the system running. It is split into four layers: Resource, Infrastructure, Data, and Dependencies.

Resource Layer: CPU, memory, disk, network, middleware queues, etc.

Infrastructure Layer: JVM threads, garbage collection, method calls, logs, etc.

Data Layer: database connections, slow queries, table sizes, read/write performance.

Dependency Layer: call chains, latency, unauthorized calls.
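As a rough sketch of how a check in one of these layers might look, the example below tags each check with its layer and evaluates a sampled value against a threshold. The `Check` class, the `evaluate` helper, and the 90% threshold are assumptions made for illustration; only `shutil.disk_usage` is a real standard-library call.

```python
import shutil
from dataclasses import dataclass


@dataclass(frozen=True)
class Check:
    layer: str        # "resource", "infrastructure", "data", or "dependency"
    name: str
    threshold: float  # alert when the sampled value exceeds this


def evaluate(check: Check, value: float) -> bool:
    """Return True when the sampled value breaches the check's threshold."""
    return value > check.threshold


def disk_used_percent(path: str = "/") -> float:
    """Sample one resource-layer metric: percentage of disk space in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


# Hypothetical threshold; a real value depends on the system being monitored.
disk_check = Check(layer="resource", name="disk_used_percent", threshold=90.0)
value = disk_used_percent()
print(f"{disk_check.layer}/{disk_check.name}: {value:.1f}% breached={evaluate(disk_check, value)}")
```

Checks for the other layers would follow the same shape, with the sampling function swapped for a JVM, database, or call-chain probe.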

3.3.2 Business Monitoring

Business monitoring targets system functionality and business data. It covers functional monitoring (user actions, interface health, scheduled tasks) and data monitoring (business metrics, thresholds, data consistency).
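A data consistency check, one of the data monitoring items above, might look like the following sketch. It assumes a simplified model in which every order should have exactly one payment of the same amount; the function name and the sample records are hypothetical.

```python
def find_inconsistent_orders(orders: dict[str, int], payments: dict[str, int]) -> list[str]:
    """Return IDs of orders whose payment is missing or differs in amount (cents)."""
    return sorted(
        order_id
        for order_id, amount in orders.items()
        if payments.get(order_id) != amount
    )


# Hypothetical sample data: amounts in cents, keyed by order ID.
orders = {"o-1": 1200, "o-2": 4500, "o-3": 999}
payments = {"o-1": 1200, "o-2": 4000}

mismatches = find_inconsistent_orders(orders, payments)
print(mismatches)
```

Here `o-2` is flagged because the amounts differ and `o-3` because no payment exists at all. A scheduled job could run such a check and raise an alert when the list is non-empty.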

3.4 Monitoring Stakeholders

Different roles prioritize different monitoring points. Product teams focus on user behavior and business metrics, while technical teams watch system resources and error logs. A clear stakeholder mapping helps prevent noisy alerts.
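One way to make the stakeholder mapping explicit is a small routing table, sketched below. The category names, team names, and the fallback to the technical team for unmapped categories are all assumptions for illustration.

```python
# Hypothetical mapping from alert category to the teams that care about it.
ROUTES: dict[str, list[str]] = {
    "user_behavior": ["product"],
    "business_metric": ["product"],
    "system_resource": ["technical"],
    "error_log": ["technical"],
}


def recipients(category: str) -> list[str]:
    """Look up who should receive an alert of the given category."""
    # Assumption: unmapped categories fall back to the technical team.
    return ROUTES.get(category, ["technical"])


print(recipients("business_metric"))
print(recipients("error_log"))
```

Keeping the table in one place makes it easy to review who receives what, and to spot categories that nobody owns.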

04 Monitoring Implementation

After defining monitoring points, design an implementation plan that includes monitoring layers, items, specific points, tools, alert strategies, notification channels, and responsible parties. Example tools: Prometheus for metrics collection and Grafana for visualization.
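The plan elements listed above can be captured as one record per monitoring point, which makes incomplete entries easy to detect. The sketch below is one possible shape; the `PlanEntry` class, its field names, and the sample values are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PlanEntry:
    layer: str                # e.g. "resource" or "data"
    item: str                 # what is monitored, e.g. "disk"
    point: str                # the specific metric or check
    tool: str                 # e.g. "Prometheus"
    alert_strategy: str       # when and how often to alert
    channels: list[str] = field(default_factory=list)  # notification channels
    owner: str = ""           # responsible party


def missing_fields(entry: PlanEntry) -> list[str]:
    """List the plan fields that are still empty for this entry."""
    return [name for name, value in vars(entry).items() if not value]


# Hypothetical entry with no notification channel or owner assigned yet.
entry = PlanEntry(
    layer="resource",
    item="disk",
    point="disk_used_percent",
    tool="Prometheus",
    alert_strategy="alert when above 90% for 5 minutes",
)
print(missing_fields(entry))
```

An entry that still reports missing fields, especially an empty owner, is a sign the monitoring point is not ready to go live.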

05 Monitoring Value Evaluation

Assess the effectiveness of each monitoring item against the original goal of improving issue perception and reducing troubleshooting cost; discard ineffective or noisy alerts that cause alert fatigue.
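A simple way to approach this evaluation is to track, per monitoring item, how many alerts fired and how many pointed to a real issue. The sketch below flags items whose actionable ratio falls under a cutoff. The `AlertStats` class, the 0.5 cutoff, and the sample numbers are assumptions for illustration, not measured data.

```python
from dataclasses import dataclass


@dataclass
class AlertStats:
    name: str
    fired: int       # alerts fired during the review period
    actionable: int  # alerts that pointed to a real issue


def actionable_ratio(stats: AlertStats) -> float:
    """Share of fired alerts that were actionable; 0.0 if none fired."""
    return stats.actionable / stats.fired if stats.fired else 0.0


def noisy_items(items: list[AlertStats], min_ratio: float = 0.5) -> list[str]:
    """Names of items that fired at least once but were mostly not actionable."""
    return [s.name for s in items if s.fired > 0 and actionable_ratio(s) < min_ratio]


# Hypothetical review data.
review = [
    AlertStats("disk_used_percent", fired=10, actionable=9),
    AlertStats("cpu_spike", fired=40, actionable=4),
    AlertStats("order_volume_drop", fired=0, actionable=0),
]
print(noisy_items(review))
```

Items that never fired, such as `order_volume_drop` here, are not flagged as noisy, but they deserve a separate look: either the system is healthy or the check is not working.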

06 Conclusion

Effective monitoring requires thoughtful design, stakeholder alignment, and continuous evaluation to ensure alerts are meaningful and help maintain system reliability.

Written by

360 Quality & Efficiency

360 Quality & Efficiency focuses on seamlessly integrating quality and efficiency in R&D, sharing 360’s internal best practices with industry peers to foster collaboration among Chinese enterprises and drive greater efficiency value.
