Visualizing Business‑Process Monitoring with Grafana, Diagram & FlowCharting
This article examines the evolution of a monitoring platform, identifies key challenges such as alarm overload and fragmented data, and presents a solution that combines Grafana with Diagram and FlowCharting plugins to create business‑process‑oriented, data‑driven visualizations for faster issue resolution.
Background
Over many years we have explored and practiced building a monitoring platform, with requirements evolving alongside architecture and business scale: from Nagios/Zabbix to Prometheus; from relational and NoSQL databases to time‑series databases; from server hardware status to application availability; from servers, networks, middleware, databases to application access chains; and from traditional to cloud‑native architectures. Yet the core goal of operations remains unchanged – to serve the business.
Problem
During operation we face several issues:
Alarm overload
Alarm data scattered across monitoring subsystems
Separation of monitoring from business processes
The separation of monitoring from business processes is the issue most often overlooked. As architecture and scale grow, multi‑dimensional monitoring can improve application availability, but its alerts are not tied to business workflows. Domain experts must interpret each alert, which prolongs incident resolution and puts SLAs at risk.
Requirements
Although we collect over 200,000 monitoring metrics from the business architecture, they remain isolated and cannot be linked to pinpoint problems precisely. New requirements include:
Monitoring aligned with business processes, so that metrics and workflows are connected.
Extracting distinct business‑monitoring scenarios from unordered data.
Providing visual monitoring that combines graphics, data, and business workflows.
In short, we need monitoring that is tightly coupled with business processes and visualized to quickly locate problematic nodes.
Solution
Since monitoring data resides in various backends (Elasticsearch, Prometheus, Zabbix, and others), we leverage Grafana’s multi‑data‑source capability and rich plugins for visualisation. The remaining gap is a complete business‑process layer that binds graphics and data together.
We arrived at two concrete solutions:
Grafana + Diagram
Grafana + FlowCharting
Grafana integrates the data sources, while the Diagram or FlowCharting plugins generate process‑oriented diagrams, using regex to extract data from the sources for display.
Diagram
Diagram uses the mermaid.js library to create flowcharts, sequence diagrams, and Gantt charts.
Charts are defined with Mermaid syntax.
Metric series are used to color the shapes/nodes.
The target (or "alias") of each series is compared with the node IDs in the chart; where they match, the corresponding style is applied to that node.
Combining series aggregates multiple metrics for a node, allowing clear identification of problematic nodes.
Example Mermaid code:
<code>graph LR
LB[Load Balancer] -- route1 --> web1
LB[Load Balancer] --> web2
web1 --> app1(fa:fa-check app1)
web1 ==> app2
web2 ==> app2(fa:fa-ban app2)
web2 --> app1
app1 --> D[(database)]</code>
The resulting diagram shows app2 as a combined node aggregating three metrics (app2_1, app2_2, app2_3). When a metric exceeds its threshold, the diagram highlights the specific sub‑metric, enabling rapid problem localisation.
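The matching and combining behavior described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the Diagram plugin's implementation: the threshold values, the color names, the `color_nodes` helper, and the sample metric values are all assumptions made for the example.

```python
from __future__ import annotations

# Hypothetical global thresholds and colors (assumed for this sketch).
WARN, CRIT = 70.0, 90.0
COLORS = {"ok": "green", "warning": "orange", "critical": "red"}


def level(value: float) -> str:
    """Map a metric value to a severity level using the global thresholds."""
    if value >= CRIT:
        return "critical"
    if value >= WARN:
        return "warning"
    return "ok"


def color_nodes(node_ids: list[str],
                series: dict[str, float],
                composites: dict[str, list[str]]) -> dict[str, dict]:
    """Match series aliases to node IDs and pick a fill color per node.

    A composite node is colored by its worst member series, and the
    name of that member is reported so the failing sub-metric is visible.
    """
    styles = {}
    for node in node_ids:
        # A plain node is matched by its own ID; a composite by its members.
        members = composites.get(node, [node])
        present = {m: series[m] for m in members if m in series}
        if not present:
            continue  # no matching series: leave the node unstyled
        worst = max(present, key=present.get)
        lvl = level(present[worst])
        styles[node] = {"fill": COLORS[lvl], "level": lvl, "metric": worst}
    return styles


# Node IDs from the Mermaid example; the metric values are made up.
node_ids = ["LB", "web1", "web2", "app1", "app2", "D"]
series = {"web1": 35.0, "app1": 72.0,
          "app2_1": 40.0, "app2_2": 95.0, "app2_3": 10.0}
composites = {"app2": ["app2_1", "app2_2", "app2_3"]}

styles = color_nodes(node_ids, series, composites)
print(styles["app2"])
```

With these sample values, app2 is flagged because of app2_2, while nodes without a matching series (such as LB) stay unstyled.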
Limitations of Diagram:
Complex business processes produce diagrams that cannot be easily zoomed or viewed in detail.
Mermaid diagrams require maintenance when business processes change.
Thresholds are global; individual metrics cannot have separate thresholds.
FlowCharting
FlowCharting displays complex charts using the online diagram library draw.io, supporting various diagram types such as network, power, traffic, industrial processes, UML, and workflow diagrams (Jenkins, Ansible Tower, OpenShift).
Key features include:
Monitoring status and performance.
Interactive chart elements.
Dynamic object display based on data or state changes.
Linking to target objects.
Using variables to modify shapes, colors, links, download paths, etc.
Regex‑based match and replace.
FlowCharting, combined with draw.io, enables real‑time data interaction within the diagram, offering more flexible metric‑specific thresholds compared to Diagram.
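To make the difference from a single global threshold concrete, here is a minimal Python sketch of regex‑based rules that each carry their own thresholds. It does not reproduce FlowCharting's configuration format or code; the `Rule` class, the `evaluate` function, the shape IDs, the metric names, and the threshold values are all hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    pattern: str    # regex matched against the full series name
    shape_id: str   # ID of the diagram shape to color (hypothetical)
    warn: float     # per-rule warning threshold
    crit: float     # per-rule critical threshold


SEVERITY = {"green": 0, "orange": 1, "red": 2}


def evaluate(rules, series):
    """Return {shape_id: color}, keeping the most severe color per shape."""
    colors = {}
    for name, value in series.items():
        for rule in rules:
            if re.fullmatch(rule.pattern, name) is None:
                continue
            if value >= rule.crit:
                color = "red"
            elif value >= rule.warn:
                color = "orange"
            else:
                color = "green"
            current = colors.get(rule.shape_id, "green")
            if SEVERITY[color] >= SEVERITY[current]:
                colors[rule.shape_id] = color
            break  # the first matching rule wins for this series
    return colors


# Each rule has its own thresholds: latency in ms, error rate as a fraction.
rules = [
    Rule(r"order_api_latency_ms", "order-service", warn=200, crit=500),
    Rule(r"payment_.*_error_rate", "payment-service", warn=0.01, crit=0.05),
]
series = {
    "order_api_latency_ms": 320.0,
    "payment_card_error_rate": 0.002,
    "payment_wallet_error_rate": 0.08,
    "unrelated_metric": 1.0,
}

shape_colors = evaluate(rules, series)
print(shape_colors)
```

Because thresholds live on the rule rather than on the panel, a latency metric and an error‑rate metric can share one diagram without sharing a scale.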
Remaining Gaps
Both Diagram and FlowCharting satisfy the need for graphics + data + business‑process visual monitoring, but they require:
Complete source data – continuous multi‑dimensional metric collection to enrich business‑process dependencies.
Ability to merge multiple data sources in a single diagram – Grafana lets each panel choose its own data source, but we have not yet found a way to feed one diagram panel from several sources at once, which prevents centralized display.
The first point is a long‑term foundational effort; the second demands ongoing exploration for a breakthrough.
Conclusion
With this solution in place, the remaining challenge is understanding the business processes themselves, which requires extensive communication with business, development, and testing teams, as well as keeping up with process changes. Only by mastering the business can operations truly serve it.
By continuously refining and visualising business workflows, teams can locate issues faster, and the intuitive visualisation helps new members get up to speed, driving overall team progress.
Efficient Ops
This public account is maintained by Xiaotianguo and friends, regularly publishing widely-read original technical articles. We focus on operations transformation and accompany you throughout your operations career, growing together happily.