System Performance Issue Analysis, Diagnosis, and Optimization for Business Applications
This article explains how to analyze, diagnose, and optimize performance problems in production business systems, covering the typical causes such as high concurrency, data growth, hardware limits, and environment changes, and detailing practical steps for hardware, OS, database, middleware, JVM tuning, code review, and APM monitoring.
Today we discuss the analysis, diagnosis, and optimization of performance issues that arise in business systems after they go live. The focus is on identifying root causes and applying targeted improvements.
System Performance Issue Analysis Process
When a system that performed well in pre‑release suddenly shows severe performance degradation, the main scenarios are high concurrent access, data volume growth, and changes in critical environment factors such as network bandwidth.
First determine whether the problem exists under single‑user (non‑concurrent) conditions or only under load. Single‑user issues usually stem from code or SQL inefficiencies, while concurrent issues often require analysis of the database and middleware.
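The single-user versus concurrent comparison can be sketched in plain Java. This is a minimal illustration, not a load-testing tool: the class name `LoadComparison`, the 20 ms sleep, and the shared lock are all hypothetical stand-ins for a real business call that contends on a shared resource.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class LoadComparison {

    /** Runs the task once per simulated user in parallel and returns the average latency in nanoseconds. */
    static long averageLatencyNanos(Runnable task, int users) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(users);
        try {
            List<Callable<Long>> calls = new ArrayList<>();
            for (int i = 0; i < users; i++) {
                calls.add(() -> {
                    long start = System.nanoTime();
                    task.run();
                    return System.nanoTime() - start;
                });
            }
            long total = 0;
            // invokeAll blocks until every simulated user has finished.
            for (Future<Long> result : pool.invokeAll(calls)) {
                total += result.get();
            }
            return total / users;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        Object sharedLock = new Object();
        // Hypothetical business call: fast alone, but serialized on a shared lock.
        Runnable task = () -> {
            synchronized (sharedLock) {
                try {
                    Thread.sleep(20);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        };
        System.out.printf("1 user avg:  %d ms%n", averageLatencyNanos(task, 1) / 1_000_000);
        System.out.printf("8 users avg: %d ms%n", averageLatencyNanos(task, 8) / 1_000_000);
    }
}
```

If the single-user figure is already poor, the code or SQL behind the call is the first suspect. If only the multi-user figure degrades, contention in shared resources such as locks, connection pools, or the database is more likely.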
During load testing, monitor CPU, memory, and JVM to detect problems like memory leaks that may also cause performance anomalies under concurrency.
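As a rough sketch of JVM-side monitoring during a load test, the standard `java.lang.management` API can report heap usage from inside the process. The class `HeapSampler` and its "steadily growing" rule are assumptions made for illustration. A real leak check would normally compare usage after full GCs, for example from GC logs or `jstat`.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

final class HeapSampler {

    /** Returns the heap currently in use, in bytes, as reported by the JVM. */
    static long usedHeapBytes() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        return heap.getUsed();
    }

    /**
     * Crude leak signal (an assumption for this sketch): true when every
     * sample is strictly higher than the one before it.
     */
    static boolean isSteadilyGrowing(long[] samples) {
        if (samples.length < 2) {
            return false;
        }
        for (int i = 1; i < samples.length; i++) {
            if (samples[i] <= samples[i - 1]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("used heap (bytes): " + usedHeapBytes());
    }
}
```

Samples taken at arbitrary moments include garbage that has not been collected yet, so a rising series is only a hint that deserves a closer look, not proof of a leak.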
Performance Issue Influencing Factors
The factors can be grouped into three main areas: hardware environment, software runtime environment, and the application code itself.
Hardware Environment
Hardware includes compute, storage, and network resources. Even with similar tpmC ratings, x86 servers may underperform compared to mainframes. Storage I/O performance is often a hidden bottleneck: when disks cannot keep up, requests queue behind them, and the symptom tends to be high CPU (I/O wait) and memory usage rather than an obvious disk problem.
Standard Linux tools such as iostat, ps, sar, top, and vmstat show CPU, memory, and disk I/O usage, including the resources consumed by the JVM process.
Running Environment – Database and Middleware
Database tuning (e.g., Oracle) involves optimizing disk I/O, rollback segments, redo logs, system global area, and object structures. Continuous monitoring is required, and DBA teams often extract high‑cost SQL statements for developers to optimize.
Middleware (WebLogic, Tomcat, etc.) tuning focuses on JVM launch parameters, thread pool sizes, and connection pool limits. In cluster deployments, adding middleware nodes (horizontal scaling) can alleviate load, but database clusters usually scale out far less readily, so the database often remains the limiting tier.
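Containers such as Tomcat and WebLogic expose pool sizes through their own configuration, which is not reproduced here. The underlying idea of a bounded pool can be sketched with the JDK's `ThreadPoolExecutor`. The class name `BoundedPool` and the sizes in the usage example are hypothetical, not recommended values.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class BoundedPool {

    /**
     * Builds a pool with a hard thread limit and a bounded queue.
     * When both are full, CallerRunsPolicy makes the submitting thread
     * run the task itself, which slows producers down instead of letting
     * work pile up without limit.
     */
    static ThreadPoolExecutor create(int coreThreads, int maxThreads, int queueCapacity) {
        return new ThreadPoolExecutor(
                coreThreads,
                maxThreads,
                60L, TimeUnit.SECONDS,               // idle time before extra threads are released
                new ArrayBlockingQueue<>(queueCapacity),
                new ThreadPoolExecutor.CallerRunsPolicy());
    }

    public static void main(String[] args) {
        ThreadPoolExecutor pool = create(4, 16, 100); // illustrative sizes only
        pool.execute(() -> System.out.println("handled by " + Thread.currentThread().getName()));
        pool.shutdown();
    }
}
```

The design choice to illustrate is that every limit is explicit. An unbounded queue or thread count hides overload until memory or the database connection limit gives out.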
Typical JVM tuning parameters:
-Xmx # set maximum heap size
-Xms # set initial heap size
-XX:MaxNewSize # max new generation size
-XX:NewSize # initial new generation size
-XX:MaxPermSize # max permanent generation size (Java 7 and earlier)
-XX:PermSize # initial permanent generation size (Java 7 and earlier)
-XX:MaxMetaspaceSize # max metaspace size (Java 8 and later)
-XX:MetaspaceSize # metaspace usage that first triggers a GC (Java 8 and later)
-Xss # thread stack size
A common guideline is to set -Xmx and -Xms to 3 to 4 times the old-generation usage measured after a full GC, the metaspace to 1.2 to 1.5 times that usage, and the young generation (-Xmn) to 1 to 1.5 times that usage.
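The sizing guideline can be worked through as simple arithmetic. The sketch below assumes that the young-generation multiple uses the same baseline as the heap multiple, namely the old-generation usage observed after a full GC. The 512 MB figure and the class name `HeapSizing` are hypothetical, and metaspace is left out.

```java
final class HeapSizing {

    /** Heap range for -Xms/-Xmx: 3 to 4 times the post-full-GC old-generation usage. */
    static long[] heapRangeMb(long liveOldGenMb) {
        return new long[] {3 * liveOldGenMb, 4 * liveOldGenMb};
    }

    /** Young generation range for -Xmn: 1 to 1.5 times the same baseline (assumed). */
    static long[] youngRangeMb(long liveOldGenMb) {
        return new long[] {liveOldGenMb, liveOldGenMb * 3 / 2};
    }

    public static void main(String[] args) {
        long liveOldGenMb = 512; // hypothetical measurement taken from GC logs
        long[] heap = heapRangeMb(liveOldGenMb);
        long[] young = youngRangeMb(liveOldGenMb);
        System.out.printf("-Xms/-Xmx: %d to %d MB%n", heap[0], heap[1]); // 1536 to 2048 MB
        System.out.printf("-Xmn: %d to %d MB%n", young[0], young[1]);    // 512 to 768 MB
    }
}
```

These ranges are a starting point for a load test, not a final setting. The measured baseline should come from the application under realistic load.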
Software Code Issues
Common code‑level performance problems include initializing large objects inside loops, failing to release resources (memory leaks), not using caching where appropriate, long‑running transactions, and choosing sub‑optimal data structures or algorithms.
These issues are best uncovered through static code analysis, code reviews, and profiling tools.
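One of the items above, missing caching, can be illustrated with a small memoizing wrapper. This is a sketch under stated assumptions: `CachedLookup` is a hypothetical helper, the loader stands in for an expensive query or remote call, and the load counter exists only to make the effect visible.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

final class CachedLookup<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader;
    private final AtomicInteger loads = new AtomicInteger();

    CachedLookup(Function<K, V> loader) {
        this.loader = loader;
    }

    /** Returns the cached value, invoking the loader only on the first request for a key. */
    V get(K key) {
        return cache.computeIfAbsent(key, k -> {
            loads.incrementAndGet();
            return loader.apply(k);
        });
    }

    /** Number of times the expensive loader actually ran. */
    int loadCount() {
        return loads.get();
    }

    public static void main(String[] args) {
        CachedLookup<String, Integer> lengths = new CachedLookup<>(String::length);
        lengths.get("order-service");
        lengths.get("order-service");
        System.out.println("loader calls: " + lengths.loadCount()); // 1
    }
}
```

This cache never evicts entries, so with an unbounded key space it would itself become a memory leak. A production cache needs a size limit or expiry policy.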
APM and Monitoring for Performance Discovery
Performance problems can be detected via IT resource monitoring and APM (Application Performance Management) tools, or through user feedback. APM links resource usage to specific application services, SQL statements, and business functions, enabling rapid pinpointing of bottlenecks.
By integrating APM with service‑chain monitoring, teams can quickly identify which service call or SQL query is slow, dramatically improving diagnosis efficiency.
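The core idea behind this kind of pinpointing, attributing elapsed time to named operations, can be sketched in a few lines. This is not how any particular APM product works: `CallTimer` and the operation names are hypothetical, and real tools typically instrument calls automatically and propagate trace context across services.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Supplier;

final class CallTimer {
    private final Map<String, LongAdder> totalNanos = new ConcurrentHashMap<>();
    private final Map<String, LongAdder> calls = new ConcurrentHashMap<>();

    /** Runs the body and records its elapsed time under the given operation name. */
    <T> T time(String name, Supplier<T> body) {
        long start = System.nanoTime();
        try {
            return body.get();
        } finally {
            totalNanos.computeIfAbsent(name, k -> new LongAdder()).add(System.nanoTime() - start);
            calls.computeIfAbsent(name, k -> new LongAdder()).increment();
        }
    }

    /** Number of recorded calls for an operation, or 0 if it was never seen. */
    long callCount(String name) {
        LongAdder count = calls.get(name);
        return count == null ? 0 : count.sum();
    }

    /** Name of the operation with the highest average latency, or null if nothing was recorded. */
    String slowest() {
        String worst = null;
        long worstAvg = -1;
        for (Map.Entry<String, LongAdder> entry : totalNanos.entrySet()) {
            long n = callCount(entry.getKey());
            long avg = n == 0 ? 0 : entry.getValue().sum() / n;
            if (avg > worstAvg) {
                worstAvg = avg;
                worst = entry.getKey();
            }
        }
        return worst;
    }
}
```

Wrapping a suspect call as `timer.time("queryOrders", () -> dao.queryOrders(id))`, where `dao.queryOrders` is a placeholder for any service or SQL call, gives per-operation averages that can be ranked to find the slow link.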
Overall, a systematic approach provides a comprehensive strategy for resolving business system performance issues: start with hardware checks, move through the OS and middleware, and finish by scrutinizing application code, with continuous monitoring and APM in place throughout.
Top Architect
Top Architect focuses on sharing practical architecture knowledge, covering enterprise, system, website, large‑scale distributed, and high‑availability architectures, plus architecture adjustments using internet technologies. We welcome idea‑driven, sharing‑oriented architects to exchange and learn together.