Mastering Service Performance: CPU, Memory, JVM & Linux Monitoring Guide
This guide explains how to monitor and tune service performance by examining CPU load, system and JVM memory usage, and buffer/cache behavior. It covers key metrics such as response time, throughput, and QPS, and lists essential Linux tools and commands for operations work.
1. Service Incident Handling Process
Outline of steps to address service anomalies.
2. Load
2.1 Check CPU load
<code>top -b -n 1 | grep java | awk '{print "VIRT:"$5,"RES:"$6,"cpu:"$9"%","mem:"$10"%"}'</code>
2.2 Find high‑CPU threads
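Commands (25603 is the example Java process ID, 25842 the busy thread ID reported by top):
<code>top -p 25603 -H             # list the threads of process 25603 with per-thread CPU usage
printf '0x%x\n' 25842       # convert the busy thread ID to hex: 0x64f2
jstack 25603 | grep 0x64f2  # find that thread (nid=0x64f2) in the JVM thread dump
cat /proc/interrupts        # per-CPU interrupt counts</code>
Monitor CPU via interrupts, context switches, run-queue length, and utilization.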
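The thread lookup above can be strung together in a small script. This is only a sketch, not part of the original workflow: the function names `tid_to_hex` and `hot_thread_stack` are hypothetical, and it assumes a procps `top` that supports `-H -b -n 1 -p`, plus a `jstack` on the PATH run as the user that owns the JVM.

```shell
#!/bin/sh
# Convert a decimal Linux thread ID to the hex form jstack prints as nid=0x...
tid_to_hex() {
    printf '0x%x\n' "$1"
}

# Print the stack of the busiest thread in a JVM (hypothetical helper).
# $1 = JVM process ID. Assumes procps top and jstack are available.
hot_thread_stack() {
    pid=$1
    # In batch mode top lists threads sorted by %CPU; take the first row
    # after the column header. A single sample is only an approximate ranking.
    tid=$(top -H -b -n 1 -p "$pid" |
        awk 'found { print $1; exit } $1 == "PID" { found = 1 }')
    [ -n "$tid" ] || { echo "no threads found for pid $pid" >&2; return 1; }
    nid=$(tid_to_hex "$tid")
    echo "busiest thread: tid=$tid nid=$nid"
    # Show the matching thread header plus the next 20 lines of its trace.
    jstack "$pid" | grep -A 20 "nid=$nid "
}
```

Calling `hot_thread_stack 25603` would then replace the three manual steps for the example process. Taking several samples a few seconds apart gives a more reliable picture than one snapshot.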
3. Memory
3.1 System memory
Use <code>free</code> to view memory statistics (values shown in KB).
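<code>             total       used       free     shared    buffers     cached
Mem:       3266180    3250000      10000          0     201000    3002000
-/+ buffers/cache:      47000    3213000
Swap:      2048276      80160    1968116</code>
The Mem row counts buffers and cache as used. The -/+ buffers/cache row shows the application's view: used minus buffers and cache (3250000 - 201000 - 3002000 = 47000 KB) and free plus buffers and cache (10000 + 201000 + 3002000 = 3213000 KB). The second figure is the memory actually usable by applications, because the kernel reclaims buffers and cache on demand.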
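The usable-memory arithmetic can also be scripted. The following is a minimal sketch that assumes the older `free` layout shown above, with separate `buffers` and `cached` columns; newer procps versions print `buff/cache` and `available` columns instead, so these field positions would not apply there. `usable_kb` is a hypothetical helper name.

```shell
#!/bin/sh
# Sum free + buffers + cached from the "Mem:" row of old-style free output.
# Reads the output on stdin; assumed columns after the row label:
# total used free shared buffers cached.
usable_kb() {
    awk '$1 == "Mem:" { print $4 + $6 + $7 }'
}

# Usage on a system that still prints this layout:
#   free | usable_kb
```

Fed the sample output above, the function adds 10000 + 201000 + 3002000, matching the free column of the -/+ buffers/cache row.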
What are buffer/cache?
Buffer cache stores block‑device data; page cache stores file data and is used by mmap and file I/O.
3.2 Process memory
3.2.1 Process memory statistics
Read <code>/proc/[pid]/status</code> for VmSize, VmRSS, VmData, VmStk, VmExe, VmLib, etc.
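When only one or two of these fields are needed in a script, they can be extracted with awk. This is a small sketch; `vm_field` is a hypothetical helper name, and it assumes the usual `Name:<whitespace>value kB` layout of the status file.

```shell
#!/bin/sh
# Print the numeric value (in kB) of one field from /proc/[pid]/status-style text.
# $1 = field name without the colon (for example VmRSS); input is read from stdin.
vm_field() {
    awk -v key="$1:" '$1 == key { print $2 }'
}

# Usage: resident set size of the current shell.
#   vm_field VmRSS < /proc/$$/status
```

The sample that follows shows the raw file format for a gedit process.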
<code>Name: gedit
State: S (sleeping)
Tgid: 9744
Pid: 9744
PPid: 7672
TracerPid: 0
VmPeak: 60184 kB
VmSize: 60180 kB
VmLck: 0 kB
VmHWM: 18020 kB
VmRSS: 18020 kB
VmData: 12240 kB
VmStk: 84 kB
VmExe: 576 kB
VmLib: 21072 kB
VmPTE: 56 kB
Threads: 1</code>
3.2.2 JVM memory allocation
The JVM divides memory into heap (young and old generations) and non‑heap (method area, thread stacks, native buffers, etc.).
The method area (PermGen before Java 8, Metaspace from Java 8 on) stores class metadata.
Each thread has its own JVM stack, which holds the frames of its method calls.
Native memory includes DirectByteBuffer allocations, JNI libraries, and memory‑mapped files.
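Typical JVM options (the PermGen settings apply to Java 7 and earlier; Java 8 and later replaced PermGen with Metaspace, sized via -XX:MetaspaceSize and -XX:MaxMetaspaceSize):
<code>-XX:+UseConcMarkSweepGC -XX:+CMSPermGenSweepingEnabled -XX:+CMSClassUnloadingEnabled
-XX:PermSize=256M -XX:MaxPermSize=512M</code>
3.2.3 Direct memory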
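Direct memory is off‑heap memory allocated through NIO DirectByteBuffer (ByteBuffer.allocateDirect). It is not part of the JVM heap, and its size can be capped with -XX:MaxDirectMemorySize.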
3.2.4 JVM memory analysis
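Commands to inspect heap and generations (jhat was removed in JDK 9):
<code>jmap -heap [pid]                          # heap configuration and per-generation usage
jstat -gcutil [pid]                       # generation utilization (%) and GC counts/times
jmap -histo:live [pid]                    # histogram of live objects (forces a full GC)
jmap -dump:format=b,file=heapDump [pid]   # write a binary heap dump to the file heapDump
jhat -port 5000 heapDump                  # browse the dump over HTTP on port 5000</code>
4. Service Metrics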
4.1 Response Time (RT)
Measures the total time a request takes to be processed; important for user‑perceived performance.
4.2 Throughput
Number of requests processed per unit time; inversely related to response time for single‑user workloads.
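As a worked example of that relationship: a single user issuing requests back to back with a 200 ms response time gets 1 / 0.2 s = 5 requests per second. With N requests in flight, throughput is roughly N / RT (Little's Law) as long as the server is not saturated. The sketch below encodes that estimate; `throughput_rps` is a hypothetical helper name, and the model assumes no think time between requests.

```shell
#!/bin/sh
# Estimate throughput (requests/second) from concurrency and average response time.
# $1 = number of concurrent in-flight requests, $2 = average response time in ms.
# Assumes a closed loop with no think time and a server that is not saturated.
throughput_rps() {
    awk -v n="$1" -v rt_ms="$2" 'BEGIN {
        if (rt_ms <= 0) { print "response time must be > 0" > "/dev/stderr"; exit 1 }
        printf "%.2f\n", n * 1000 / rt_ms
    }'
}

# Usage:
#   throughput_rps 1 200    # one user, 200 ms per request
#   throughput_rps 10 200   # ten concurrent users at the same response time
```

In practice response time rises as concurrency grows, so measured throughput flattens out well before this linear estimate does.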
4.3 Concurrent Users
Number of users that can simultaneously use the system; a coarse performance indicator.
4.4 QPS (Queries Per Second)
Measures how many queries a server handles each second; often used for DNS or database servers.
4.5 CPU Utilization & Context Switch Rate
CPU load average, context switches, and interrupt rates affect overall throughput; high switch rates can degrade performance.
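The context switch rate can be derived from the cumulative `ctxt` counter in /proc/stat by sampling it twice. A minimal sketch follows; the function names are hypothetical, and it assumes a Linux /proc filesystem and whole-second intervals.

```shell
#!/bin/sh
# Read the cumulative context-switch counter (the "ctxt" line) from /proc/stat.
read_ctxt() {
    awk '$1 == "ctxt" { print $2 }' /proc/stat
}

# Per-second rate between two counter samples (integer division).
# $1 = earlier sample, $2 = later sample, $3 = seconds between them.
ctxt_per_sec() {
    echo $(( ($2 - $1) / $3 ))
}

# Sample the counter twice, $1 seconds apart (default 1), and print the rate.
ctxt_rate() {
    interval=${1:-1}
    before=$(read_ctxt)
    sleep "$interval"
    after=$(read_ctxt)
    ctxt_per_sec "$before" "$after" "$interval"
}
```

Running `ctxt_rate 5` averages over five seconds. The `cs` column of `vmstat 1` reports the same quantity without any scripting.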
5. Tools
uptime
dmesg
top
vmstat
iostat
sar
mpstat
netstat
iptraf
tcpdump
tcptrace
netperf
dstat
Source: Zane Blog, "Service Tuning".