Master Linux Server Performance: Essential Monitoring Tools & How to Use Them
This guide explains how to monitor Linux server performance using built‑in tools such as top, vmstat, pidstat, iostat, netstat, sar and tcpdump, interpreting their output to diagnose CPU, memory, disk I/O and network issues quickly and effectively.
CPU and Memory Monitoring
Linux servers expose many parameters that are crucial for both operations staff and developers when troubleshooting abnormal program behavior. Simple tools like top read data from /proc and /sys to show load averages, task states, and detailed CPU usage categories (us, sy, ni, id, wa, hi, si, st).
The first line of top displays the 1-, 5-, and 15-minute load averages; values persistently exceeding the number of CPU cores indicate saturation. The second line lists task counts (running, sleeping, stopped, zombie). The CPU line breaks down CPU time by user, system, nice, idle, iowait, hardware interrupt, soft interrupt, and steal (relevant in virtualized environments).
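The load-average check can be scripted directly; a minimal sketch that reads the same /proc file top consults and compares the 1-minute value against the core count (assumes a Linux host with nproc available):

```shell
# Read the 1-, 5-, and 15-minute load averages from /proc/loadavg,
# the same source top uses for its first line.
read one five fifteen rest < /proc/loadavg
cores=$(nproc)
echo "load averages: $one $five $fifteen across $cores core(s)"

# Shell arithmetic is integer-only, so compare the floats with awk.
awk -v load="$one" -v cores="$cores" 'BEGIN {
    if (load > cores)
        print "1-min load exceeds core count: possible saturation"
    else
        print "1-min load within core count"
}'
```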
High us suggests a specific process is consuming CPU; it can be identified with top and investigated further with perf. Elevated sy often points to heavy I/O or kernel activity. Excessive ni indicates time spent on processes whose priority was manually lowered. A large wa means the CPU is waiting on slow I/O, while high hi or si can signal hardware or driver problems. A noticeable st value may reveal CPU over-commitment on a virtual machine.
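As a quick first step before reaching for perf, a ps one-liner (available wherever top is) sorts processes by CPU usage to surface the likely culprit behind a high us value:

```shell
# List the five busiest processes by CPU usage; the top entry is the
# usual candidate for deeper inspection with perf.
ps -eo pid,comm,%cpu,%mem --sort=-%cpu | head -n 6
```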
The memory section of top shows total, used, free, buffer, and cache memory. Buffers hold raw block-device data and metadata, whereas Cached stores file contents from the page cache. Avail Mem approximates free + buffers + cached and indicates how much memory is available to new workloads without swapping. Frequent swap activity signals memory pressure.
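The same figures top displays can be read straight from /proc/meminfo; a small sketch that prints the totals in MiB (MemAvailable is the kernel's own estimate of memory usable without swapping):

```shell
# Print the key memory counters that top's memory section is built from.
# /proc/meminfo reports values in kB; convert to MiB for readability.
awk '/^(MemTotal|MemFree|MemAvailable|Buffers|Cached|SwapTotal|SwapFree):/ { printf "%-14s %10.1f MiB\n", $1, $2/1024 }' /proc/meminfo
```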
Other Command‑Line Tools
vmstat provides a snapshot of processes, memory, paging, block I/O, and CPU activity. Columns such as r (runnable processes), b (processes in uninterruptible sleep), swpd (swap in use), bi/bo (blocks read in/written out), in (interrupts per second), and cs (context switches per second) help assess overall load.
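A typical invocation samples once per second for a few iterations; a sketch that also guards against vmstat being absent (it ships in the procps package, which most but not all minimal systems include):

```shell
# Sample system activity once per second, three times.
# Note: the first output line is averages since boot, so read later lines.
if command -v vmstat >/dev/null 2>&1; then
    vmstat 1 3
else
    echo "vmstat not installed (procps package)"
fi
```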
pidstat focuses on individual processes, reporting per-thread CPU usage, page faults (minor minflt/s, major majflt/s), stack size, and context-switch rates. Options such as -t show thread-level detail, -r displays memory faults, -s shows stack usage, -u reports CPU percentages, and -w reports voluntary and involuntary context switches.
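The flags combine in a single run; a sketch that monitors the current shell's own PID for two one-second samples (pidstat belongs to the sysstat package, so the example guards for its absence):

```shell
# Combine CPU (-u), page-fault (-r), context-switch (-w), and per-thread
# (-t) reporting for one process; $$ is this shell's own PID.
if command -v pidstat >/dev/null 2>&1; then
    pidstat -u -r -w -t -p $$ 1 2
else
    echo "pidstat not installed (sysstat package)"
fi
```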
For per-CPU load balancing on SMP systems, mpstat (e.g., mpstat -P ALL 1) reveals each core's utilization.
Disk I/O Monitoring
iostat (e.g., iostat -xz 1) and sar -d report device-level metrics: average queue length (avgqu-sz), average wait time (await), service time (svctm), and utilization (%util). A queue length consistently above 1 or utilization above roughly 60% indicates a potential bottleneck.
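A sketch of the iostat invocation mentioned above, guarded because iostat also comes from the sysstat package:

```shell
# -x extended per-device statistics, -z hide idle devices,
# one-second interval, three samples (the first is since-boot averages).
if command -v iostat >/dev/null 2>&1; then
    iostat -xz 1 3
else
    echo "iostat not installed (sysstat package)"
fi
```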
iotop shows real-time per-process disk throughput, while lsof identifies which processes hold files or devices open, which is useful for diagnosing partitions that cannot be unmounted.
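When a partition refuses to unmount, lsof can list every open file on that filesystem; the /mnt path below is a hypothetical example, substituted for the mount point that is busy:

```shell
# +f -- selects by filesystem: list all processes with files open
# anywhere on the mount (/mnt is an illustrative placeholder).
if command -v lsof >/dev/null 2>&1; then
    lsof +f -- /mnt 2>/dev/null || echo "no open files on /mnt"
else
    echo "lsof not installed"
fi
```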
Network Monitoring
netstat (e.g., netstat -s) displays cumulative protocol statistics since boot; paired with watch or differential checks, it can reveal current network health. Common invocations include netstat -antp for all TCP connections and netstat -nltp for listening sockets.
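Because netstat -s counters are cumulative since boot, only the delta between two snapshots is meaningful; a sketch of such a differential check for TCP retransmissions (the two-second window is illustrative and would be lengthened for real diagnosis):

```shell
# Snapshot the retransmission counter twice and print the difference.
# grep matches both "retransmitted" and the "retransmited" typo that
# appears in some net-tools versions.
if command -v netstat >/dev/null 2>&1; then
    before=$(netstat -s 2>/dev/null | grep -iE 'segments retransmit' | head -n 1 | awk '{print $1}')
    sleep 2
    after=$(netstat -s 2>/dev/null | grep -iE 'segments retransmit' | head -n 1 | awk '{print $1}')
    echo "segments retransmitted in 2 s: $(( ${after:-0} - ${before:-0} ))"
else
    echo "netstat not installed (net-tools package)"
fi
```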
sar with -n TCP,ETCP 1 or -n UDP 1 reports per-second TCP/UDP activity, including connection attempts, retransmissions, and error rates, helping assess network reliability.
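A sketch of the live sar invocation, guarded for systems without the sysstat package:

```shell
# TCP shows active/passive opens and segment rates; ETCP adds error
# statistics such as retransmissions. One-second interval, three samples.
if command -v sar >/dev/null 2>&1; then
    sar -n TCP,ETCP 1 3
else
    echo "sar not installed (sysstat package)"
fi
```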
tcpdump captures raw packets for offline analysis with Wireshark. Using filters (e.g., host, port, protocol) limits capture size and impact on the system.
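A filtered capture keeps file size and system impact low; the interface, packet count, port, and output path below are all illustrative values, and root privileges are required:

```shell
# Capture up to 100 packets matching the filter and write them to a
# pcap file for later analysis in Wireshark.
if command -v tcpdump >/dev/null 2>&1 && [ "$(id -u)" -eq 0 ]; then
    # Stop after 100 packets or 10 seconds, whichever comes first.
    timeout 10 tcpdump -i any -c 100 -w /tmp/capture.pcap 'tcp port 443' || true
    echo "capture written to /tmp/capture.pcap"
else
    echo "skipping: tcpdump missing or not running as root"
fi
```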
These tools together form a comprehensive toolbox for Linux performance monitoring, enabling quick identification of CPU, memory, disk, and network issues.
Efficient Ops
This public account is maintained by Xiaotianguo and friends and regularly publishes widely read original technical articles. We focus on operations transformation and hope to accompany you throughout your operations career, growing together.