
Linux System Performance Metrics and Monitoring Tools

This article explains the key Linux performance indicators—CPU, memory, disk I/O, file system, and network—describes how to monitor them with commands like top, vmstat, iostat, iotop, and smem, and provides practical guidance on interpreting the results to identify and resolve system bottlenecks.


Linux system performance metrics mainly include CPU, memory, disk I/O, file system, and network. Each metric has specific command‑line tools for observation and monitoring. This article introduces common Linux performance indicators and their corresponding tools, explaining how to locate bottlenecks and optimize the system.

1. System CPU

CPU is the foundation of a stable operating system; its speed and core count largely determine overall system performance. More CPUs or higher clock rates generally improve server performance, but hyper‑threading only helps when the kernel runs an SMP configuration, and the performance gain diminishes as CPU count grows.

From a performance perspective, two 4‑core CPUs are not equivalent to eight single‑core CPUs; tests show the former can be 25‑30% slower. Applications such as mail servers and dynamic web servers are CPU‑intensive, so CPU configuration should be a primary consideration.

CPU usage depends on what workload runs on it. Simple file copies use little CPU because DMA handles most of the work; scientific calculations consume more CPU. Understanding interrupts, process scheduling, context switches, and run queues helps interpret CPU performance.

Key CPU monitoring items:

Interrupts

Context switches

Run queue length

CPU utilization
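These items can be observed directly from the shell; the kernel keeps cumulative counters in /proc/stat, and vmstat (part of the procps package on most distributions) reports per-second rates:

```shell
# Total interrupts and context switches since boot, from /proc/stat
grep -E '^(intr|ctxt)' /proc/stat | awk '{print $1, $2}'

# Per-second view: "r" = run queue length, "in" = interrupts/s, "cs" = context switches/s
vmstat 1 3
```

A run queue that stays consistently above the CPU count (`nproc`) suggests CPU saturation.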

View CPU utilization

top

CPU utilization breakdown

%Cpu(s):  6.0 us,  0.5 sy,  0.0 ni, 92.4 id,  1.1 wa,  0.0 hi,  0.0 si,  0.0 st
us: user mode time percentage
sy: system (kernel) mode time percentage
ni: time spent on processes with an adjusted nice value
id: idle percentage
wa: I/O wait percentage
hi: hardware interrupt handling percentage
si: software interrupt (softirq) handling percentage
st: time stolen by the hypervisor (virtualized environments)
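For scripting, the same breakdown can be captured non-interactively; top's batch mode prints one snapshot and exits, and the raw counters behind it live in /proc/stat:

```shell
# One-shot, non-interactive snapshot of the %Cpu(s) line
top -bn1 | grep '%Cpu'

# Raw aggregate counters (in jiffies): user nice system idle iowait irq softirq steal
head -1 /proc/stat
```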

2. Performance Indicators

2.1 Process

(1) Process definition

A process is an executing instance of a program, containing code, current state, and required resources (memory, files, etc.). It is the basic unit for scheduling and resource allocation in an OS.

Each process has its own address space; processes communicate via inter‑process communication (IPC). The kernel creates, schedules, terminates, and allocates resources for processes.

Different perspectives on processes

Process is a single execution of a program.

Process is a sequence of activities performed by a program on a data set, serving as an independent unit for resource allocation and scheduling.

Process has independent functionality and is the unit for resource distribution and scheduling.

(2) Process Control Block (PCB)

The PCB records a process's basic information and activity, enabling the kernel to control and manage the process. A process image consists of the program segment, the data segment, and the PCB. Creating a process creates its PCB; destroying a process removes its PCB.

PCB fields include state, UID/GID, timers, registers, stack pointer, etc.
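On Linux, much of this information (kept in the kernel's `task_struct`) is exposed read-only under /proc; for example:

```shell
# Inspect PCB-style fields of a live process (here: the grep process itself,
# since /proc/self resolves to the reading process)
grep -E '^(Name|State|Pid|PPid|Uid|Gid|Threads|VmRSS):' /proc/self/status
```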

In traditional Unix, PID 0 is the scheduler (swapper), PID 1 is init, and PID 2 is the page daemon; on modern Linux, PID 1 is init (or systemd) and PID 2 is kthreadd, the parent of all kernel threads.

(3) Process characteristics

Dynamic: a process is an active entity that can be created, scheduled, and terminated.

Concurrent: multiple processes can exist simultaneously.

Independent: each process has its own resources and scheduling.

Asynchronous: processes run at unpredictable speeds.

(4) Process states

New – just created, not yet running.

Ready – prepared to run when given CPU time.

Running – currently executing on the CPU.

Blocked – waiting for an event (e.g., I/O).

Terminated – finished execution.

Suspended – temporarily stopped, not consuming CPU.

State transitions follow the typical lifecycle (new → ready → running → blocked → ready → terminated, etc.).
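These states map onto the single-letter codes in ps output (R running, S interruptible sleep, D uninterruptible sleep, T stopped, Z zombie):

```shell
# List processes with their state code
ps -eo pid,stat,comm | head -10

# Count processes per state (first character of STAT)
ps -eo stat= | cut -c1 | sort | uniq -c
```

On a healthy system most processes sit in S; a persistent pile-up in D usually points at an I/O bottleneck.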

(5) Process priority

Linux distinguishes ordinary and real‑time processes. Ordinary processes have static and dynamic priorities; real‑time processes have an additional real‑time priority. The nice command adjusts static priority, while the scheduler may modify dynamic priority based on I/O behavior.

# Show dynamic priority of all processes
ps axo pid,ni,pri,cmd | head -10

For example, start a process with the highest priority (nice value -20; negative values require root), then verify the change with ps from another terminal:

nice -n -20 ping 172.17.0.1
ps axo pid,ni,pri,cmd | grep ping

(6) Relationship between process and program

Program : static collection of source or binary code, stored on disk.

Process : runtime instance of a program, with its own memory, registers, and execution context.

(7) Parent‑child relationship

The first process (init) creates all others. A parent can have many children; each child has exactly one parent. Zombies occur when a child exits but the parent has not collected its exit status; orphans are adopted by init.
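Zombies can be spotted from the shell by filtering on the Z state code (a sketch; a healthy system usually prints only the header line):

```shell
# List zombie processes, if any, together with the parent that should reap them
ps -eo pid,ppid,stat,comm | awk 'NR == 1 || $3 ~ /^Z/'
```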

(8) Process vs. thread

A thread belongs to a process; a process may contain multiple threads.

Resources are allocated to the process; threads share them.

Scheduling operates on threads, while the process is the resource container.
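The thread count of a process is visible in its status file, and ps can list threads individually (on Linux, the LWP column is the thread ID):

```shell
# Threads in the reading process
grep '^Threads:' /proc/self/status

# One line per thread for all processes
ps -eLo pid,lwp,comm | head -10
```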

2.2 Memory

Memory size greatly influences Linux performance. Insufficient memory causes blocking; excessive memory wastes resources. Linux uses physical and virtual memory; virtual memory mitigates physical shortages but excessive swapping degrades performance. 64‑bit kernels are recommended for large memory.

(1) Physical vs. virtual memory

Each process has its own virtual address space, which the kernel maps to physical pages. The kernel also provides swap space on disk to extend usable memory.

(2) Why use virtual memory?

When RAM is insufficient, the system swaps rarely used pages to disk, allowing active pages to stay in RAM. This technique hides the complexity from applications.

(3) Page cache and write‑back

Linux caches file data in memory (page cache) to speed up reads. Modified pages are marked dirty and written back to disk later, reducing I/O latency.
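The size of the page cache and the amount of dirty (not yet written back) data are reported in /proc/meminfo:

```shell
# Cached = page cache size; Dirty/Writeback = data pending or undergoing write-back
grep -E '^(Cached|Dirty|Writeback):' /proc/meminfo

# Force all dirty pages to be flushed to disk now
sync
```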

2.3 File System

Linux file systems are classified into local, network, virtual, special, virtualization, and journaling types. Choosing a file system depends on workload, hardware, and reliability requirements. Common local file systems include ext4, XFS, and Btrfs.

RAID can improve I/O performance and reliability; different RAID levels (0, 1, 5, 6, 10) provide trade‑offs between speed and redundancy.
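For Linux software RAID (md), array health can be checked via /proc/mdstat; the file exists only when the md driver is loaded, so a fallback is included in this sketch:

```shell
# Show software RAID arrays and their resync/degraded status, if any
cat /proc/mdstat 2>/dev/null || echo "md driver not loaded / no software RAID"
```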

2.4 Disk I/O

Disk I/O directly impacts application performance. RAID, proper block sizing, and the choice of I/O scheduler help optimize throughput and latency. Older kernels offered the single-queue schedulers CFQ, Deadline, and NOOP; modern multi-queue (blk-mq) kernels use BFQ, Kyber, mq-deadline, or none.

To view the current I/O scheduler:

cat /sys/block/sda/queue/scheduler
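The same sysfs file exists for every block device, and it is writable by root, so the scheduler can be switched at runtime (the change lasts until reboot). A sketch that lists all devices, assuming `sda` as an example target for the switch:

```shell
# Active scheduler appears in brackets, e.g. [mq-deadline]
for f in /sys/block/*/queue/scheduler; do
  [ -e "$f" ] || continue          # glob may match nothing (e.g. in containers)
  dev=${f#/sys/block/}; dev=${dev%/queue/scheduler}
  printf '%s: ' "$dev"; cat "$f"
done

# Switch sda to kyber (root only; reverts on reboot)
# echo kyber > /sys/block/sda/queue/scheduler
```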

2.5 Network

Network performance depends on topology, bandwidth, and kernel parameters (buffer sizes, window sizes, etc.). Modern networks often provide gigabit or higher speeds, reducing bottlenecks for most applications.
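The relevant kernel parameters live under /proc/sys/net and can be read directly (or tuned as root with `sysctl -w`):

```shell
# TCP receive/send buffer autotuning limits: min default max (bytes)
cat /proc/sys/net/ipv4/tcp_rmem
cat /proc/sys/net/ipv4/tcp_wmem

# Maximum socket buffer sizes any application may request via setsockopt
cat /proc/sys/net/core/rmem_max
cat /proc/sys/net/core/wmem_max
```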

3. Performance Analysis Tools

3.1 CPU analysis tools

vmstat – reports virtual memory statistics, CPU, I/O, and context switches.

vmstat 1

Key fields: r (run queue), b (blocked), us (user CPU %), sy (system CPU %), id (idle %), wa (I/O wait %).

mpstat – similar to vmstat but shows per‑CPU statistics.

mpstat -P ALL 1

ps – list processes with CPU usage.

ps -eo pid,ni,pri,pcpu,psr,comm | grep oracle

3.2 Memory analysis tools

free – displays total, used, free, shared, buff/cache, and available memory.

free -m

smem – reports per‑process memory usage with USS, PSS, and RSS.

smem -k -s pss

3.3 Disk I/O analysis tools

iotop – real‑time per‑process I/O monitor.

iotop -o -b -d 1 -n 10

iostat – reports CPU and device I/O statistics.

iostat -d -k 1 10

3.4 Network analysis tools

ping – test reachability and latency.

ping -c 4 8.8.8.8

traceroute – display the route packets take.

traceroute -n example.com

mtr – combines ping and traceroute with continuous statistics.

mtr -n -c 10 -i 1 example.com

3.5 System‑wide analysis tools

top – interactive real‑time process monitor.

top

htop – enhanced, colorized version of top.

htop

4. Summary

Many performance analysis tools exist; mastering a few is sufficient. The key is to interpret the data to locate bottlenecks, understanding how CPU, memory, disk I/O, and network metrics affect Linux system behavior.


Written by Deepin Linux. Research areas: Windows & Linux platforms, C/C++ backend development, embedded systems and Linux kernel, etc.