Master Linux Disk Management & I/O Performance: A Hands‑On Guide from Expansion to Tuning
This comprehensive guide walks you through Linux disk space shortage scenarios, prerequisites, a quick checklist, step‑by‑step LVM and partition expansion, I/O scheduler tuning, fio benchmarking, kernel parameter optimization, Prometheus monitoring, security hardening, backup strategies, troubleshooting, and best‑practice recommendations for reliable disk management and performance.
Applicable Scenarios & Prerequisites
Production servers with low free space, I/O bottlenecks, database/storage workloads, or container platforms that need persistent volumes.
Supported OS: RHEL/CentOS 7‑9, Ubuntu 18.04‑24.04.
Root or sudo privileges.
Required tools: parted, lvm2, xfsprogs / e2fsprogs, fio, iostat (sysstat package).
Backup all critical data and take snapshots before any modification.
Perform expansion during a low‑traffic maintenance window.
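As a rough pre-flight sketch for the tool list above, the function below reports which commands are missing from PATH. The function name `missing_tools` is illustrative, and the command names in the example call are the binaries these packages usually ship (for example, `pvs` from lvm2); adjust them for your distribution.

```shell
#!/bin/sh
# missing_tools: print every command in the argument list that is not on PATH.
# Returns 0 when nothing is missing, 1 otherwise.
missing_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            missing=1
        fi
    done
    return "$missing"
}

# Example call (commented out so sourcing this file has no side effects):
# missing_tools parted pvs xfs_growfs resize2fs fio iostat
```

An empty result with exit status 0 means every listed tool was found.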
Environment & Version Matrix
Kernel: RHEL/CentOS 3.10+ (4.18+ recommended); Ubuntu/Debian 4.15+ (5.4+ recommended).
LVM version 2.02+ on all platforms.
Default filesystem: XFS on RHEL/CentOS, ext4 on Ubuntu (both supported).
Minimum IOPS: HDD ≥100 IOPS, SSD ≥3000 IOPS.
Reserve at least 10 % free space before expansion.
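One way to check a host against the kernel minimums above is a version comparison built on `sort -V`. This is a minimal sketch: the helper names `version_ge` and `kernel_base` are made up for illustration, and it assumes GNU coreutils (or another `sort` that supports `-V`).

```shell
#!/bin/sh
# version_ge A B: succeed when version A >= version B (relies on sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# kernel_base: strip the distribution suffix, e.g. 4.18.0-513.el8 -> 4.18.0
kernel_base() {
    printf '%s\n' "${1%%-*}"
}

# Example (commented out):
# version_ge "$(kernel_base "$(uname -r)")" 4.18 || echo "kernel older than recommended"
```

A plain string comparison would rank 5.4 above 5.15, which is why the sketch uses version sort.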
Quick Checklist
Inspect current partitions and usage.
Identify disk type (HDD/SSD/NVMe) and current I/O scheduler.
Create LVM physical volume, volume group, and logical volume if needed.
Expand the filesystem online (XFS with xfs_growfs, ext4 with resize2fs).
Set an appropriate I/O scheduler (e.g., none for SSD, mq-deadline for HDD).
Run fio benchmarks to verify IOPS and throughput.
Configure Prometheus node_exporter alerts for disk space, inode usage, and I/O utilization.
Apply disk quotas and tighten permission controls.
Implement log cleanup and archiving policies.
Prepare LVM snapshots and rollback plans.
Implementation Steps
Step 1 – Diagnose Disk Layout
# List block devices and filesystems
lsblk -f
fdisk -l | grep "Disk /dev"
# Show usage and inode statistics
df -hT
df -i
# Find large directories
du -sh /* | sort -hr | head -10
du -h --max-depth=2 /var | sort -hr | head -20
Key fields:
FSTYPE – determines whether to use xfs_growfs (XFS) or resize2fs (ext4).
SIZE vs MOUNTPOINT – reveals unallocated or unmounted space.
df -i – inode usage >85 % requires cleanup of many small files.
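The 85 % inode rule can be checked mechanically. The sketch below reads `df -Pi` output on standard input and prints the filesystems above a given threshold; the function name `inode_hogs` is illustrative, and it assumes mount points without spaces.

```shell
#!/bin/sh
# inode_hogs THRESHOLD: read `df -Pi` output on stdin and print
# "<mountpoint> <use%>" for filesystems above THRESHOLD percent.
inode_hogs() {
    awk -v limit="$1" 'NR > 1 {
        pct = $5
        sub(/%$/, "", pct)
        # Skip filesystems that report "-" (no inode accounting).
        if (pct ~ /^[0-9]+$/ && pct + 0 > limit + 0) print $6, pct "%"
    }'
}

# Example (commented out):
# df -Pi | inode_hogs 85
```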
Step 2 – Identify Disk Type & I/O Scheduler
# Detect SSD/NVMe (rotational=0 means SSD)
lsblk -d -o NAME,ROTA,DISC-GRAN
cat /sys/block/sda/queue/rotational # 0=SSD, 1=HDD
# Show current scheduler
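cat /sys/block/sda/queue/scheduler # brackets indicate the active scheduler
Recommended scheduler:
HDD (ROTA=1): mq-deadline on multi-queue (blk-mq) kernels; deadline or cfq on legacy single-queue kernels such as RHEL/CentOS 7.
SSD/NVMe (ROTA=0): none or mq-deadline on blk-mq kernels; noop on legacy kernels.
Temporary change:
# Run as root; the setting is lost on reboot
echo none > /sys/block/nvme0n1/queue/scheduler
Persist via udev (RHEL/CentOS example):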
cat > /etc/udev/rules.d/60-ioscheduler.rules <<'EOF'
# SSD/NVMe use none
ACTION=="add|change", KERNEL=="nvme[0-9]n[0-9]", ATTR{queue/rotational}=="0", ATTR{queue/scheduler}="none"
ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/rotational}=="0", ATTR{queue/scheduler}="none"
# HDD use mq-deadline
ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/rotational}=="1", ATTR{queue/scheduler}="mq-deadline"
EOF
udevadm control --reload-rules && udevadm trigger
Step 3 – LVM Expansion (Online, No Downtime)
Scenario: add a new 100 GB disk /dev/sdb to extend /var.
# Show existing physical volumes
pvdisplay
# Create PV on the new disk
pvcreate /dev/sdb
pvdisplay /dev/sdb # verify size
# Extend the volume group
vgdisplay
vgextend vg0 /dev/sdb
vgdisplay vg0 # free PE should increase
# Extend the logical volume (example +50 GB)
lvextend -L +50G /dev/vg0/var # or -l +100%FREE
# Grow the filesystem
# XFS
xfs_growfs /var
# ext4
resize2fs /dev/vg0/var
# Verify
df -h /var
Rollback (create snapshot before expansion):
# Snapshot
lvcreate -L 10G -s -n var-snapshot /dev/vg0/var
# If needed, merge back (if /var is in use, the merge is deferred until the LV is next activated)
lvconvert --merge /dev/vg0/var-snapshot
Step 4 – Partition Expansion (Non‑LVM)
Scenario: cloud VM system disk /dev/sda3 needs to be enlarged.
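growpart takes the disk and the partition number as separate arguments, so a partition path such as /dev/sda3 has to be split first. The helper below is a sketch of that split; `split_partition` is a hypothetical name, and it only covers the sdX, NVMe, and MMC naming schemes.

```shell
#!/bin/sh
# split_partition DEV: print "<disk> <partition-number>" for a partition path.
# Handles /dev/sdXN and the "p" separator used by NVMe and MMC devices.
split_partition() {
    case "$1" in
        /dev/nvme*p[0-9]*|/dev/mmcblk*p[0-9]*)
            printf '%s %s\n' "${1%p[0-9]*}" "${1##*p}"
            ;;
        /dev/nvme*|/dev/mmcblk*)
            echo "whole disk, not a partition: $1" >&2
            return 1
            ;;
        /dev/*[0-9])
            disk=$(printf '%s\n' "$1" | sed 's/[0-9]*$//')
            printf '%s %s\n' "$disk" "${1#"$disk"}"
            ;;
        *)
            echo "not a partition path: $1" >&2
            return 1
            ;;
    esac
}

# Example (commented out):
# set -- $(split_partition /dev/sda3) && growpart "$1" "$2"
```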
# Install growpart tool
# RHEL/CentOS
yum install -y cloud-utils-growpart
# Ubuntu
apt install -y cloud-guest-utils
# Grow partition 3 without data loss
growpart /dev/sda 3
partprobe /dev/sda
# Expand filesystem
# XFS
xfs_growfs /
# ext4
resize2fs /dev/sda3
# Manual method with parted (dangerous – backup first)
parted /dev/sda
(parted) print free
(parted) resizepart 3 100%
(parted) quit
partprobe /dev/sda
# Then run the appropriate filesystem grow command
Step 5 – I/O Performance Benchmark & Tuning
# Install fio
# RHEL/CentOS
yum install -y fio
# Ubuntu
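apt install -y fio
# Create the test directory used by the jobs below (fio expects it to exist)
mkdir -p /var/fio-test
Sequential write (4 MiB block, 10 GiB file):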
fio --name=seqwrite --rw=write --bs=4M --size=10G \
--numjobs=1 --runtime=60 --time_based \
--directory=/var/fio-test --ioengine=libaio --iodepth=16 \
--direct=1 --group_reporting
Sequential read:
fio --name=seqread --rw=read --bs=4M --size=10G \
--numjobs=1 --runtime=60 --time_based \
--directory=/var/fio-test --ioengine=libaio --iodepth=16 \
--direct=1 --group_reporting
Random read/write (4 KiB block, 4 jobs):
# Random write
fio --name=randwrite --rw=randwrite --bs=4K --size=10G \
--numjobs=4 --runtime=60 --time_based \
--directory=/var/fio-test --ioengine=libaio --iodepth=32 \
--direct=1 --group_reporting
# Random read
fio --name=randread --rw=randread --bs=4K --size=10G \
--numjobs=4 --runtime=60 --time_based \
--directory=/var/fio-test --ioengine=libaio --iodepth=32 \
--direct=1 --group_reporting
Target metrics:
HDD – sequential 100‑200 MB/s, random 100‑300 IOPS.
SATA SSD – sequential 500‑550 MB/s, random 50K‑90K IOPS.
NVMe SSD – sequential 2‑7 GB/s, random 200K‑1M IOPS.
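The sequential targets are quoted in MB/s and the random targets in IOPS; the two are linked by block size (throughput = IOPS × block size). The sketch below does that conversion so a random-I/O result can be compared with a sequential one. The function name `iops_to_mbps` is illustrative, and MB here means 10^6 bytes.

```shell
#!/bin/sh
# iops_to_mbps IOPS BLOCK_KIB: throughput in MB/s (decimal, 10^6 bytes)
# implied by an IOPS figure at a given block size in KiB.
iops_to_mbps() {
    awk -v iops="$1" -v bs="$2" 'BEGIN { printf "%.1f\n", iops * bs * 1024 / 1000000 }'
}

# Example (commented out): an HDD at 300 IOPS with 4 KiB blocks
# iops_to_mbps 300 4
```

At 4 KiB blocks, 300 IOPS works out to roughly 1.2 MB/s, which is why an HDD that streams 100 MB/s or more sequentially still looks slow under random load.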
# Cleanup test files
rm -rf /var/fio-test
Step 6 – Kernel Parameter Tuning
# Append to /etc/sysctl.conf
cat >> /etc/sysctl.conf <<'EOF'
# Reduce swap usage (DB servers often set to 10)
vm.swappiness = 10
# Dirty page ratios (higher for SSD)
vm.dirty_ratio = 15
vm.dirty_background_ratio = 5
# Faster writeback
vm.dirty_expire_centisecs = 500
vm.dirty_writeback_centisecs = 100
# Max file handles for high concurrency
fs.file-max = 2097152
EOF
sysctl -p
Mount options for XFS (reduce metadata writes):
# Edit /etc/fstab
# Original line (example):
# /dev/mapper/vg0-var /var xfs defaults 0 0
# Optimized line:
/dev/mapper/vg0-var /var xfs defaults,noatime,nodiratime 0 0
mount -o remount /var
mount | grep /var # should show noatime,nodiratime
Step 7 – Disk Cleanup & Capacity Management
# Clean systemd journal (keep last 7 days, max 1 GB)
journalctl --vacuum-time=7d
journalctl --vacuum-size=1G
# Remove old kernels (RHEL/CentOS 7)
yum install -y yum-utils
package-cleanup --oldkernels --count=2
# RHEL/CentOS 8+ equivalent
dnf remove --oldinstallonly --setopt installonly_limit=2 kernel
# Clean apt cache (Ubuntu)
apt clean
apt autoclean
apt autoremove --purge
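# Clean Docker (if used); -a removes all unused images and --volumes removes unused volumes, so confirm nothing there is still needed
docker system prune -af --volumes
Find large files: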
# Files >1 GB
find /var -type f -size +1G -exec ls -lh {} \; | sort -k5 -hr
# Files >100 MB not accessed in 7 days (atime is not updated on noatime mounts; use -mtime there)
find /var/log -type f -size +100M -atime +7
Set XFS quota for a user (example appuser limited to 50 GB):
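# Enable quota in /etc/fstab
/dev/mapper/vg0-var /var xfs defaults,uquota,gquota 0 0
# XFS quota options cannot be enabled by a remount; unmount and mount again (stop services using /var first)
umount /var && mount /var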
# Apply quota
xfs_quota -x -c 'limit bsoft=45G bhard=50G appuser' /var
xfs_quota -x -c 'report -h' /var
Monitoring & Alerts
Prometheus Metrics
# Download and start node_exporter (v1.8.2 example)
wget https://github.com/prometheus/node_exporter/releases/download/v1.8.2/node_exporter-1.8.2.linux-amd64.tar.gz
tar xf node_exporter-*.tar.gz
cd node_exporter-*/
./node_exporter &
Key PromQL alerts (thresholds shown):
# Disk usage >85%
(1 - node_filesystem_avail_bytes{mountpoint=~"/|/var"} / node_filesystem_size_bytes{mountpoint=~"/|/var"}) * 100 > 85
# Inode usage >90%
(1 - node_filesystem_files_free{mountpoint=~"/|/var"} / node_filesystem_files{mountpoint=~"/|/var"}) * 100 > 90
# I/O utilization >80%
rate(node_disk_io_time_seconds_total[5m]) * 100 > 80
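# Average read latency >100 ms (node_exporter exposes counters here, not a histogram, so a percentile cannot be derived)
rate(node_disk_read_time_seconds_total[5m]) / rate(node_disk_reads_completed_total[5m]) > 0.1
Native Monitoring Commands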
# I/O stats every 2 s
iostat -xm 2
# Show only processes doing I/O (requires root)
iotop -o
# Check I/O wait in top (wa% >20% indicates bottleneck)
top
Suggested alert thresholds:
Disk usage >85 % → start cleanup.
Inode usage >90 % → delete small files.
I/O wait >20 % → investigate scheduler & application.
Average queue depth >10 → I/O saturation.
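The four thresholds above can be encoded in a small check. This is a sketch only: `check_thresholds` is a hypothetical helper, and it assumes the caller has already collected the four numbers (for example from df, top, and iostat).

```shell
#!/bin/sh
# check_thresholds DISK_PCT INODE_PCT IOWAIT_PCT AVG_QUEUE
# Prints one line per breached threshold; prints nothing when all are fine.
check_thresholds() {
    awk -v d="$1" -v i="$2" -v w="$3" -v q="$4" 'BEGIN {
        if (d > 85) print "disk usage " d "%: start cleanup"
        if (i > 90) print "inode usage " i "%: delete small files"
        if (w > 20) print "iowait " w "%: investigate scheduler and application"
        if (q > 10) print "avg queue depth " q ": I/O saturation"
    }'
}

# Example (commented out):
# check_thresholds 86 50 5 2
```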
Performance & Capacity
Parameter Tuning Summary
I/O scheduler – SSD: none or mq-deadline; HDD: mq-deadline.
vm.swappiness – SSD: 10, HDD: 60 (reduce swap for DB workloads).
vm.dirty_ratio – SSD: 15, HDD: 10 (higher dirty pages on SSD).
Readahead – SSD: 256‑512 KB; HDD: 1024‑2048 KB.
Mount options – SSD: noatime,nodiratime; HDD: defaults.
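As an illustration, the summary can be turned into a lookup keyed on the rotational flag. The helper names `tuning_profile` and `active_scheduler` are made up for this sketch, and the readahead figures are simply the lower bound of each range above. Note that the two vm.* values are system-wide sysctls, so on a host with mixed media only one profile can apply to them.

```shell
#!/bin/sh
# tuning_profile ROTA: print the summary values for a device whose
# /sys/block/<dev>/queue/rotational reads ROTA (0 = SSD/NVMe, 1 = HDD).
tuning_profile() {
    case "$1" in
        0) printf '%s\n' 'scheduler=none' 'vm.swappiness=10' 'vm.dirty_ratio=15' 'readahead_kb=256' ;;
        1) printf '%s\n' 'scheduler=mq-deadline' 'vm.swappiness=60' 'vm.dirty_ratio=10' 'readahead_kb=1024' ;;
        *) echo "unexpected rotational value: $1" >&2; return 1 ;;
    esac
}

# active_scheduler LINE: extract the bracketed entry from a scheduler line,
# e.g. "[mq-deadline] kyber bfq none" -> "mq-deadline".
active_scheduler() {
    printf '%s\n' "$1" | sed -n 's/.*\[\(.*\)\].*/\1/p'
}

# Example (commented out):
# tuning_profile "$(cat /sys/block/sda/queue/rotational)"
# active_scheduler "$(cat /sys/block/sda/queue/scheduler)"
```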
Set readahead (example for /dev/sda):
# Show current value
blockdev --getra /dev/sda
# Set to 512 sectors (256 KB)
blockdev --setra 512 /dev/sda
# Persist via /etc/rc.local
echo 'blockdev --setra 512 /dev/sda' >> /etc/rc.local
chmod +x /etc/rc.local
Capacity Planning
OS disk – keep 15 % free.
Database disk – keep 20 % free for temp sorting & backups.
Log disk – rotate logs, retain 30 days.
Container storage – auto‑clean unused images & volumes.
Expansion triggers:
Disk usage reaches 80 % → start expansion request.
Projected 90 % within 30 days → urgent expansion.
IOPS sustained >80 % utilization → upgrade disk tier.
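The "projected 90 % within 30 days" trigger needs a growth estimate. A minimal sketch, assuming growth is linear and expressed in percentage points per day (the function name `days_until_pct` is illustrative):

```shell
#!/bin/sh
# days_until_pct CURRENT_PCT GROWTH_PCT_PER_DAY [TARGET_PCT]
# Linear projection of whole days until usage reaches TARGET_PCT (default 90).
days_until_pct() {
    awk -v cur="$1" -v rate="$2" -v target="${3:-90}" 'BEGIN {
        if (cur >= target) { print 0; exit }
        if (rate <= 0)     { print "never"; exit }
        days = (target - cur) / rate
        # Round up to whole days.
        printf "%d\n", (days == int(days)) ? days : int(days) + 1
    }'
}

# Example (commented out): 82 % used, growing 0.3 points per day
# days_until_pct 82 0.3
```

With 82 % used and 0.3 points of growth per day, the projection is 27 days, which falls inside the 30-day window and would count as urgent. Callers comparing the result numerically need to handle the "never" case first.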
Security & Compliance
Permission hardening:
# Restrict MySQL data directory
chmod 700 /var/lib/mysql
chown -R mysql:mysql /var/lib/mysql
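# Restrict log directory (adm is a Debian/Ubuntu group; check the distribution's default owner first, since a non-root logging daemon may need write access)
chmod 750 /var/log
chown root:adm /var/log
Auditd monitoring (example):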
# Watch critical paths
auditctl -w /var/lib/mysql -p wa -k mysql_data_change
auditctl -w /etc/fstab -p wa -k fstab_change
# Query audit logs
ausearch -k mysql_data_change
Data‑at‑rest encryption with LUKS (new disk example):
# Create encrypted LUKS container (luksFormat destroys any existing data on /dev/sdb)
cryptsetup luksFormat /dev/sdb
cryptsetup luksOpen /dev/sdb encrypted_disk
mkfs.xfs /dev/mapper/encrypted_disk
Backup strategy:
Create LVM snapshots before any expansion.
Full weekly backups retained 4 weeks.
Daily incremental backups retained 7 days.
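The retention windows above could be enforced with find. The sketch below only lists expired files rather than deleting them; the function name and the naming scheme (full-*.tar.gz for weekly fulls, incr-*.tar.gz for daily incrementals) are assumptions made for illustration, not something the backup tooling is known to produce.

```shell
#!/bin/sh
# expired_backups DIR: list backup files past their retention period.
# Assumed (hypothetical) naming: full-*.tar.gz for weekly fulls,
# incr-*.tar.gz for daily incrementals.
expired_backups() {
    # -mtime +N matches files whose age, in whole days, is greater than N.
    find "$1" -maxdepth 1 -type f -name 'full-*.tar.gz' -mtime +28
    find "$1" -maxdepth 1 -type f -name 'incr-*.tar.gz' -mtime +7
}

# Example (commented out): review the list before piping it to a delete step
# expired_backups /backup
```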
Common Failures & Troubleshooting
Disk full (df 100 %) – Diagnose with du -sh /* | sort -hr; clean logs via journalctl --vacuum-size=1G; configure log rotation to prevent recurrence.
Inode exhaustion – Check with df -i and locate directories containing many small files; delete caches or adjust filesystem layout.
High I/O wait – Use iostat -x 1 and iotop -o to identify offending processes; rate‑limit or pause heavy jobs, then tune the scheduler.
XFS expansion failure – Run xfs_info /var to check the current geometry; if corruption is suspected, unmount the filesystem, run xfs_repair -n /dev/vg0/var (dry run, no changes), then run xfs_repair /dev/vg0/var to repair.
LVM snapshot full – Check with lvs -a; extend snapshot size via lvextend -L +5G /dev/vg0/snap or allocate a larger snapshot initially.
Filesystem mounted read‑only – Inspect kernel messages with dmesg | grep -i error; address the underlying disk or filesystem errors first, then remount read‑write with mount -o remount,rw /.
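For the read-only case, a quick way to find affected filesystems is to scan the mount table. The sketch below reads /proc/mounts-style lines and reports XFS and ext2/3/4 mounts whose option list contains ro; the function name `readonly_mounts` is illustrative. It splits the option field instead of grepping so that options such as errors=remount-ro are not mistaken for a read-only mount.

```shell
#!/bin/sh
# readonly_mounts: read /proc/mounts-style lines on stdin and print the mount
# point of every XFS or ext2/3/4 entry whose option list contains "ro".
readonly_mounts() {
    awk '$3 ~ /^(xfs|ext[234])$/ {
        n = split($4, opts, ",")
        for (k = 1; k <= n; k++) if (opts[k] == "ro") { print $2; break }
    }'
}

# Example (commented out):
# readonly_mounts < /proc/mounts
```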
Urgent full‑disk handling example:
# 1. Locate large directories
du -sh /* | sort -hr | head -5
# 2. Clean old logs
journalctl --vacuum-time=1d
find /var/log -name "*.log" -mtime +7 -delete
# 3. Prune Docker (if present)
docker system prune -af
# 4. Temporary LV extension if free space exists
lvextend -L +10G /dev/vg0/var && xfs_growfs /var
# 5. Verify
df -h /var
Change & Rollback Playbooks
Pre‑Change Checklist
# 1. Backup critical data
tar czf /backup/var-$(date +%F).tar.gz /var/important-data
# 2. Create LVM snapshot (if applicable)
lvcreate -L 10G -s -n var-snapshot-$(date +%F) /dev/vg0/var
# 3. Record current state
df -h > /root/df-before.txt
lsblk > /root/lsblk-before.txt
{ pvs; vgs; lvs; } > /root/lvm-before.txt
# 4. Check disk health (SMART)
smartctl -H /dev/sda
smartctl -A /dev/sda | grep -i "reallocated\|pending\|uncorrectable"
Expansion Execution Script (bash, idempotent)
#!/bin/bash
# The PV and VG steps are skipped when already done; the LV step adds EXTEND_SIZE on every run.
set -euo pipefail
NEW_DISK="/dev/sdb"
VG_NAME="vg0"
LV_NAME="var"
EXTEND_SIZE="+50G"
LV_PATH="/dev/$VG_NAME/$LV_NAME"
# Create PV if missing
if ! pvdisplay "$NEW_DISK" &>/dev/null; then
    echo "Creating PV $NEW_DISK"
    pvcreate "$NEW_DISK"
else
    echo "PV $NEW_DISK already exists"
fi
# Extend VG if the PV is not yet a member (plain vgdisplay does not list member PVs)
CURRENT_VG=$(pvs --noheadings -o vg_name "$NEW_DISK" | tr -d ' ')
if [[ "$CURRENT_VG" != "$VG_NAME" ]]; then
    echo "Extending VG $VG_NAME with $NEW_DISK"
    vgextend "$VG_NAME" "$NEW_DISK"
else
    echo "VG $VG_NAME already contains $NEW_DISK"
fi
# Extend LV
echo "Extending LV $LV_PATH by $EXTEND_SIZE"
lvextend -L "$EXTEND_SIZE" "$LV_PATH"
# Grow filesystem based on type
MOUNT_POINT=$(findmnt -n -o TARGET --source "$LV_PATH")
FS_TYPE=$(findmnt -n -o FSTYPE --source "$LV_PATH")
if [[ "$FS_TYPE" == "xfs" ]]; then
    echo "Growing XFS on $MOUNT_POINT"
    xfs_growfs "$MOUNT_POINT"
elif [[ "$FS_TYPE" == "ext4" ]]; then
    echo "Growing ext4 on $LV_PATH"
    resize2fs "$LV_PATH"
else
    echo "Unsupported filesystem type: $FS_TYPE" >&2
    exit 1
fi
# Verify
df -h "$MOUNT_POINT"
echo "Expansion completed"
Rollback Scenarios
Filesystem expansion failure – Unmount, merge snapshot, remount:
umount /var
lvconvert --merge /dev/vg0/var-snapshot
mount /var
Accidental PV removal – Restore LVM metadata from backup:
vgcfgrestore -l vg0 # list backups
vgcfgrestore -f /etc/lvm/archive/vg0_XXXXX.vg vg0
vgchange -ay vg0
Disk failure – Migrate data off the failing PV (it must still be readable) and remove it:
pvmove /dev/sdb # move data to other PVs
vgreduce vg0 /dev/sdb
pvremove /dev/sdb
Best Practices
Use LVM as the default storage layout for new servers – simplifies future expansion and snapshotting.
Plan separate partitions (e.g., /var, /var/log, /home) to avoid a single point of exhaustion.
Monitor disk metrics before and after any change; ensure Prometheus alerts return to normal.
Always create an LVM snapshot before modifications; size the snapshot at least twice the expected write volume during the operation.
Match I/O scheduler to media: SSD → none, HDD → mq-deadline.
Automate log cleanup via cron (e.g., journalctl --vacuum-time=30d).
Set capacity thresholds: 80 % warning, 85 % alert, 90 % urgent.
Prefer XFS for databases (large files, high throughput); ext4 is acceptable for general workloads.
Avoid online LV shrinking; migrate data to a new LV instead.
Test snapshot restore and backup recovery at least quarterly.
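The snapshot sizing rule above (at least twice the expected write volume) is simple arithmetic. A minimal sketch that rounds the result up to a whole GiB; `snapshot_size_gib` is a hypothetical helper name:

```shell
#!/bin/sh
# snapshot_size_gib EXPECTED_WRITE_GIB
# Twice the expected write volume, rounded up to a whole GiB (minimum 1).
snapshot_size_gib() {
    awk -v w="$1" 'BEGIN {
        s = 2 * w
        s = (s == int(s)) ? s : int(s) + 1
        if (s < 1) s = 1
        printf "%d\n", s
    }'
}

# Example (commented out): expecting about 5 GiB of writes during the change
# lvcreate -s -L "$(snapshot_size_gib 5)G" -n var-snapshot /dev/vg0/var
```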
Appendix
A. Idempotent LVM Expansion Script
#!/bin/bash
# Usage: ./lvm_extend.sh /dev/sdb vg0 var +50G
# The PV and VG steps are skipped when already done; the LV grows by EXTEND_SIZE on every run.
set -euo pipefail
if [[ $# -ne 4 ]]; then
    echo "Usage: $0 <disk> <vg> <lv> <+size>" >&2
    exit 1
fi
NEW_DISK=$1
VG_NAME=$2
LV_NAME=$3
EXTEND_SIZE=$4
LV_PATH="/dev/$VG_NAME/$LV_NAME"
# Ensure PV exists
if pvdisplay "$NEW_DISK" &>/dev/null; then
    echo "$NEW_DISK already a PV"
else
    pvcreate "$NEW_DISK"
fi
# Ensure VG contains the PV (plain vgdisplay does not list member PVs)
CURRENT_VG=$(pvs --noheadings -o vg_name "$NEW_DISK" | tr -d ' ')
if [[ "$CURRENT_VG" == "$VG_NAME" ]]; then
    echo "$NEW_DISK already in VG $VG_NAME"
else
    vgextend "$VG_NAME" "$NEW_DISK"
fi
# Extend LV
lvextend -L "$EXTEND_SIZE" "$LV_PATH"
# Detect mount point and FS type
MOUNT_POINT=$(findmnt -n -o TARGET --source "$LV_PATH")
FS_TYPE=$(findmnt -n -o FSTYPE --source "$LV_PATH")
if [[ "$FS_TYPE" == "xfs" ]]; then
    xfs_growfs "$MOUNT_POINT"
elif [[ "$FS_TYPE" == "ext4" ]]; then
    resize2fs "$LV_PATH"
else
    echo "Unsupported filesystem type: $FS_TYPE" >&2
    exit 1
fi
df -h "$MOUNT_POINT"
echo "LVM expansion completed"
B. fio Test Configuration (fio-test.ini)
[global]
ioengine=libaio
direct=1
iodepth=32
# File size per job; file-based jobs need it (matches --size=10G in Step 5)
size=10G
time_based
runtime=60
group_reporting
directory=/var/fio-test
[seqwrite]
rw=write
bs=4M
numjobs=1
stonewall
[seqread]
rw=read
bs=4M
numjobs=1
stonewall
[randwrite]
rw=randwrite
bs=4K
numjobs=4
stonewall
[randread]
rw=randread
bs=4K
numjobs=4
stonewall
C. Prometheus Alert Rules (prometheus-disk-alerts.yml)
groups:
  - name: disk_alerts
    interval: 30s
    rules:
      - alert: DiskSpaceHigh
        expr: (1 - node_filesystem_avail_bytes{mountpoint=~"/|/var"} / node_filesystem_size_bytes{mountpoint=~"/|/var"}) * 100 > 85
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Disk usage > 85% (instance: {{ $labels.instance }})"
          description: "{{ $labels.mountpoint }} usage is {{ $value }}%"
      - alert: DiskSpaceCritical
        expr: (1 - node_filesystem_avail_bytes{mountpoint=~"/|/var"} / node_filesystem_size_bytes{mountpoint=~"/|/var"}) * 100 > 90
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "Disk usage > 90% (instance: {{ $labels.instance }})"
      - alert: InodeUsageHigh
        expr: (1 - node_filesystem_files_free / node_filesystem_files) * 100 > 90
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Inode usage > 90% (instance: {{ $labels.instance }})"
      - alert: DiskIOHigh
        expr: rate(node_disk_io_time_seconds_total[5m]) * 100 > 80
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Disk I/O utilization > 80% (instance: {{ $labels.instance }})"
D. Disk Health Check Script
#!/bin/bash
# SMART health check for all SATA and NVMe disks
for disk in /dev/sd? /dev/nvme?n?; do
    [[ -e $disk ]] || continue
    echo "=== Checking $disk ==="
    # Overall health
    smartctl -H "$disk" | grep -i "SMART overall-health"
    if [[ $disk == *nvme* ]]; then
        # NVMe devices report a health log instead of ATA attributes
        smartctl -A "$disk" | grep -E "Media and Data Integrity Errors|Percentage Used"
    else
        # Key ATA attributes
        smartctl -A "$disk" | grep -E "Reallocated_Sector|Current_Pending_Sector|Offline_Uncorrectable"
    fi
    echo ""
done
Raymond Ops
Linux ops automation, cloud-native, Kubernetes, SRE, DevOps, Python, Golang and related tech discussions.
