Top 10 Linux Ops Troubleshooting Tips Every Sysadmin Should Know
This article compiles ten common Linux operational problems—from shell script failures and cron output issues to disk space exhaustion and MySQL errors—detailing their causes and step‑by‑step solutions to help sysadmins quickly diagnose and resolve system faults.
Common Linux Operations Issues
1. Shell script not executing
Problem: A colleague reports a simple shell script fails with "bad interpreter: No such file or directory".
Cause: The script was edited on Windows, so each line ends with CRLF (\r\n) and contains hidden ^M characters.
Solution: 1) Rewrite the script directly on Linux. 2) Strip the carriage returns in vi with :%s/\r//g or :%s/^M//g (type ^M as Ctrl+v Ctrl+m). Additionally, run the script with sh -x scriptname to trace execution.
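The same conversion can be done without opening an editor. The following is a minimal sketch, assuming a standard tr is available; the function names has_crlf and strip_cr are made up for illustration and do not come from the article.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative only.

# Return 0 if the file contains at least one carriage return.
has_crlf() {
    # tr -cd keeps only CR bytes; wc -c counts them.
    [ "$(tr -cd '\r' < "$1" | wc -c)" -gt 0 ]
}

# Remove every carriage return from the file in place.
strip_cr() {
    tmp="$1.nocr.$$"
    # Writing back with cat keeps the original inode and permissions,
    # so an executable script stays executable.
    tr -d '\r' < "$1" > "$tmp" && cat "$tmp" > "$1"
    rm -f "$tmp"
}
```

A typical call would be has_crlf scriptname && strip_cr scriptname, followed by sh -x scriptname to confirm that the interpreter line is now found.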
2. Crontab output filling /var/spool/clientmqueue
Problem: The directory /var/spool/clientmqueue exceeds 100 GB.
Cause: Cron jobs produce output that is mailed to the cron user; sendmail is not running, so the output accumulates as files.
Solution: 1) Manually delete the files from inside the directory: cd /var/spool/clientmqueue && ls | xargs rm -f. 2) Suppress future output by appending >/dev/null 2>&1 to each cron command.
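When the directory holds a very large number of files, find can delete them directly instead of piping ls into xargs. This is a sketch under assumptions: the function name purge_queue_files is hypothetical, and it relies on the -maxdepth and -delete options found in GNU and BSD find.

```shell
#!/bin/sh
# Hypothetical helper: delete regular files directly under a queue directory.
# The default path is the one named in the article; pass another to override.
purge_queue_files() {
    dir="${1:-/var/spool/clientmqueue}"
    # Refuse to run if the directory is missing, so a typo cannot
    # send find somewhere unexpected.
    [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }
    # -maxdepth 1 stays in this directory, -type f skips subdirectories,
    # -delete removes each match without spawning rm.
    find "$dir" -maxdepth 1 -type f -delete
}
```

Because the path is passed explicitly to find, the command does not depend on the current working directory.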
3. Telnet/SSH very slow
Problem: Telnet to a remote host is extremely slow, while ping works.
Cause: Reverse DNS lookup fails for the client IP, causing delays.
Solution: 1) Add proper hostname-to-IP mappings in /etc/hosts. 2) Comment out or replace the non‑functional nameserver in /etc/resolv.conf.
4. MySQL "Read‑only file system" error (errno 30)
Problem: Creating a MySQL table fails with ERROR 1005 (HY000): Can't create table … (errno: 30).
Possible causes: 1) File system corruption. 2) Bad disk sectors. 3) Incorrect fstab entries (e.g., wrong filesystem type or typo).
Solution: 1) Reboot the test machine to recover. 2) Remount the filesystem with mount -o remount,rw /mountpoint if appropriate.
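Before remounting, it can help to confirm which filesystems are currently mounted read-only. The sketch below assumes the standard six-field layout of /proc/mounts; the function name list_readonly_mounts is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print mount points whose options include "ro".
# Reads /proc/mounts by default; another file can be passed instead.
list_readonly_mounts() {
    mounts="${1:-/proc/mounts}"
    # Fields: device, mount point, fs type, options, dump, pass.
    awk '{
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++)
            if (opts[i] == "ro") { print $2; break }
    }' "$mounts"
}
```

If the MySQL data directory sits on one of the listed mount points, that is the filesystem to inspect and remount. The comparison is an exact match on "ro", so an option such as errors=remount-ro does not produce a false positive.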
5. Deleted files not freeing disk space
Problem: df -h shows 90 GB used, but du -sh /* totals only 30 GB.
Cause: A process still holds an open file that was deleted, so the kernel cannot release its blocks.
Solution: 1) Identify the offending process:
<code>/usr/sbin/lsof | grep deleted</code>
2) Terminate the process (e.g., kill -9 25575), or keep it running and empty the deleted file through its descriptor: echo > /proc/25575/fd/33. 3) To reclaim space from a file that is still being written, truncate it instead of deleting it: cat /dev/null > file.
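Where lsof is not installed, the same information is available under /proc. This is a minimal sketch, assuming a Linux /proc filesystem and the readlink utility; the function name list_deleted_fds is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list descriptors of one process that still point
# at deleted files, using /proc instead of lsof.
list_deleted_fds() {
    pid="$1"
    for fd in /proc/"$pid"/fd/*; do
        # For a deleted file, the link target is the old path
        # followed by " (deleted)".
        target=$(readlink "$fd" 2>/dev/null) || continue
        case "$target" in
            *" (deleted)") echo "$fd -> $target" ;;
        esac
    done
}
```

Running list_deleted_fds 25575 would print the /proc/25575/fd/N paths that can then be emptied as described in step 2.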
6. Inefficient find for cleaning temporary pictures
Problem: A nightly find script that deletes picture_* files consumes excessive CPU.
Cause: Scanning a directory with many files is resource‑intensive.
Solution: Use a more efficient approach:
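<code>#!/bin/sh
# Delete picture files in /tmp whose ls -l date is two days ago.
cd /tmp || exit 1
# ls -l prints "Mon DD" with a space-padded day, so use "%b %e" rather than "%b%d".
day=$(LC_ALL=C date -d "2 days ago" "+%b %e")
LC_ALL=C ls -l | grep "picture" | grep "$day" | awk '{print $NF}' | xargs rm -rf</code>
7. Unable to obtain gateway MAC address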
Problem: ARP fails to retrieve the MAC address of the gateway (e.g., 192.168.3.254).
Cause: ARP entry is incomplete or the network device is misconfigured.
Solution: Bind a static ARP entry, e.g., arp -s 192.168.3.254 00:00:5e:00:01:64 (substitute the gateway's actual six‑octet MAC address).
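To see which neighbours are unresolved before binding anything, the kernel's ARP table can be read directly. The sketch assumes the Linux /proc/net/arp layout, in which an incomplete entry carries the flag value 0x0; the function name list_incomplete_arp is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print "IP device" for every incomplete ARP entry.
# Reads /proc/net/arp by default; another file can be passed instead.
list_incomplete_arp() {
    table="${1:-/proc/net/arp}"
    # Columns: IP address, HW type, Flags, HW address, Mask, Device.
    # NR > 1 skips the header row; flag 0x0 marks an unresolved entry.
    awk 'NR > 1 && $3 == "0x0" { print $1, $6 }' "$table"
}
```

If the gateway address appears in the output, the host has sent ARP requests for it and received no reply.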
8. HTTP service fails to start (port 7080 in use)
Problem: Starting Apache httpd reports "Address already in use" for port 7080.
Cause: The port is defined multiple times in configuration files.
Solution: Comment out the duplicate Listen 7080 line in /etc/httpd/conf.d/t.10086.cn.conf and restart the service.
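Locating the duplicate is a matter of searching every configuration file for the directive. The following is a sketch, assuming a grep that supports -r and -E; the function name find_listen and the default directory /etc/httpd are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: show every active "Listen <port>" directive under a
# configuration directory, with file name and line number.
find_listen() {
    port="$1"
    dir="${2:-/etc/httpd}"
    # Matches "Listen 7080" and "Listen addr:7080". Commented-out lines are
    # skipped because only whitespace may precede "Listen".
    grep -rniE "^[[:space:]]*Listen[[:space:]]+([^[:space:]]*:)?${port}([[:space:]]|\$)" "$dir"
}
```

More than one line of output for find_listen 7080 means the port is declared more than once, and all but one of those lines should be commented out.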
9. "Too many open files" error
Problem: Applications hit the "too many open files" limit.
Solution: Increase limits:
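<code># Persistent limits for all users (applied by pam_limits at the next login)
echo "* soft nproc 65535" >> /etc/security/limits.conf
echo "* hard nproc 65535" >> /etc/security/limits.conf
echo "* soft nofile 65535" >> /etc/security/limits.conf
echo "* hard nofile 65535" >> /etc/security/limits.conf
# Raise the limits for the current shell as well
ulimit -n 65535
ulimit -u 65535</code>
The limits.conf entries take effect at the next login session, so log in again and restart the affected application; a reboot also works.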
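To check whether a running process is actually close to its limit, or whether a new limit has reached it, compare its open descriptor count with the limit the kernel reports. This is a minimal sketch, assuming a Linux /proc filesystem; the function name fd_usage is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print "<open descriptors> <soft limit>" for a process.
fd_usage() {
    pid="$1"
    # Each entry in /proc/<pid>/fd is one open descriptor.
    open=$(ls /proc/"$pid"/fd 2>/dev/null | wc -l)
    # In the "Max open files" row, the fourth field is the soft limit.
    limit=$(awk '/^Max open files/ { print $4 }' /proc/"$pid"/limits)
    echo "$open $limit"
}
```

A process started before the change keeps its old limit, which is why the application has to be restarted after editing limits.conf.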
10. ibdata1 and mysql‑bin logs consuming disk space
Problem: ibdata1 >120 GB and mysql‑bin logs >80 GB fill the disk.
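Cause: With innodb_file_per_table disabled, the InnoDB shared tablespace (ibdata1) stores all data and indexes and never shrinks; binary logs accumulate over time.
Solution:
For ibdata1: Dump all databases, stop MySQL, delete ibdata1 together with the ib_logfile* files, then restart so the tablespace is recreated and reload the dump. Setting innodb_file_per_table=1 in my.cnf before the reload keeps new tables out of ibdata1.
For binary logs: Manually purge old logs with PURGE MASTER LOGS TO 'mysql-bin.010'; or PURGE MASTER LOGS BEFORE '2010-12-22 13:00:00';, and set expire_logs_days=30 in my.cnf for automatic cleanup.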
Efficient Ops
This public account is maintained by Xiaotianguo and friends, regularly publishing widely-read original technical articles. We focus on operations transformation and accompany you throughout your operations career, growing together happily.