
Top 10 Linux Ops Troubleshooting Tips Every Sysadmin Should Know

This article compiles ten common Linux operational problems—from shell script failures and cron output issues to disk space leaks and MySQL storage errors—detailing their causes and step‑by‑step solutions to help engineers quickly diagnose and resolve system faults.

Efficient Ops

As a Linux operations engineer, encountering various problems and failures is inevitable; summarizing experiences, investigating root causes, and documenting solutions is a good habit that turns practice into valuable knowledge.

The following list gathers ten typical issues you may meet during projects, along with their causes and fixes.

Common Linux Issues and Solutions

1. Shell script does not execute

Problem: A colleague reports a simple shell script fails with "bad interpreter: No such file or directory".

Cause: The script was edited on Windows, introducing CRLF line endings (\r\n) which appear as ^M in Linux.

Solution:

Rewrite the script directly on Linux.

Remove the Windows line endings in vi with :%s/\r//g or :%s/^M//g (type ^M with Ctrl+V, Ctrl+M).

Use sh -x script.sh for step-by-step execution and debugging.
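The check and the fix can also be scripted outside vi. A minimal sketch, assuming a POSIX shell with tr and mktemp available; the function names has_crlf and strip_crlf are made up for illustration:

```shell
#!/bin/sh
# has_crlf FILE: succeed if FILE contains at least one carriage return.
has_crlf() {
    grep -q "$(printf '\r')" "$1"
}

# strip_crlf FILE: remove every carriage return from FILE in place.
# Writing back with cat keeps the original permissions and owner.
strip_crlf() {
    tmp=$(mktemp) || return 1
    tr -d '\r' < "$1" > "$tmp" && cat "$tmp" > "$1"
    status=$?
    rm -f "$tmp"
    return "$status"
}
```

A typical call would be has_crlf script.sh && strip_crlf script.sh, which leaves files without carriage returns untouched.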

2. Crontab output fills /var/spool/clientmqueue

Problem: The /var/spool/clientmqueue directory exceeds 100 GB.

Cause: Cron jobs produce output that is mailed to the cron user; because sendmail is not running, the messages accumulate as files.

Solution:

Manually delete the files from inside the queue directory: cd /var/spool/clientmqueue && ls | xargs rm -f

Suppress output in cron entries by appending >/dev/null 2>&1 to the command.
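If the directory holds a very large number of files, letting find delete them avoids building one long listing first. A sketch, assuming a find that supports -maxdepth and -delete (GNU and BSD find do); purge_queue is an illustrative name, and the cron line in the comment uses a placeholder script path:

```shell
#!/bin/sh
# purge_queue DIR: delete the regular files directly inside DIR, keep DIR itself.
purge_queue() {
    find "$1" -maxdepth 1 -type f -delete
}

# Example: purge_queue /var/spool/clientmqueue
#
# Crontab entry that discards both stdout and stderr
# (/usr/local/bin/nightly.sh is a placeholder path):
#   0 2 * * * /usr/local/bin/nightly.sh >/dev/null 2>&1
```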

3. Telnet/SSH is slow

Problem: Telnet to a remote host is sluggish, while ping works and DNS lookup fails.

Cause: Reverse DNS lookup on the client’s IP is timing out.

Solution:

Add the correct hostname-to-IP mapping to /etc/hosts.

Comment out the non-functional nameserver in /etc/resolv.conf or use a reliable one.
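To confirm that name resolution is the bottleneck, it can help to time the lookup itself. A rough sketch, assuming getent is installed (it ships with glibc); reverse_lookup_seconds is an illustrative name:

```shell
#!/bin/sh
# reverse_lookup_seconds IP: print how many whole seconds the system resolver
# needs to look up IP, using the sources configured in /etc/nsswitch.conf.
reverse_lookup_seconds() {
    start=$(date +%s)
    getent hosts "$1" >/dev/null 2>&1
    end=$(date +%s)
    echo $((end - start))
}
```

Run it on the server against the client's address. A result of several seconds points to a resolver timeout, and the value should drop once the /etc/hosts entry or a working nameserver is in place.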

4. Read‑only file system error (MySQL)

Problem: MySQL fails to create a table, reporting "ERROR 1005 (HY000): Can't create table … (errno: 30)" which indicates a read‑only file system.

Possible causes:

File system corruption.

Bad disk sectors.

Incorrect /etc/fstab entries (e.g., wrong file-system type).

Solution: Reboot the test machine or remount the file system; in some cases mount -o remount,rw /dev/… resolves the issue.
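Before remounting, it helps to confirm that the kernel really has the file system mounted read-only. A small sketch that reads /proc/mounts; is_readonly is an illustrative name, and the optional second argument exists only so the function can be tried against a sample file:

```shell
#!/bin/sh
# is_readonly MOUNTPOINT [MOUNTS_FILE]: succeed if MOUNTPOINT is mounted read-only.
# MOUNTS_FILE defaults to /proc/mounts; its fields are
# device, mount point, type, options, dump, pass.
is_readonly() {
    awk -v mp="$1" '
        $2 == mp {
            ro = 0
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++) if (opts[i] == "ro") ro = 1
        }
        END { exit ro ? 0 : 1 }
    ' "${2:-/proc/mounts}"
}
```

Options are compared as whole words, so an entry such as errors=remount-ro does not count as read-only by itself.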

5. Deleted file does not free disk space

Problem: df -h shows 90 GB used, but du -sh * accounts for only 30 GB.

Cause: A process still holds an open file descriptor to a deleted file.

Solution:

Identify the offending process:

<code>/usr/sbin/lsof | grep deleted</code>

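Restart or terminate the process to release the space. If it cannot be stopped, truncate the deleted file through its descriptor under /proc, e.g., echo > /proc/25575/fd/33 (25575 is the PID and 33 the descriptor number, both taken from the lsof output).

To avoid the problem in the first place, empty a file that is still in use instead of deleting it: cat /dev/null > file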
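When lsof is not installed, the same information can be read from /proc. A minimal Linux-only sketch; deleted_fds is an illustrative name:

```shell
#!/bin/sh
# deleted_fds PID: list the descriptors of PID that point at deleted files.
# The kernel appends " (deleted)" to the symlink target under /proc/PID/fd.
deleted_fds() {
    ls -l "/proc/$1/fd" 2>/dev/null | grep '(deleted)$'
}
```

Each matching line shows the descriptor number and the original path, which is what a /proc/PID/fd/N path is built from.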

6. Improve performance of find cleanup script

Problem: A nightly find command that deletes old picture_* files causes high load.

Cause: Scanning a directory with many entries is resource‑intensive.

Solution: Use a more efficient shell pipeline:

<code>#!/bin/sh
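# Delete picture_* files in /tmp dated exactly two days ago.
cd /tmp || exit 1
# ls -l prints dates as "Mon DD" with a space-padded day; LC_ALL=C fixes the month names.
time=$(LC_ALL=C date -d "2 days ago" "+%b %e")
LC_ALL=C ls -l | grep "picture_" | grep " $time " | awk '{print $NF}' | xargs rm -f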
</code>
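If find is kept, limiting it to one directory level and letting it delete the matches itself avoids spawning extra processes. A sketch for comparison, not the author's script; it assumes GNU find, and cleanup_old_pictures is an illustrative name:

```shell
#!/bin/sh
# cleanup_old_pictures DIR DAYS: delete picture_* files directly inside DIR
# whose age, rounded down to whole days, is greater than DAYS.
cleanup_old_pictures() {
    find "$1" -maxdepth 1 -type f -name 'picture_*' -mtime +"$2" -delete
}

# Example: cleanup_old_pictures /tmp 2
```

Unlike a match on a single listing date, this also removes files left over from earlier nights.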

7. Unable to obtain gateway MAC address

Problem: ARP fails to retrieve the MAC address of the gateway.

Solution:

Bind a static ARP entry:

arp -s 192.168.3.254 00:00:5e:00:01:64
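Whether the gateway entry is missing, or present after the static binding, can be checked in the kernel's ARP table. A small sketch that reads /proc/net/arp on Linux; arp_mac is an illustrative name, and the optional file argument is only there for trying it against sample data:

```shell
#!/bin/sh
# arp_mac IP [ARP_FILE]: print the MAC address the kernel has cached for IP.
# Fails if there is no entry or the entry is incomplete (all-zero address).
# ARP_FILE defaults to /proc/net/arp; its columns are
# IP address, HW type, flags, HW address, mask, device.
arp_mac() {
    awk -v ip="$1" '
        NR > 1 && $1 == ip && $4 != "00:00:00:00:00:00" { print $4; found = 1 }
        END { exit found ? 0 : 1 }
    ' "${2:-/proc/net/arp}"
}
```

For example, arp_mac 192.168.3.254 exits non-zero while the gateway's address is unresolved.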

8. HTTP service fails to start (port 7080)

Problem: Starting httpd reports "Address already in use" for port 7080.

Cause: The port looks occupied, yet netstat -npl | grep 7080 shows nothing listening on it. The real culprit is that the same port is defined in multiple configuration files, so httpd tries to bind it twice.

Solution: Comment out the duplicate Listen 7080 line in /etc/httpd/conf.d/t.10086.cn.conf and restart the service.
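Duplicate directives are easy to spot by searching the whole configuration tree. A minimal sketch, assuming a grep that supports -r and -E; find_listen is an illustrative name, and the directories in the usage line are the usual httpd layout rather than something confirmed for this host:

```shell
#!/bin/sh
# find_listen PORT DIR...: print file:line:text for every active Listen
# directive that binds PORT. Commented-out lines are skipped, and an optional
# address prefix such as 0.0.0.0: or [::]: is accepted.
find_listen() {
    port=$1
    shift
    grep -rnE "^[[:space:]]*Listen[[:space:]]+([^[:space:]]*:)?${port}([[:space:]]|\$)" "$@"
}

# Example: find_listen 7080 /etc/httpd/conf /etc/httpd/conf.d
```

More than one line of output for the same port means the directive is declared more than once.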

9. "Too many open files" error

Problem: System reports "too many open files".

Solution: Increase file descriptor limits:

<code>echo "" >> /etc/security/limits.conf
echo "* soft nproc 65535" >> /etc/security/limits.conf
echo "* hard nproc 65535" >> /etc/security/limits.conf
echo "* soft nofile 65535" >> /etc/security/limits.conf
echo "* hard nofile 65535" >> /etc/security/limits.conf

echo "" >> /root/.bash_profile
echo "ulimit -n 65535" >> /root/.bash_profile
echo "ulimit -u 65535" >> /root/.bash_profile
</code>

Then log in again so the new limits take effect (or reboot), or apply them to the current shell with ulimit -u 65535 && ulimit -n 65535.
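Before raising limits, it is worth checking how close a process actually is to its ceiling. A Linux-only sketch that reads /proc; fd_usage is an illustrative name:

```shell
#!/bin/sh
# fd_usage PID: print "<open descriptors> <soft nofile limit>" for PID.
fd_usage() {
    open=$(ls "/proc/$1/fd" 2>/dev/null | wc -l)
    # In /proc/PID/limits the "Max open files" row lists the soft limit, then the hard limit.
    soft=$(awk '/^Max open files/ { print $4 }' "/proc/$1/limits")
    echo "$open $soft"
}

# Example: fd_usage $$
```

Reading another user's process this way generally requires root.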

10. ibdata1 and mysql‑bin logs consume disk space

Problem: Disk usage alarm; ibdata1 is over 120 GB and the mysql-bin logs are over 80 GB.

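Cause:

ibdata1 is the InnoDB shared tablespace; unless innodb_file_per_table is enabled, it holds the data and indexes of every InnoDB table, and it does not shrink when rows or tables are deleted.

Binary logs accumulate over time.

Solution:

For an oversized ibdata1, dump all databases, stop MySQL, delete the file, restart the server so the tablespace is recreated, and reload the dump.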

To prune binary logs:

<code>mysql> PURGE MASTER LOGS TO 'mysql-bin.010';
mysql> PURGE MASTER LOGS BEFORE '2010-12-22 13:00:00';
</code>

Or set expire_logs_days=30 in /etc/my.cnf for automatic cleanup.
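To see how much space the binary logs occupy before and after a purge, their sizes can be summed from the shell. A sketch assuming GNU find (for -printf); binlog_bytes is an illustrative name, and the data directory in the usage line is a common default rather than something the article states:

```shell
#!/bin/sh
# binlog_bytes DIR [BASENAME]: print the total size in bytes of the binary logs in DIR.
# Only names of the form BASENAME.<digit>... are counted, so BASENAME.index is skipped.
binlog_bytes() {
    find "$1" -maxdepth 1 -type f -name "${2:-mysql-bin}.[0-9]*" -printf '%s\n' |
        awk '{ total += $1 } END { print total + 0 }'
}

# Example: binlog_bytes /var/lib/mysql
```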

Fault‑troubleshooting summary table

Written by

Efficient Ops

This public account is maintained by Xiaotianguo and friends, regularly publishing widely-read original technical articles. We focus on operations transformation and accompany you throughout your operations career, growing together happily.
