
Proper Server Reboot Practices: Lessons from a Kernel Upgrade Incident

After a painful P1 incident caused by a kernel upgrade, this article outlines how to properly reboot a Linux server by documenting pre‑restart state, verifying post‑restart consistency with process and port snapshots, and using simple commands such as ps, netstat, diff, and last to ensure service continuity.

Qunar Tech Salon

From "Lao He's 1001 Nights" (《老何的1001夜》); Qunar colleagues will know the reference.

Triggered by an Incident

A kernel upgrade caused a P1 incident, and it was a painful lesson.

What Is a Proper Server Reboot?

A proper reboot means that after the server restarts, everything it exposes externally behaves exactly as it did before, apart from the expected changes (e.g., the upgraded kernel version).

When You Know the Server's Users

If you know who uses the server, start with communication: ask the users what services run, list the processes, and after reboot verify the list matches.
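One way to make that check mechanical, as a minimal sketch: keep the names the users gave you in a text file, one per line, and look each one up in a saved process list. The helper name `missing_services` and the file `expected_services.txt` are hypothetical, chosen only for illustration, and the match is a plain substring search, so a short name can match more than you intend.

```shell
#!/bin/sh
# missing_services EXPECTED_FILE PROCESSLIST_FILE
# Prints every name from EXPECTED_FILE (one per line) that does not appear
# anywhere in PROCESSLIST_FILE, and returns 1 if at least one is missing.
# Both file names are hypothetical; PROCESSLIST_FILE is assumed to hold
# saved `ps -ef` output.
missing_services() {
    expected_file=$1
    processlist_file=$2
    status=0
    # The "|| [ -n ... ]" part keeps a final line that lacks a newline.
    while IFS= read -r name || [ -n "$name" ]; do
        # Skip blank lines in the checklist.
        [ -n "$name" ] || continue
        # -F: match a fixed string, -q: no output, --: end of options.
        if ! grep -F -q -- "$name" "$processlist_file"; then
            printf 'missing: %s\n' "$name"
            status=1
        fi
    done < "$expected_file"
    return "$status"
}
```

After the reboot, `missing_services expected_services.txt processlist_after_reboot` would print one `missing:` line for each service that did not come back.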

When You Know Nothing About the Server

If you have no prior knowledge, follow this logical procedure:

Capture a snapshot of the system before reboot.

After reboot, compare the current snapshot with the pre‑restart snapshot.

If the two snapshots match, the reboot is successful.

If they differ, investigate the differences or consult the last logged‑in user.

How to Capture a System Snapshot

The simplest way is to record running processes and listening ports on Linux using the following commands:

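# Run before the reboot; afterwards, rerun with the _after_reboot suffix.
sudo ps -ef | sort -k 8 > processlist_before_reboot
sudo netstat -an | grep LIST | sort -k 5 > portlist_before_reboot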

Generally, a system snapshot consists of its process list and port list; if both are identical before and after reboot, the system state can be considered unchanged (application‑level issues are analyzed separately).
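A practical wrinkle: `ps -ef` output includes the PID, PPID, STIME, and TIME columns, which change on every boot, and the Unix socket lines in `netstat -an` carry inode numbers that change as well. A raw comparison will therefore report differences even when nothing meaningful changed. The sketch below narrows a saved snapshot to the stable parts. The function names `normalize_ps` and `normalize_ports` are hypothetical, and dropping bracketed entries, duplicate lines, and Unix sockets is an assumption of this sketch, not part of the original procedure.

```shell
#!/bin/sh
# normalize_ps: reads `ps -ef` output on stdin and prints "USER COMMAND"
# lines, sorted with duplicates collapsed.
normalize_ps() {
    awk '
        $2 == "PID" { next }      # drop the header line
        $8 ~ /^\[/  { next }      # drop bracketed entries (typically kernel threads)
        {
            out = $1              # keep the owner
            for (i = 8; i <= NF; i++) out = out " " $i   # keep the command line
            print out
        }
    ' | LC_ALL=C sort -u
}

# normalize_ports: reads `netstat -an` output on stdin and prints
# "PROTO LOCAL_ADDRESS" for listening TCP sockets only.
normalize_ports() {
    awk '$1 ~ /^tcp/ && $NF == "LISTEN" { print $1, $4 }' | LC_ALL=C sort -u
}
```

For example, `normalize_ps < processlist_before_reboot > processlist_before_reboot.norm` produces a file that can be compared across reboots without PID noise. Collapsing duplicates hides a change in the number of worker processes; replace `sort -u` with `sort | uniq -c` if that count matters.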

How to Compare System Snapshots

Use the diff command to see differences:

diff -c processlist_before_reboot processlist_after_reboot
diff -c portlist_before_reboot portlist_after_reboot
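To turn the comparison into a single pass/fail check, a small wrapper can run `diff` on both lists and report through its exit status. This is a minimal sketch: the function name `compare_snapshots` is hypothetical, and it assumes the snapshot files sit in the current directory and are named `processlist_<suffix>` and `portlist_<suffix>`.

```shell
#!/bin/sh
# compare_snapshots BEFORE_SUFFIX AFTER_SUFFIX
# Runs diff on the process list and the port list and returns 0 only if
# both pairs are identical. File naming is an assumption of this sketch.
compare_snapshots() {
    before=$1
    after=$2
    status=0
    for kind in processlist portlist; do
        # diff exits 0 when the files match, 1 when they differ, and 2 on
        # trouble such as a missing file; anything non-zero counts as failure.
        if diff -c "${kind}_${before}" "${kind}_${after}"; then
            printf '%s: unchanged\n' "$kind"
        else
            printf '%s: differs, investigate\n' "$kind"
            status=1
        fi
    done
    return "$status"
}
```

Calling `compare_snapshots before_reboot after_reboot` prints any context diff followed by a one-line verdict per list, so the result can also gate a script, for example `compare_snapshots before_reboot after_reboot || echo "reboot needs review"`.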

Find Recent Users of the System

The last command shows recent logins. If you don't know who owns the server, run:

last

Review the most recent logins; if you recognize a username, that person is the one to ask about the server.
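On a busy server the output of `last` can be long. As a rough sketch, the filter below reduces it to distinct usernames, most recent first. The function name `recent_users` is hypothetical, and the filter assumes the usual layout in which the first field of each login line is the username.

```shell
#!/bin/sh
# recent_users: reads `last` output on stdin and prints each username once,
# in order of first appearance (last lists the newest logins first).
recent_users() {
    awk '
        NF == 0 { next }                          # skip the blank line
        $1 == "reboot" || $1 == "shutdown" { next }   # skip pseudo-users
        $1 == "runlevel" || $1 == "wtmp"   { next }   # skip runlevel and trailer lines
        !seen[$1]++ { print $1 }                  # print a name the first time only
    '
}
```

A typical call would be `last -n 50 | recent_users`, which looks at the 50 most recent entries.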

Conclusion

The core principle remains unshakable: for any change, you must know the expected outcome and devise a method to verify that the result is correct.

He Weiping

Qunar / Travel Vacation Division – Search technology researcher and database researcher, translator of the first Chinese PostgreSQL manual and the third edition of Programming Perl. Focuses on search, distributed systems, databases, and cluster design. 18 years of IT experience.
