Troubleshooting systemd MySQL Service Hang Caused by Premature kill -9 in Forking Mode

This article analyzes an intermittent failure in which a MySQL instance managed by systemd in forking mode hangs at startup. An automated test kills the newly created mysqld process too early, so systemd cannot find the main PID and reports it as missing or a zombie. The article walks through the investigation, a reproduction, and a practical fix.

Aikesheng Open Source Community

May 9, 2024

In an automated testing environment running MySQL 8.0.34 inside a CentOS 8 Docker container, the systemd service for mysqld sometimes hangs during startup. The service uses the forking start mode, and the ExecStart command launches /opt/mysql/base/8.0.34/bin/mysqld with appropriate options.

The symptom is that systemctl start hangs indefinitely with no output, and the MySQL error log contains no related entries. systemctl status reports that the service failed because the MAIN PID does not exist or is a zombie:

New main PID 31036 does not exist or is a zombie
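
To check which of the two cases in that message applies, the PID can be classified by hand. The following is a minimal sketch, assuming a procps-style ps is available in the container; pid_state is a hypothetical helper name and is not part of systemd or MySQL.

```shell
#!/bin/sh
# pid_state PID
# Print "missing", "zombie", or "alive" for the given PID.
# Hypothetical helper for illustration only.
pid_state() {
    # ps prints the state code (for example S, R, Z) or nothing if the PID is gone.
    state=$(ps -o stat= -p "$1" 2>/dev/null | tr -d ' ')
    case "$state" in
        "") echo "missing" ;;
        Z*) echo "zombie" ;;
        *)  echo "alive" ;;
    esac
}

# Example with the PID from the log line above:
# pid_state 31036
```

In the scenario described here the process was killed with kill -9, so either "missing" or "zombie" is plausible depending on whether the parent has reaped it yet.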

Investigation revealed that after mysqld creates its mysqld.pid file, an automated test executes kill -9 $(cat /opt/mysql/data/11690/mysqld.pid), terminating the process before systemd can confirm the forked child. Consequently, systemd cannot locate the PID and treats the service as failed.
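
One way a test could avoid this race is to wait until systemd itself reports the unit as active before sending the signal. This is only a sketch under assumptions: the article does not show the test code, wait_until is a hypothetical helper, and the 30-second timeout is arbitrary. systemctl is-active --quiet returns a non-zero status while the unit is still activating.

```shell
#!/bin/sh
# wait_until TIMEOUT_SECONDS COMMAND [ARGS...]
# Re-run COMMAND once per second until it succeeds or the timeout expires.
# Hypothetical helper, not an existing tool.
wait_until() {
    remaining=$1
    shift
    while ! "$@"; do
        [ "$remaining" -gt 0 ] || return 1
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}

# Assumed usage in the test (unit name and PID file path are from the article):
# wait_until 30 systemctl is-active --quiet mysqld_11690.service &&
#     kill -9 "$(cat /opt/mysql/data/11690/mysqld.pid)"
```

Gating on the unit state rather than on the PID file means the kill only happens after systemd has accepted the forked child as the main process.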

Key findings:

The mysqld.pid file exists, confirming the process was started.

The test case kills the process before systemd finishes its ExecStartPost phase. systemd then cannot find the main PID and reports that it does not exist or is a zombie.

To reproduce the issue, the service template was modified to add a sleep 10 after the ExecStart command, allowing a window to manually kill the process. The steps are:

Edit /etc/systemd/system/mysqld_11690.service and insert sleep 10 after the start command.

Run systemctl daemon-reload to apply changes.

Start the service in one SSH session and, in another session, kill the PID read from mysqld.pid as soon as the file appears.

Observe the same hanging behavior and the "New main PID ... does not exist or is a zombie" message.
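
The manual kill in the second session can be scripted so that the timing is repeatable. This is a rough sketch, not the article's own script: kill_when_pidfile_appears is a hypothetical helper, and it assumes no stale mysqld.pid is left over from an earlier run, since an existing file would trigger the kill immediately.

```shell
#!/bin/sh
# kill_when_pidfile_appears PID_FILE
# Poll until PID_FILE exists and is non-empty, then send SIGKILL to the PID in it.
# Hypothetical helper for the second SSH session of the reproduction.
kill_when_pidfile_appears() {
    pid_file=$1
    while [ ! -s "$pid_file" ]; do
        sleep 1
    done
    kill -9 "$(cat "$pid_file")"
}

# Session 1: systemctl start mysqld_11690.service
# Session 2: kill_when_pidfile_appears /opt/mysql/data/11690/mysqld.pid
```

The one-second polling interval fits inside the 10-second window opened by the added sleep.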

The resolution is to kill the hanging systemctl start command, then issue systemctl stop mysqld_11690.service to let systemd clean up the zombie, and finally restart the service. This clears the stale MAIN PID and restores normal operation.
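
The recovery sequence above could be wrapped as follows. This is a sketch under assumptions: recover_unit and the RUN variable are hypothetical names, and the pkill pattern assumes the hanging client was started literally as "systemctl start" followed by the unit name. Setting RUN=echo prints the commands instead of executing them.

```shell
#!/bin/sh
# recover_unit [UNIT]
# Recover a unit whose "systemctl start" is hanging.
# UNIT defaults to the unit name used in the article.
# RUN is a hypothetical preview hook: RUN=echo prints each command instead of running it.
recover_unit() {
    unit=${1:-mysqld_11690.service}
    # 1. Kill the hanging "systemctl start" client, if one is still running.
    ${RUN:-} pkill -f "systemctl start $unit" || true
    # 2. Stop the unit so systemd can clean up the dead or zombie main process.
    ${RUN:-} systemctl stop "$unit"
    # 3. Start the unit again and show its state.
    ${RUN:-} systemctl start "$unit"
    ${RUN:-} systemctl --no-pager status "$unit"
}

# Preview first, then run for real:
# RUN=echo recover_unit mysqld_11690.service
# recover_unit mysqld_11690.service
```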

Although the article focuses on MySQL, the core lesson is a systematic approach to diagnosing intermittent service failures in Linux containers.

