Deep Dive into MySQL KILL Command: Execution Flow, Thread States, and Root Causes of Prolonged Killed Status
This article analyzes why MySQL connections remain in a KILLED state after a KILL command, detailing the internal execution flow, thread‑state handling, signal‑based wake‑ups, and reproducing cases that expose rollback, lock‑wait, and concurrency issues that delay connection termination.
Background : The MySQL KILL command does not always terminate a connection instantly; sometimes the thread stays in a KILLED state for a long period. The article investigates the MySQL 5.7 code path for KILL and explains the underlying reasons.
Execution Flow Overview : MySQL starts in mysqld::main , creates a listening thread, accepts TCP connections, and builds a THD (thread descriptor) for each client. When a user issues KILL <connection_id> , the client thread sends the command to the server, which looks up the target connection and invokes sql_kill() → kill_one_thread() → THD::awake() .
KILL Workflow :
1. Set the thread's killed flag (unless the thread is idle and the request is KILL QUERY ).
2. Close the socket and interrupt any engine‑level waits ( shutdown_active_vio() , ha_kill_connection() ).
3. Broadcast a condition variable to wake threads waiting on current_cond .
4. The target thread either detects the flag at an active check point or is awakened by the broadcast and proceeds to clean up.
Thread States and Wake‑up Mechanisms :
connection : the socket connection.
pthread : the OS thread handling the connection.
THD : MySQL’s thread descriptor; its killed field signals termination.
Two main ways a killed thread can react:
Active check points : many code paths (e.g., SELECT row fetch, InnoDB index reads) periodically test thd->killed and abort.
Signal‑based wake‑up : the killer thread broadcasts current_cond , waking a thread blocked in a lock or condition wait.
Root Cause Summary : A thread may stay KILLED because it is rolling back a large transaction, cleaning temporary tables, or waiting for a lock while the server has not yet reached a check point. High server load can also delay the check‑point or the signal delivery.
Reproduction Cases :
Case 1: Simple kill when the target connection has already finished its command.
Case 2: Kill during the parse phase before InnoDB entry.
Case 3: InnoDB‑level wait (row lock) – killing the connection triggers ha_kill_connection() which sets the lock’s event and wakes the waiting thread.
Case 4: MySQL‑level condition wait – the killer broadcasts the same condition object, waking the target.
Each case includes stack traces (preserved in ... blocks) that show the exact functions involved, such as trx_is_interrupted , row_search_mvcc , and lock_wait_suspend_thread .
Real‑World Incident : In production, killing many connections under heavy load caused the connection count to rise because the client library immediately re‑connected after the socket was closed, while the server‑side threads remained in KILLED state awaiting rollback or lock release. The article explains how innodb_thread_concurrency limits and missing killed‑state checks in the srv_conc_enter_innodb loop exacerbate the problem.
Takeaways :
Closing the socket first can cause rapid client reconnection, inflating connection counts.
Long‑running rollbacks or lock waits keep threads in KILLED state.
Ensuring timely check‑points or using proper signal broadcasting is essential for reliable KILL behavior.
Yuanfudao Tech
Official Yuanfudao technology account, using tech to empower education development.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.