Why Does TIME_WAIT Accumulate in High‑Concurrency Scenarios and How to Fix It?
Under high‑concurrency load, large numbers of TCP connections in TIME_WAIT can exhaust local ports and cause “address already in use” errors. This article explains the root causes, the impact on services, and practical mitigations such as keep‑alive headers, socket reuse, and reducing the TIME_WAIT timeout.
In simulated high‑concurrency scenarios many TCP connections enter the TIME_WAIT state, temporarily consuming local ports.
After a short period these connections are reclaimed, but in sustained traffic new TIME_WAIT connections keep appearing, and in extreme cases large numbers accumulate.
Think: what business impact can a large number of TIME_WAIT connections have?
When Nginx acts as a reverse proxy, numerous short‑lived connections can leave many sockets in TIME_WAIT, each occupying a local port (at most 65535 are available). Once enough ports are tied up this way, new connections may fail with “address already in use”.
How to Inspect TCP Connection States
<code># Count connections by state
$ netstat -n | awk '/^tcp/ {++S[$NF]} END {for(a in S) print a, S[a]}'
ESTABLISHED 1154
TIME_WAIT 1645
</code>
Tip: The local port limit is 65535 because the TCP header uses a 16‑bit field for the port number.
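The awk one‑liner above groups rows by their last column. For readers who prefer a script, here is a minimal Python sketch of the same counting logic. The function name `count_tcp_states` and the `SAMPLE` text are illustrative assumptions; only the first data row comes from the article's own output.

```python
from collections import Counter

def count_tcp_states(netstat_output: str) -> Counter:
    """Count TCP connections per state from `netstat -n` style text."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Keep only TCP rows (tcp, tcp4, tcp6). The state is the last column,
        # which is what `$NF` selects in the awk version.
        if fields and fields[0].startswith("tcp"):
            counts[fields[-1]] += 1
    return counts

# Illustrative input; rows after the first TIME_WAIT row are made up.
SAMPLE = """\
Proto Recv-Q Send-Q Local Address Foreign Address (state)
tcp4 0 0 127.0.0.1.1080 127.0.0.1.59061 TIME_WAIT
tcp4 0 0 127.0.0.1.1080 127.0.0.1.59062 TIME_WAIT
tcp4 0 0 192.168.0.2.52011 10.0.0.5.443 ESTABLISHED
udp4 0 0 *.5353 *.*
"""

if __name__ == "__main__":
    for state, n in count_tcp_states(SAMPLE).most_common():
        print(state, n)
```

In practice the input would come from running `netstat -n` and passing its captured output to the function.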
Root Causes of Excessive TIME_WAIT
Large number of short connections.
HTTP requests with Connection: close cause the server to actively close the TCP connection.
TCP’s four‑way handshake keeps the side that initiates the close in TIME_WAIT for twice the Maximum Segment Lifetime (MSL).
What TIME_WAIT Means
The side that actively closes the connection enters TIME_WAIT after receiving the FIN and sending the final ACK.
The state lasts for 2 × MSL: 4 minutes with the 2‑minute MSL defined in RFC 793, and less on implementations that use a shorter MSL.
Mitigation Strategies
Clients should use Connection: keep-alive to keep connections alive.
Servers can allow reuse of sockets in TIME_WAIT and reduce the TIME_WAIT timeout to 1 MSL (≈2 minutes).
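To make the keep‑alive advice concrete, the following sketch starts a throwaway HTTP server on the loopback interface and compares how many client source ports are consumed with and without connection reuse. It is a demonstration under assumptions, not a benchmark: `EchoPortHandler`, `source_ports`, and `run_demo` are hypothetical names, and the server simply echoes the client's source port in the response body.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

class EchoPortHandler(BaseHTTPRequestHandler):
    # HTTP/1.1 lets the server keep the connection open between requests.
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        # Reply with the client's source port so reuse is visible to the caller.
        body = str(self.client_address[1]).encode()
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet

def source_ports(port: int, requests: int, keep_alive: bool) -> set:
    """Send `requests` GETs and return the set of client source ports used."""
    ports = set()
    conn = http.client.HTTPConnection("127.0.0.1", port)
    header = "keep-alive" if keep_alive else "close"
    for _ in range(requests):
        conn.request("GET", "/", headers={"Connection": header})
        ports.add(int(conn.getresponse().read()))
        if not keep_alive:
            conn.close()  # http.client opens a new socket on the next request
    conn.close()
    return ports

def run_demo(requests: int = 5):
    """Return (ports used with keep-alive, ports used with close)."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), EchoPortHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        return (source_ports(port, requests, keep_alive=True),
                source_ports(port, requests, keep_alive=False))
    finally:
        server.shutdown()
        server.server_close()

if __name__ == "__main__":
    reused, fresh = run_demo()
    print("keep-alive source ports:", len(reused))
    print("close source ports:", len(fresh))
```

With keep‑alive, every request travels over one TCP connection, so only one local port is used and only one socket can end up in TIME_WAIT. With Connection: close, each request opens a new connection and leaves another closed one behind.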
Key Takeaways
TIME_WAIT occurs on the side that initiates the close.
TIME_WAIT lasts 2 × MSL (2 minutes × 2 = 4 minutes with the RFC 793 value).
Ports held by TIME_WAIT cannot be reused until the timeout expires.
The total number of local ports is limited to 65535.
Excessive TIME_WAIT can cause “address already in use” errors for new connections.
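On the listening side, one common way to apply the socket‑reuse idea is the standard `SO_REUSEADDR` socket option, which lets a server bind its port even while earlier connections on that port are still in TIME_WAIT. The sketch below assumes a Unix‑like system. `make_listener` is a hypothetical helper name, and the option does not shorten TIME_WAIT itself.

```python
import socket

def make_listener(host: str = "127.0.0.1", port: int = 0) -> socket.socket:
    """Create a listening TCP socket that may bind while old connections
    on the same port are still in TIME_WAIT."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Must be set before bind(); without it, bind() can fail with
    # "address already in use" right after a restart.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))  # port 0 asks the OS for any free port
    sock.listen()
    return sock

if __name__ == "__main__":
    listener = make_listener()
    print("listening on port", listener.getsockname()[1])
    listener.close()
```

This helps a restarted server reclaim its listening port. It does not free ephemeral ports consumed by outbound connections, which is the proxy case described earlier.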
Appendix
Querying TCP Connection States (macOS)
<code># List TIME_WAIT connections
$ netstat -nat | grep TIME_WAIT
# List TIME_WAIT and local address
$ netstat -nat | grep -E "TIME_WAIT|Local Address"
Proto Recv-Q Send-Q Local Address Foreign Address (state)
tcp4 0 0 127.0.0.1.1080 127.0.0.1.59061 TIME_WAIT
# Count connections by state
$ netstat -n | awk '/^tcp/ {++S[$NF]} END {for(a in S) print a, S[a]}'
ESTABLISHED 1154
TIME_WAIT 1645
</code>
Maximum Segment Lifetime (MSL)
MSL is the maximum time a packet can exist in the network before being discarded. RFC 793 defines MSL as 2 minutes, though implementations often use 30 seconds, 1 minute, or 2 minutes.
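The MSL values above translate directly into a ceiling on how fast new connections can be opened before ports run out. The following back‑of‑the‑envelope sketch assumes that all connections go from one source IP to a single destination address and port, that the local side always closes first, and that all 65535 ports are usable. Real systems reserve a smaller ephemeral range, so actual limits are lower. The function names are illustrative.

```python
def time_wait_seconds(msl_seconds: float) -> float:
    """TIME_WAIT lasts twice the Maximum Segment Lifetime."""
    return 2 * msl_seconds

def max_sustained_rate(usable_ports: int, msl_seconds: float) -> float:
    """Upper bound on new connections per second before every local port
    is parked in TIME_WAIT (single destination, local side closes first)."""
    return usable_ports / time_wait_seconds(msl_seconds)

if __name__ == "__main__":
    # MSL values mentioned in the text: 30 seconds, 1 minute, 2 minutes.
    for msl in (30, 60, 120):
        rate = max_sustained_rate(65535, msl)
        print(f"MSL={msl}s TIME_WAIT={time_wait_seconds(msl):.0f}s "
              f"max rate={rate:.0f} conn/s")
```

With the RFC 793 value of 2 minutes, the bound works out to roughly 273 new connections per second, which a busy proxy issuing short‑lived connections can exceed.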
TCP Handshakes
A three‑way handshake establishes a connection; a four‑way handshake terminates it, leaving the side that initiated the close in TIME_WAIT.
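The four‑way termination can be written out as state transitions for each side. The sketch below follows the usual RFC 793 close sequence and deliberately omits the simultaneous‑close path (the CLOSING state). The table and function names are illustrative.

```python
# (state, event, next_state) triples for each side of a normal TCP close.
ACTIVE_CLOSER = [
    ("ESTABLISHED", "close(): send FIN", "FIN_WAIT_1"),
    ("FIN_WAIT_1", "receive ACK of FIN", "FIN_WAIT_2"),
    ("FIN_WAIT_2", "receive peer FIN, send ACK", "TIME_WAIT"),
    ("TIME_WAIT", "2 x MSL timer expires", "CLOSED"),
]
PASSIVE_CLOSER = [
    ("ESTABLISHED", "receive FIN, send ACK", "CLOSE_WAIT"),
    ("CLOSE_WAIT", "close(): send FIN", "LAST_ACK"),
    ("LAST_ACK", "receive ACK of FIN", "CLOSED"),
]

def walk(transitions, start="ESTABLISHED"):
    """Follow the transitions in order and return the list of states visited."""
    state, path = start, [start]
    for src, _event, dst in transitions:
        if src != state:
            raise ValueError(f"transition from {src} does not apply in {state}")
        state = dst
        path.append(state)
    return path

if __name__ == "__main__":
    print("active closer: ", " -> ".join(walk(ACTIVE_CLOSER)))
    print("passive closer:", " -> ".join(walk(PASSIVE_CLOSER)))
```

Only the active closer passes through TIME_WAIT; the passive side goes from LAST_ACK straight to CLOSED, which is why the choice of which side closes first determines where TIME_WAIT sockets pile up.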
Efficient Ops
This public account is maintained by Xiaotianguo and friends, regularly publishing widely-read original technical articles. We focus on operations transformation and accompany you throughout your operations career, growing together happily.