Why Do HTTP Keep-Alive Connections Trigger ECONNRESET and How to Fix Them
This article examines why occasional ECONNRESET errors occur in HTTP RPC keep‑alive connections, explains the underlying TCP and OS mechanisms, and compares several practical mitigation strategies—including retries, connection pre‑discard, idempotent handling, and protocol upgrades such as HTTP/2 and gRPC—to help developers choose the most reliable solution.
Many services using HTTP RPC encounter occasional ECONNRESET errors caused by keep‑alive connections being closed by the server just as the client sends a new request.
Cause Analysis
Servers often enable keep‑alive to reuse sockets, but the idle timeout may close the socket without a graceful shutdown. When the client’s next request uses the now‑closed socket, the server’s TCP stack returns a reset packet, resulting in ECONNRESET.
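<code>const http = require("http");

// Client agent that reuses sockets across requests.
const agent = new http.Agent({ keepAlive: true });

// Server: Node's default keepAliveTimeout closes idle sockets after 5 s.
http.createServer((req, res) => {
  res.write("foo\n");
  res.end();
}).listen(3000);

// Client: one request every 5 s, racing the server's idle timeout.
setInterval(() => {
  http.get("http://localhost:3000", { agent }, (res) => {
    res.on("data", () => {}); // drain the body so the socket returns to the pool
  }).on("error", (err) => console.error(err.code)); // occasionally ECONNRESET
}, 5000);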
</code>The client may not receive the socket‑close event before the next request is issued, so it sends data on a closed socket and receives ECONNRESET.
Error Handling
Simply retrying the request is not always sufficient; the retry should only occur when the request reused a previously established socket:
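<code>// Retry on a fresh socket if a reused socket was reset.
if (self.req._reusedSocket && error.code === 'ECONNRESET') {
  // Swap in an agent whose addRequest never reuses a pooled socket.
  self.agent = { addRequest: ForeverAgent.prototype.addRequestNoreuse.bind(self.agent) };
  self.start();
  self.req.end();
  return;
}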
</code>Chromium implements a similar check to avoid infinite resend loops:
<code>bool HttpNetworkTransaction::ShouldResendRequest() const {
  // Resend only if the connection was reused and no response headers arrived.
  bool connection_is_proven = stream_->IsConnectionReused();
  bool has_received_headers = GetResponseHeaders() != nullptr;
  if (connection_is_proven && !has_received_headers)
    return true;
  return false;
}
</code>Linux generates ECONNRESET when a TCP RST is received in most states; other states produce different errors such as ECONNREFUSED.
<code>void tcp_reset(struct sock *sk) {
  switch (sk->sk_state) {
  case TCP_SYN_SENT:   sk->sk_err = ECONNREFUSED; break; /* RST during connect */
  case TCP_CLOSE_WAIT: sk->sk_err = EPIPE; break;        /* peer already sent FIN */
  case TCP_CLOSE:      return;                           /* nothing to report */
  default:             sk->sk_err = ECONNRESET;
  }
}
</code>
Retry Strategies
The Node.js request library retries only when the socket was reused. Go’s HTTP client retries idempotent requests and treats a request that carries an X-Idempotency-Key header as explicitly idempotent.
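The reused-socket rule can be sketched in plain Node.js as follows. This is a minimal illustration, not the request library's code: it assumes a Node.js version in which `http.ClientRequest` exposes the `reusedSocket` property, and `shouldRetry` and `getWithRetry` are hypothetical helper names introduced here.

```javascript
const http = require("http");

// Hypothetical helper: retry only when the request went out on a reused
// keep-alive socket and failed with ECONNRESET. A reset on a fresh socket
// is more likely a genuine server-side failure.
function shouldRetry(req, err) {
  return req.reusedSocket === true && err.code === "ECONNRESET";
}

// Hypothetical wrapper around http.get with a bounded number of retries.
// Intended for idempotent requests only (GET here).
function getWithRetry(url, options, retriesLeft, callback) {
  let responded = false; // never retry or report twice once a response started
  const req = http.get(url, options, (res) => {
    responded = true;
    callback(null, res);
  });
  req.on("error", (err) => {
    if (responded) return;
    if (retriesLeft > 0 && shouldRetry(req, err)) {
      getWithRetry(url, options, retriesLeft - 1, callback);
    } else {
      callback(err);
    }
  });
}
```

Bounding the retries guards against a resend loop if the server keeps resetting connections.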
Proactive Connection Discard
Frameworks like EggJS suggest discarding a keep‑alive socket slightly before the server’s timeout, based on a response header that conveys the timeout value.
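As a rough sketch of the idea, the client can read the server's advertised timeout and subtract a safety margin. The example assumes the server sends a header of the form `Keep-Alive: timeout=5`; `parseKeepAliveTimeoutMs`, `clientIdleLimitMs`, and the 1 s default margin are hypothetical choices for illustration, not EggJS APIs.

```javascript
// Hypothetical helper: extract the timeout (in ms) from a Keep-Alive
// response header value such as "timeout=5, max=1000".
// Returns null when the header is missing or carries no timeout.
function parseKeepAliveTimeoutMs(headerValue) {
  if (typeof headerValue !== "string") return null;
  const match = /(?:^|[,\s])timeout\s*=\s*(\d+)/i.exec(headerValue);
  return match ? Number(match[1]) * 1000 : null;
}

// Hypothetical helper: how long the client may leave a socket idle before
// discarding it, staying marginMs ahead of the server's own timeout.
function clientIdleLimitMs(headerValue, marginMs = 1000) {
  const serverMs = parseKeepAliveTimeoutMs(headerValue);
  if (serverMs === null) return null;
  return Math.max(0, serverMs - marginMs);
}
```

The resulting value would then be fed into whatever idle-socket timeout the client's agent exposes, so that the client closes the socket first and never writes to one the server is about to drop.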
Half‑Close Considerations
Node.js destroys idle sockets instead of half‑closing them; replacing destroy() with end() still leads to a “socket hang up” error because a half‑closed server cannot send data.
Protocol Improvements
gRPC, built on HTTP/2, uses periodic PING frames to keep connections alive, avoiding idle‑timeout resets.
HTTP/2 introduces mechanisms such as GOAWAY and REFUSED_STREAM that allow safe retry of streams that were not processed.
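The retry decision these two signals enable can be sketched as a small predicate. This is an illustration under stated assumptions rather than any client's actual implementation: `isSafeToRetry` and its parameters are hypothetical names, and the caller is assumed to have already extracted the last-stream-id from a GOAWAY frame or the error code from an RST_STREAM frame.

```javascript
// HTTP/2 error code REFUSED_STREAM, as defined by the HTTP/2 specification.
const REFUSED_STREAM = 0x7;

// Hypothetical helper: decide whether a failed stream was definitely not
// processed by the server and can therefore be retried safely.
//   streamId     - id of the stream the client opened
//   goawayLastId - last-stream-id from a received GOAWAY frame, or null
//   rstErrorCode - error code from a received RST_STREAM frame, or null
function isSafeToRetry(streamId, goawayLastId, rstErrorCode) {
  // The server explicitly refused the stream before doing any work.
  if (rstErrorCode === REFUSED_STREAM) return true;
  // Streams above the GOAWAY last-stream-id were never processed.
  if (goawayLastId !== null && streamId > goawayLastId) return true;
  return false;
}
```

Unlike the HTTP/1.1 case, the client does not have to guess whether the request reached the application: both signals state explicitly that it did not.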
Conclusion
For HTTP/1.1 RPC, three practical mitigations exist: retry on reused sockets, proactive socket discard, and moving to more robust protocols like HTTP/2 or gRPC. Choosing the right approach depends on the service’s tolerance for duplicate or lost requests and the feasibility of protocol migration.
Jike Tech Team
Article sharing by the Jike Tech Team