fix: graceful TCP close to stop RST on early error responses (#185) #389
Merged
SLEdge frequently answers a request before it has finished reading it: the short-circuit error responses (400/404/429/500/503) in the listener thread write the response and close the socket while the client may still be sending its request body.

tcp_session_close() did a bare close(). On Linux, close() with unread data still in the kernel receive buffer discards that data and emits a TCP RST instead of a graceful FIN. The client's HTTP stack reports this as "connection reset by peer" or, on a pooled keepalive connection, as the Go net/http warning "Unsolicited response received on idle HTTP channel" that issue #185 describes.

Close gracefully instead: half-close the write side so the client receives our FIN (and, holding a complete response, stops sending), then drain any already-buffered inbound data so close() no longer sees unread data. The drain is non-blocking and bounded (32 x 8 KiB) so the single listener thread can never block or spin on a slow client.

Verified in the Docker dev environment with hey: POSTing a 100 KB body to a rejected (404) route went from ~159 "connection reset by peer" errors per 2400 requests to 0, with no change to normal 200-response throughput (~15k rps) or the bodyless GET -> 429/503 path.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
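The kernel behavior this commit message relies on can be observed outside SLEdge with a small loopback experiment. The sketch below is not SLEdge code: the function name `client_errno_after_server_close` and the 1 KiB body size are assumptions made for illustration, and the "server" sends no response at all to keep the example short. It closes the server side of a TCP connection either with the client's data still unread or after reading it, and reports what the client's next `recv()` sees.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical demo helper (not part of SLEdge). Opens a loopback TCP
 * connection, has the client send a 1 KiB "body", and closes the server side
 * either without reading it (read_first == 0) or after reading it
 * (read_first != 0). Returns the errno of the client's next recv(), 0 on a
 * clean EOF, or -1 if the sockets could not be set up.
 */
int
client_errno_after_server_close(int read_first)
{
    char               buf[1024];
    struct sockaddr_in addr;
    socklen_t          len    = sizeof addr;
    ssize_t            n;
    int                result = -1;
    int                lfd = -1, cfd = -1, sfd = -1;

    memset(&addr, 0, sizeof addr);
    addr.sin_family      = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port        = 0; /* let the kernel pick a free port */

    lfd = socket(AF_INET, SOCK_STREAM, 0);
    cfd = socket(AF_INET, SOCK_STREAM, 0);
    if (lfd < 0 || cfd < 0) goto done;
    if (bind(lfd, (struct sockaddr *)&addr, sizeof addr) < 0) goto done;
    if (listen(lfd, 1) < 0) goto done;
    if (getsockname(lfd, (struct sockaddr *)&addr, &len) < 0) goto done;
    if (connect(cfd, (struct sockaddr *)&addr, sizeof addr) < 0) goto done;
    sfd = accept(lfd, NULL, NULL);
    if (sfd < 0) goto done;

    memset(buf, 'x', sizeof buf);
    if (send(cfd, buf, sizeof buf, 0) != (ssize_t)sizeof buf) goto done;

    if (read_first) {
        /* Consume the whole body so the receive buffer is empty at close(). */
        if (recv(sfd, buf, sizeof buf, MSG_WAITALL) != (ssize_t)sizeof buf) goto done;
    } else {
        /* Wait until data is queued, but leave it unread (MSG_PEEK). */
        if (recv(sfd, buf, 1, MSG_PEEK) != 1) goto done;
    }
    close(sfd);
    sfd = -1;

    /* Blocks until the server's FIN or RST reaches the client. */
    n      = recv(cfd, buf, sizeof buf, 0);
    result = (n < 0) ? errno : 0;

done:
    if (sfd >= 0) close(sfd);
    if (cfd >= 0) close(cfd);
    if (lfd >= 0) close(lfd);
    return result;
}
```

On Linux, `client_errno_after_server_close(0)` is expected to return `ECONNRESET`, matching the "connection reset by peer" errors described above, while `client_errno_after_server_close(1)` is expected to return 0 because the server's receive buffer was empty at `close()`. Other kernels may behave differently.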
138add3 to
9e66463
emil916
reviewed
Jun 20, 2026
emil916
left a comment
Contributor
I've never run into this issue. The server doesn't send any of the mentioned error responses (400/404/429/500/503), unless the request is fully received, no? I don't see how this is ever useful.
bushidocodes
added a commit
that referenced
this pull request
Jun 20, 2026
Self-contained script that reproduces the "connection reset by peer" / "Unsolicited response received on idle HTTP channel" failure behind #185: launches sledgert with a single valid route, then POSTs a sizeable body to a non-existent route so the server emits a 404 (matched at the request line, in on_client_request_receiving) and close()s while the body is still in flight, producing a RST. Reports the reset count and prints BUG REPRODUCED when nonzero. Helps reproduce the concrete failure case requested in PR #389 review. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Self-contained script that reproduces the "connection reset by peer" / "Unsolicited response received on idle HTTP channel" failure behind #185: launches sledgert with a single valid route, then POSTs a sizeable body to a non-existent route so the server emits a 404 (matched at the request line, in on_client_request_receiving) and close()s while the body is still in flight, producing a RST. Reports the reset count and prints BUG REPRODUCED when nonzero. Run on an unpatched checkout to see the bug; on this branch the graceful-close fix drops the reset count to 0. Helps reviewers reproduce the concrete failure case requested in PR #389 review. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Contributor
I see. I remember now, I used to get these 503 errors back in 2022 when using hey (now I am pretty much using my own loadtest).
Closes #185
Summary
Fixes the root cause behind #185: `hey` logging `Unsolicited response received on idle HTTP channel starting with "HTTP/1.1 503 ..."`.

SLEdge often answers a request before it has finished reading it: the short-circuit error responses (400/404/429/500/503) in the listener thread write the response and `close()` the socket while the client may still be sending its request body. `tcp_session_close()` did a bare `close()`, and on Linux a `close()` with unread data in the kernel receive buffer discards it and emits a TCP RST instead of a graceful FIN. The client surfaces that RST as `connection reset by peer` or, on a pooled keepalive connection, as the exact `Unsolicited response received on idle HTTP channel` warning in the issue. This matches the maintainers' own hypothesis in the issue thread ("we write a response before reading the entire HTTP request").

Change
`tcp_session_close()` now closes gracefully:
- `shutdown(SHUT_WR)`: send our FIN so the client (now holding a complete response) stops sending.
- Bounded, non-blocking drain (32 x 8 KiB) of already-buffered inbound data: `close()` never sees unread data, while guaranteeing the single listener thread can't block or spin on a slow/malicious client.

One file, +41 lines, no change to the success path.
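The close sequence described in this section can be sketched roughly as follows. This is an illustration under stated assumptions, not the code merged in this PR: the names `drain_buffered_input`, `graceful_close`, `DRAIN_MAX_READS` and `DRAIN_CHUNK` are hypothetical, and only the 32 x 8 KiB bound and the half-close, drain, close order come from the description.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Bounds taken from the PR description: at most 32 reads of 8 KiB each. */
enum { DRAIN_MAX_READS = 32, DRAIN_CHUNK = 8192 };

/*
 * Hypothetical helper: discard inbound data that is already buffered,
 * without ever blocking. Returns the number of bytes discarded.
 */
size_t
drain_buffered_input(int fd)
{
    char   scratch[DRAIN_CHUNK];
    size_t total = 0;

    for (int i = 0; i < DRAIN_MAX_READS; i++) {
        ssize_t n = recv(fd, scratch, sizeof scratch, MSG_DONTWAIT);
        if (n > 0) {
            total += (size_t)n;
            continue;
        }
        if (n < 0 && errno == EINTR) continue; /* still counted against the bound */
        break; /* 0: peer sent FIN; EAGAIN: nothing buffered; other errors: give up */
    }
    return total;
}

/*
 * Hypothetical stand-in for a graceful close: half-close the write side,
 * drain what is already queued, then close. Returns the result of close().
 */
int
graceful_close(int fd)
{
    (void)shutdown(fd, SHUT_WR);   /* send our FIN; ignore failure (e.g. ENOTCONN) */
    (void)drain_buffered_input(fd); /* bounded, non-blocking */
    return close(fd);
}
```

Because every `recv()` uses `MSG_DONTWAIT` and the loop runs at most `DRAIN_MAX_READS` times, the caller spends a bounded amount of work per connection regardless of how fast the peer keeps sending. Data that arrives after the drain finishes is not covered by this sketch.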
Verification
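Reproduced and verified in the Docker dev environment with `hey`: POSTing a 100 KB body to a rejected (404) route went from ~159 `connection reset by peer` errors per 2400 requests to 0, with no change to normal 200-response throughput (~15k rps) or the bodyless GET -> 429/503 path.

Note: the originally reported bodyless-GET scenario is already clean on current `master` (admissions/scheduler 429s now fire after the full request is read). The reproducible residual was a request with a body hitting an early-error route, which this fixes.

Known limitation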
A client that keeps streaming a large body to a rejected route may still observe its own write abort as `broken pipe` (EPIPE). Fully suppressing that would require lingering on the request in the event loop (a draining state in the listener), which is disproportionate for this log-level issue, so it is left out of scope and documented in the code comment.

🤖 Generated with Claude Code