Summary
win32PipeListener.Close in v0.6.2 can park indefinitely at pipe.go:578 (<-l.doneCh) when the listenerRoutine is stuck inside makeConnectedServerPipe waiting on an overlapped ConnectNamedPipe whose completion never arrives. Distinct from #281 — in our trace the closeCh <- 1 send succeeded; the wait on doneCh is what hangs.
Environment
- github.com/Microsoft/go-winio v0.6.2
- Go 1.23, GOOS=windows GOARCH=amd64
- Windows 11 Pro 26200
- Listener configured with FirstPipeInstance: true + a custom SecurityDescriptor
Goroutine dump (excerpt)
goroutine 594 [chan receive]:
github.com/Microsoft/go-winio.(*win32PipeListener).Close(...)
.../go-winio@v0.6.2/pipe.go:578
Caller is our windowsListener.Close, which delegates to l.ln.Close() (the win32PipeListener). Accept() had been called and was running in a background loop at the time Close was invoked.
What we believe is happening
Walking the v0.6.2 source:
- Close() sends on closeCh (pipe.go:577), then waits on <-l.doneCh (pipe.go:578).
- listenerRoutine was inside makeConnectedServerPipe (because Accept() had been called and sent a responseCh on acceptCh).
- Inside makeConnectedServerPipe's select, the case <-l.closeCh: branch fires. It calls p.Close() on the pending pipe (which should CancelIoEx the overlapped ConnectNamedPipe), then does err = <-ch to wait for the connectPipe goroutine to drain.
- <-ch never returns: connectPipe's asyncIO is still parked on the overlapped completion. So makeConnectedServerPipe never returns, listenerRoutine never reaches close(l.doneCh), and Close() waits forever.
So the root question is: why doesn't p.Close() -> CancelIoEx unblock the pending ConnectNamedPipe overlapped IO in this scenario? A plausible trigger in our repro: a prior process (killed without graceful shutdown) had a client end of the canonical pipe path still open; pipe-namespace state interacts with CancelIoEx semantics in a way we haven't fully pinned down.
Reproduction
Not 100% reproducible, but this preamble triggers it reliably on our test box:
- Start a process A that creates a winio.ListenPipe(path, ...) listener with FirstPipeInstance: true and runs Accept() in a goroutine.
- Kill A with taskkill /F (no graceful Close) while a client is mid-dial.
- Start process B, which does the same ListenPipe on the same path. ListenPipe succeeds.
- In B, call listener.Close() shortly after; it hangs at pipe.go:578.
In our case process B is a Go test (go test -run TestStartMCPListener_RejectsSecondStart); the test usually passes when run in isolation but hangs when run after sibling tests that touched the same canonical pipe path.
Workaround (downstream)
We've shipped a bounded select around l.ln.Close() in our wrapper: a goroutine does the inner Close, and the outer select returns after 2s if the inner Close has not finished. The orphaned goroutine stays parked until the process exits. Code: mekinney/recall#414.
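For readers who want the shape of that workaround without opening the downstream PR, here is a minimal sketch. It is not the code from mekinney/recall#414: closeWithTimeout, errCloseTimeout, demoTimeout, okCloser and stuckCloser are hypothetical names, and stuckCloser only simulates a Close that never returns (it does not touch go-winio).

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"time"
)

// errCloseTimeout is a hypothetical sentinel returned when Close does not finish in time.
var errCloseTimeout = errors.New("listener close timed out")

// demoTimeout is kept short for the demo; the workaround described above uses 2s.
const demoTimeout = 50 * time.Millisecond

// closeWithTimeout runs c.Close in its own goroutine and waits at most d for it.
// If the deadline passes, the goroutine is abandoned and stays parked.
func closeWithTimeout(c io.Closer, d time.Duration) error {
	done := make(chan error, 1) // buffered so a late Close result never blocks the goroutine
	go func() { done <- c.Close() }()

	timer := time.NewTimer(d)
	defer timer.Stop()
	select {
	case err := <-done:
		return err
	case <-timer.C:
		return errCloseTimeout
	}
}

// okCloser closes immediately.
type okCloser struct{}

func (okCloser) Close() error { return nil }

// stuckCloser simulates a Close that never returns.
type stuckCloser struct{}

func (stuckCloser) Close() error { select {} }

func main() {
	fmt.Println(closeWithTimeout(okCloser{}, demoTimeout))    // <nil>
	fmt.Println(closeWithTimeout(stuckCloser{}, demoTimeout)) // listener close timed out
}
```

The buffered result channel matters: if the inner Close does eventually return after the deadline, its send completes and the goroutine exits instead of leaking on the send.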
Suggested directions for an upstream fix
We don't have a confirmed root-cause patch, but two directions seem worth considering:
- Bounded wait in makeConnectedServerPipe's closeCh branch. After p.Close(), replace the unconditional err = <-ch with a select that gives up after a short deadline and lets listenerRoutine exit anyway. This may leak a goroutine in the pathological case but unblocks shutdown, which is preferable to an unkillable listener.
- Verify that CancelIoEx is doing what we expect in the connectPipe overlapped path when the pipe has a stale client end. If the cancel is racing against the kernel attaching the client, an explicit DisconnectNamedPipe before close may be needed.
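To make the first direction concrete, here is a toy model of the closeCh/doneCh handshake with a bounded wait in the close branch. It is a sketch under assumptions, not a patch against go-winio: toyListener, routine, shutdown, demo, demoWait and errConnectAbandoned are hypothetical names, and a channel that never delivers stands in for the overlapped ConnectNamedPipe completion.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errConnectAbandoned is a hypothetical error for the give-up path.
var errConnectAbandoned = errors.New("pending connect abandoned after close")

// demoWait is an arbitrary short deadline chosen for the demo.
const demoWait = 20 * time.Millisecond

// toyListener models only the channel handshake; it performs no pipe I/O.
type toyListener struct {
	closeCh chan int
	doneCh  chan int
}

// routine stands in for listenerRoutine with one accept already pending.
// connect stands in for the channel carrying the connect goroutine's result.
func (l *toyListener) routine(connect <-chan error, drainWait time.Duration, result chan<- error) {
	defer close(l.doneCh) // shutdown is blocked until this runs
	select {
	case err := <-connect:
		result <- err
	case <-l.closeCh:
		// The unbounded form would be: err = <-connect.
		// Bounded form: stop waiting once drainWait has elapsed.
		timer := time.NewTimer(drainWait)
		defer timer.Stop()
		select {
		case err := <-connect:
			result <- err
		case <-timer.C:
			result <- errConnectAbandoned
		}
	}
}

// shutdown mirrors the Close handshake: send on closeCh, then wait on doneCh.
func (l *toyListener) shutdown() {
	l.closeCh <- 1
	<-l.doneCh
}

// demo runs the model with a connect that never completes and returns
// what the routine reported once shutdown was able to return.
func demo(drainWait time.Duration) error {
	l := &toyListener{closeCh: make(chan int), doneCh: make(chan int)}
	never := make(chan error)     // simulates a completion that never arrives
	result := make(chan error, 1) // buffered so routine never blocks reporting
	go l.routine(never, drainWait, result)
	l.shutdown()
	return <-result
}

func main() {
	fmt.Println(demo(demoWait)) // pending connect abandoned after close
}
```

In the model, shutdown returns after roughly demoWait even though the connect never completes. With the unbounded receive in the close branch, the same run would block on doneCh indefinitely, which matches the trace above.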
Happy to provide a more isolated Go-only repro if helpful; the current path goes through a fair amount of our IPC scaffolding.
Refs
- #281 (closeCh send when no Accept is in flight; ours hangs on the <-doneCh wait after the send succeeded).