fix: gracefulShutdown awaits async cleanup after server.close drain (v0.0.4)#7
Merged
Merged
Conversation
…v0.0.4)
The pre-existing handler ran cleanup synchronously *before* server.close,
which silently dropped any Promise returned by cleanup. The process then
exited before async resource releases (Redis disconnects, DB pools, file
handles) could complete.
New shape:
1. SIGTERM/SIGINT triggers handler
2. server.close() drains in-flight requests
3. server.closeAllConnections() drains keep-alive sockets (Node >= 18.2.0)
4. await cleanup?.() — caught and logged on rejection, does not abort exit
5. process.exit(0)
Signature widening: cleanup type goes from () => void to
() => void | Promise<void>. This is structurally backward-compatible —
every previous caller's cleanup shape still satisfies the new union.
Cleanup errors are caught with console.error("gracefulShutdown: cleanup
error", err) and do NOT abort process.exit(0). The contract is
"best-effort cleanup, then guaranteed exit" — operators relying on
hung-cleanup-blocks-exit semantics had no such guarantee in the previous
implementation either (because cleanup was synchronous, exceptions would
have crashed the handler before exit).
Tests: 4 new RED → GREEN tests (call order, async await, error logging,
closeAllConnections invocation) + 2 existing tests updated to be async
and mock closeAllConnections. All 27 tests pass; lint, typecheck, and
build all green.
Spec: .claude/superpowers/specs/2026-05-05-d5-builder-context-lifecycle.md
(cross-repo gracefulShutdown semantics fix section, lines 475-520).
Cross-repo coordination:
This release is a publish prerequisite for o3co/auth.provider v0.5.2 —
the standalone template's gracefulShutdown(server, () => handle.dispose())
call was already passing an async cleanup that the previous sync impl
silently dropped; v0.0.4 properly awaits it before process.exit.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
This PR updates the gracefulShutdown utility to run cleanup only after server.close() completes, await async cleanup, and add keep-alive connection draining behavior; it also expands test coverage and documents the behavior change for the upcoming v0.0.4 release.
Changes:
- Run
cleanupinside theserver.close()callback andawaitit (with error logging) before callingprocess.exit(0). - Invoke
server.closeAllConnections()during shutdown to address lingering keep-alive sockets. - Add/adjust Vitest coverage for ordering, async cleanup awaiting, error logging, and
closeAllConnections()usage; add a v0.0.4 changelog entry.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| src/shutdown.mts | Moves cleanup into the server.close() callback, awaits it, logs cleanup errors, and calls closeAllConnections(). |
| src/tests/shutdown.test.mts | Adds new tests for shutdown ordering/awaiting/error logging and updates existing tests to account for the new ordering and closeAllConnections(). |
| CHANGELOG.md | Documents the v0.0.4 behavior change, including awaited cleanup and keep-alive draining. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
…ines)
Round 1 multi-agent-review converged on a Critical issue + 2 Important
items. This fixup addresses C1+C2+C3+C4+C6.
C1 (Critical, Codex P1 + Claude I-2 convergence) — replace
`server.closeAllConnections()` with `server.closeIdleConnections()`.
Per Node.js docs:
closeAllConnections — "Closes all established HTTP(S) connections...
including active connections connected to this server which are
sending a request or waiting for a response."
closeIdleConnections — "Closes all connections... which are not
sending a request or waiting for a response."
The original spec called for `closeAllConnections`, but that destroys
in-flight request connections too — exactly the regression a graceful
shutdown is meant to avoid. `closeIdleConnections` is the correct
keep-alive-reaping primitive.
(Note: Node.js >= 19.0.0 reaps idle keep-alives in `server.close`
itself, so the explicit call is harmless there and necessary on 18.x.)
C2 (Important, Claude I-1) — fix CHANGELOG v0.0.3 release date
2026-04-18 → 2026-04-14 (verified against tag commit df1b2ed).
C3 (Important, Claude I-2 paragraph 1) — rewrite the CHANGELOG
"new shape" narrative so it reflects the actual concurrent execution:
`server.close(...)` is registered with an async callback while
`server.closeIdleConnections()` runs synchronously in parallel.
The previous numbered-list framing implied sequential 1→2→3→4→5 which
mismatched the implementation.
C4 (Minor, Claude M-1) — declare `engines.node >= 18.19.0` in
package.json. Matches the auth.provider consumer floor, fails fast on
pre-18.2 installers where `closeIdleConnections` is undefined.
C6 (Minor, Claude M-3) — rename "calls cleanup function when handler is
invoked" test to "works with a synchronous cleanup function" so it
clearly differentiates from the new "invokes cleanup inside server.close
callback (after in-flight requests drain)" ordering test, and rename
"calls server.closeAllConnections to drain keepalive sockets" to
"calls server.closeIdleConnections to drain idle keep-alive sockets
without aborting in-flight requests" for the same clarity reason.
Deferred (per user-confirmed scope):
- C5 M-2 JSDoc — pre-existing absence; out of scope for this PR
- C7 M-4 SIGINT behavioral test — structural argument suffices (same
handler reference is registered for both signals)
- C8 M-5 cleanup-hang timeout — would be a separate contract decision,
out of scope
Verification: 27/27 tests pass, typecheck/lint/build all green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…sion defense) Round 2 multi-agent-review (Codex 0 / Claude 3 Minor) — apply all 3. M-1 (Claude) — CHANGELOG line 50 reworded from "in parallel with server.close()" to "synchronously after server.close() registers its callback" so the Added section header matches the lifecycle described in the Changed section above (which already gets it right). M-2 (Claude) — add a negative `expect(closeAllSpy).not.toHaveBeenCalled()` assertion to the closeIdleConnections drain test. Defends against a future contributor flipping back to the pre-Round 1 spec premise where closeAllConnections aborts in-flight requests; the test suite now fails loudly instead of passing silently if that regression is reintroduced. M-3 (Claude) — rename test variable closeAllSpy → closeIdleSpy in the 6 sites that mock closeIdleConnections. The old name was a leftover from the closeAllConnections shape and read inconsistently with the production API. Verification: 27/27 tests pass, lint clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Round 3 Copilot review: 10 threads. 7 stale (already addressed by Round 1 fixup 3641ba7 — Copilot's first review was on the pre-fixup commit). 3 new: T3 (Important) — idempotent guard for repeated signal delivery. RED test "is idempotent under repeated signal delivery" added first (asserts 3 handler() calls → cleanup/close/closeIdle/exit each invoked exactly once). Pre-impl: failed with "called 3 times". Post-impl: passes. Implementation: a closure-local `shuttingDown` flag short-circuits the second handler() onward, plus `process.removeListener` for both SIGTERM and SIGINT on first entry. Operators sending multiple SIGTERMs (k8s pre-SIGKILL retry, Ctrl+C-spam) no longer cause duplicated cleanup or multiple process.exit(0). (As a side benefit, removeListener also fixes a pre-existing test-order-pollution flake where the previous "works without cleanup function" test occasionally observed a stale handler on the real process due to listeners not being torn down between tests.) T4 + T10 (Minor, dupe across two Copilot reviews) — rename test helper `flushMicrotasks` → `flushEventLoop`. The helper uses `setImmediate`, which queues macrotasks (event-loop turns), not microtasks. Pure naming clarity. Stale 7 (resolved as already-addressed): - T1, T7: closeAllConnections feature-detect → covered by R1's closeIdleConnections + engines.node - T2, T8: closeAllConnections aborts in-flight → R1 fixed it - T5, T9: CHANGELOG order mismatch → R1 rewrote CHANGELOG narrative - T6: engines.node not declared → R1 added engines.node ≥ 18.19.0 Verification: 28/28 tests pass (27 prior + 1 new RED→GREEN), lint clean, typecheck clean, build clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Merged
6 tasks
y1o1
added a commit
to o3co/auth.provider
that referenced
this pull request
May 8, 2026
… block (#145) - Bump `@o3co/auth.utils` minimum in `templates/standalone/package.json` from `^0.0.3` to `^0.0.4` to pick up the cross-repo `gracefulShutdown` rewrite (cleanup awaited inside `server.close` callback, signature widened to `() => void | Promise<void>`, idempotent under repeated SIGTERM/SIGINT, uses `closeIdleConnections` not `closeAllConnections`). - Convert CHANGELOG `[Unreleased]` → `[0.5.2] — 2026-05-09` so the release tag captures all of F1–F9 + SC residual + this auth.utils bump as one v0.5.2 train. - Round 1 fixup: replaced 37 hard-coded `v0.5.1` references inside the `[0.5.2]` block with `v0.5.2` so the migration narrative is internally consistent (the [0.5.2] block authored entries had inline v0.5.1 markers from when these were intended to ship as v0.5.1). - Round 2 fixup (Copilot): renamed `### Dependencies` to `### Changed (auth.utils dependency bump, v0.5.2)` to match Keep a Changelog standard taxonomy. Cross-repo: `o3co/auth.utils#7` merged at `2a5241cf` and `@o3co/auth.utils@0.0.4` published on npm at 2026-05-08T19:46:34Z via Trusted Publisher OIDC. Why v0.5.2 (not v0.5.1): v0.5.1 was tagged early at commit `04132c69` as the `create-auth-provider` ENOENT hotfix off `v0.5.0`; the F-series module-system redesign + supply-chain residuals + this dep bump all ship as v0.5.2.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
server.close()callback (after in-flight requests drain) and is awaited, instead of the previous synchronous fire-before-close shape that silently dropped any returnedPromise.cleanupparameter type widens from() => voidto() => void | Promise<void>— structurally backward-compatible.console.error("gracefulShutdown: cleanup error", err);process.exit(0)still runs.server.closeAllConnections()invoked alongsideserver.close()to drain keep-alive sockets (Node.js >= 18.2.0).Spec:
auth/.claude/superpowers/specs/2026-05-05-d5-builder-context-lifecycle.md— cross-repogracefulShutdownsemantics fix section.This release (v0.0.4) is a publish prerequisite for
o3co/auth.providerv0.5.2 — thetemplates/standalone/src/app.mts:117callgracefulShutdown(server, () => handle.dispose())was already passing an asyncdispose()that the previous sync impl silently dropped. With this fix, the lifecycle drain inauth.providerv0.5.x is properly awaited before the process exits.TDD evidence
invokes cleanup inside server.close callback (after in-flight requests drain)— verifies call order["close", "cleanup", "exit"]awaits an async cleanup before exiting— verifiesprocess.exitonly fires aftercleanup's Promise resolveslogs cleanup errors via console.error and still exits— verifies error path is caught + loggedcalls server.closeAllConnections to drain keepalive socketsasync+closeAllConnectionsmock + microtask flush) to keep working under the new ordering.Test plan
pnpm test— all 27 tests pass (5 test files)pnpm typecheck— 0 errorspnpm build— cleanpnpm lint— clean (after biome--writeformatting pass)Migration
The signature widening is structurally backward-compatible — every previous
cleanup: () => voidshape still satisfies() => void | Promise<void>. No code change required at call sites unless callers want to take advantage of awaited async cleanup. Behavior change: cleanup now runs after request drain and is awaited; fire-and-forget ordering is gone.🤖 Generated with Claude Code