Fix MCP OAuth servers hanging on tool calls after session expiry #513

Open
rschmukler wants to merge 1 commit into editor-code-assistant:master from rschmukler:rs/mcp-oauth-fix

@rschmukler

Contributor

When an MCP server's OAuth session expired or was revoked server-side (e.g. Linear), tool calls would hang for the full 120s timeout and never recover, instead of detecting the broken session and re-authenticating.

Root cause was in the Streamable HTTP transport: when the server returned a non-200 (401/403/404/5xx) whose body was not already a valid JSON-RPC message, plumcp synthesized a JSON-RPC error with no :id. Because the client correlates responses to pending requests by :id, that error was routed to the notification handler (and dropped), so the in-flight tool call's :on-error never fired and the call blocked until timeout — even though the HTTP status was known immediately.
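To make the failure mode concrete, here is a minimal, language-neutral sketch in Python of a client that correlates responses by id. It is not plumcp's code: `PendingCalls`, `synthesize_http_error`, the `-32000` error code, and the `http-status` key are all hypothetical names chosen for illustration. It only shows why an error without an id never reaches the waiting call, and why stamping the request id fixes that.

```python
from typing import Callable, Dict, List, Optional


class PendingCalls:
    """Toy client-side dispatcher (hypothetical; not plumcp's implementation)."""

    def __init__(self) -> None:
        # Maps a request id to the error callback of the in-flight call.
        self.on_error: Dict[int, Callable[[dict], None]] = {}
        # Messages that matched no pending request (notification path).
        self.dropped: List[dict] = []

    def register(self, request_id: int, on_error: Callable[[dict], None]) -> None:
        self.on_error[request_id] = on_error

    def dispatch(self, message: dict) -> None:
        request_id = message.get("id")
        handler = None
        if request_id is not None:
            handler = self.on_error.pop(request_id, None)
        if handler is None:
            # No id, or no matching call: handled like a notification and dropped.
            self.dropped.append(message)
        else:
            handler(message)


def synthesize_http_error(status: int, request_id: Optional[int] = None) -> dict:
    """Build a JSON-RPC error for a non-200 response with a non-JSON-RPC body."""
    message = {
        "jsonrpc": "2.0",
        # -32000 is a placeholder from the implementation-defined error range.
        "error": {"code": -32000, "message": f"HTTP {status}",
                  "data": {"http-status": status}},
    }
    if request_id is not None:
        message["id"] = request_id  # the fix: stamp the originating request id
    return message


calls = PendingCalls()
received: List[dict] = []
calls.register(7, received.append)

calls.dispatch(synthesize_http_error(401))                # old behavior: dropped
still_pending_after_old = 7 in calls.on_error             # call 7 keeps waiting

calls.dispatch(synthesize_http_error(401, request_id=7))  # new behavior: delivered
```

With the id missing, call 7 stays registered and would only end at its timeout. With the id present, the error callback fires immediately and carries the HTTP status.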

Fixes:

  • Bump plumcp to 0.2.2, which stamps the originating request's :id onto synthesized HTTP errors. The error now reaches the pending call's :on-error promptly with :plumcp.core/http-status, so the existing reinit-worthy-http-status? path flags the server for re-initialization (token refresh, or browser re-auth when the refresh token is also dead) instead of hanging.

  • Proactively refresh a locally-expired OAuth token before issuing a tool call, so a known-expired token skips the doomed request and full reinit entirely.

  • I added an entry to the changelog under the Unreleased section.

  • This is not AI slop.
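
The proactive refresh in the second fix can be sketched roughly as follows. This is an illustrative Python sketch rather than the code in this PR: `Token`, `token_expired`, `call_tool_with_fresh_token`, and the 30-second clock-skew margin are assumptions made for the example, and the real change may check expiry differently.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Token:
    access_token: str
    expires_at: float  # Unix time in seconds (assumed representation)


def token_expired(token: Token, now: float, skew_seconds: float = 30.0) -> bool:
    """Treat the token as expired slightly early to allow for clock skew."""
    return now >= token.expires_at - skew_seconds


def call_tool_with_fresh_token(
    token: Token,
    refresh: Callable[[Token], Token],
    call_tool: Callable[[Token], str],
    now: Optional[float] = None,
) -> str:
    """Refresh a locally expired token first, then issue the tool call once."""
    current = time.time() if now is None else now
    if token_expired(token, current):
        # Skip the request that would certainly fail with a 401.
        token = refresh(token)
    return call_tool(token)


refresh_count = 0


def fake_refresh(old: Token) -> Token:
    """Stand-in for a real refresh-token exchange."""
    global refresh_count
    refresh_count += 1
    return Token(access_token="new", expires_at=old.expires_at + 3600)


def fake_call(token: Token) -> str:
    """Stand-in for a tool call; reports which token was sent."""
    return f"called with {token.access_token}"


expired_result = call_tool_with_fresh_token(
    Token("old", expires_at=1000.0), fake_refresh, fake_call, now=2000.0)
valid_result = call_tool_with_fresh_token(
    Token("old", expires_at=5000.0), fake_refresh, fake_call, now=2000.0)
```

A locally valid token can still be revoked server-side, so this check complements the error path from the first fix and does not replace it.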
