feat: MCP 2026-07-28 stateless-core support #23

Draft
dbernheisel wants to merge 27 commits into dbern/add-test-helpers from dbern/mcp-2026-07-28-stateless-core

@dbernheisel commented May 25, 2026


Initial support for the MCP 2026-07-28 "stateless core" protocol while keeping all earlier protocol versions working unchanged. The same handler source code runs on both protocols: no if protocol_version branches anywhere in user-land, and existing legacy callers don't need to change anything.

The dev's only adoption step for supporting the new protocol is adding :secret_key_base to their router. Phantom.RequestState uses it to encrypt the multi-round-trip requestState blob with Plug.Crypto.

What's new

  • Phantom.RequestState: Plug.Crypto-backed encode/decode for the opaque requestState continuation
  • Phantom.Tool.input_required/1 and input_required/2: result builder for the new inputRequired shape
  • Phantom.Request.with_cache/2: ttlMs / cacheScope cache hints per the RC
  • Phantom.Request.trace_context/1: W3C traceparent / tracestate / baggage extracted from _meta; surfaces on the [:phantom, :dispatch] telemetry span
  • Phantom.Plug.read_transport_headers/1: mcp-protocol-version / mcp-method / mcp-name for L7 routing

Session.elicit/3 — protocol-aware defaults

Phantom.Session.elicit/3 has two modes, picked by opts with protocol-aware defaults:

For earlier MCP clients, the call defaults to await: true, and await: false is ignored.

For newer MCP clients, the call defaults to await: false.

| Call | Legacy | Stateless |
| --- | --- | --- |
| Session.elicit(session, elicitation) | inline blocking (unchanged) | re-entry, state: nil |
| Session.elicit(session, elicitation, await: true) | inline blocking | inline blocking via Task suspension |
| Session.elicit(session, elicitation, state: x) | (not relevant) | re-entry |
  • Inline blocking — returns {:ok, response}. Under legacy, blocks via the open SSE stream. Under stateless, suspends the tool's Task and resumes inline when the follow-up tools/call arrives (possibly on a different node).
  • Re-entry — returns {:input_required, elicit, state, session}. The dispatcher converts to an inputRequired result (stateless) or runs the SSE elicit round-trip + handler re-invocation (legacy). The handler is re-entered with session.state populated.

Existing legacy code that pattern-matches {:ok, response} against Session.elicit(session, elicit) keeps working unchanged because legacy's no-opts default is still inline blocking. Code that wants to support stateless inline either picks the mode explicitly with await: true or migrates to the re-entry pattern with state:.

Other dev-facing changes

  • Phantom.Router accepts :secret_key_base (required for 2026-07-28 support)
  • Under stateless, session.client_info and session.client_capabilities populate from _meta per-request (no initialize handshake exists under the new protocol)
  • Tool and prompt dispatch now always run in a spawned Task. Handler crashes are isolated to the Task — the HTTP/session process keeps serving — and surface via the [:phantom, :dispatch, :exception] telemetry event with {kind, reason, stacktrace, method, params, request, session} metadata

Distribution

Both elicitation patterns preserve "any node can serve any call":

  • Re-entry: state is encrypted into requestState; no cross-node state at all
  • Inline await: the suspended Task lives on the original node; a follow-up on any node looks up the task pid via Phantom.Tracker and Erlang's distributed-pid runtime delivers the message cross-node.

Additive groundwork for the new stateless-core dispatch mode; no
existing behavior is changed yet. Subsequent work wires the
inputRequired flow through the dispatcher.

- Phantom.ProtocolVersion: single source of truth for supported
  versions plus mode/1 (:legacy | :stateless_core | :unsupported)
- Phantom.Request gains a :meta field carrying params._meta
  (protocolVersion, clientInfo, capabilities, traceparent,
  tracestate, baggage, requestState)
- Phantom.Plug.read_transport_headers/1 captures
  mcp-protocol-version / mcp-method / mcp-name onto
  conn.private.phantom.transport
- Phantom.RequestState: Plug.Crypto-backed encode/decode for the
  opaque requestState continuation blob (no Phoenix dep)
- Phantom.Router accepts :secret_key_base, surfaced on __phantom__(:info)
- Phantom.Tool.input_required/1 builder + Tool.response/1
  passthrough for the inputRequired shape

Session.elicit/3 now accepts a `state:` option. When provided, the call
returns `{:input_required, elicit, state, session}` instead of blocking.
Phantom.Router catches that tuple in run_tool/4 and dispatches per
protocol:

- 2026-07-28: builds a Tool.input_required result, encrypts the state
  via Phantom.RequestState, returns it on the same POST. The next call
  decodes the blob onto `session.state` so the handler resumes via a
  pattern match in its function head.

- Legacy: performs the existing SSE elicitation/create round-trip,
  merges the response into params, sets `session.state`, and
  re-invokes the handler. Multi-step elicits recurse naturally.

Without `state:`, Session.elicit/3 keeps its existing blocking
behavior — no change for legacy callers. The dev's handler is
identical across protocols; no `if protocol_version == ...` and no
`throw`/`catch` for control flow.
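
The re-entry flow described above can be sketched without Phantom at all. Everything in this block is a stand-in: `ReentrySketch`, its nested `Session` struct, the `elicit/3` stub, and the toy `dispatch/3` loop are hypothetical names used only to show how a resume clause can pattern-match on `session.state`; the real dispatcher lives in Phantom.Router and its internals may differ.

```elixir
defmodule ReentrySketch do
  defmodule Session do
    # Stand-in for Phantom.Session; only the :state field matters here.
    defstruct state: nil
  end

  # Stand-in for Session.elicit/3 called with the `state:` option.
  def elicit(session, elicit, state: state), do: {:input_required, elicit, state, session}

  # First entry: no state yet, so ask the client for confirmation.
  def delete_file(%{"path" => _path}, %Session{state: nil} = session) do
    elicit(session, %{message: "Really delete?"}, state: %{step: :confirm})
  end

  # Resume clause: the state and the merged answer arrive in the function head.
  def delete_file(%{"path" => path, "confirm" => true}, %Session{state: %{step: :confirm}} = session) do
    {:reply, "deleted #{path}", session}
  end

  def delete_file(_params, %Session{state: %{step: :confirm}} = session) do
    {:reply, "cancelled", session}
  end

  # Toy dispatcher: merges the answer into params, sets session.state,
  # and re-invokes the handler until it replies.
  def dispatch(params, session, answer_fun) do
    case delete_file(params, session) do
      {:input_required, elicit, state, session} ->
        response = answer_fun.(elicit)
        dispatch(Map.merge(params, response), %{session | state: state}, answer_fun)

      {:reply, result, _session} ->
        result
    end
  end
end
```

Because the handler never inspects the protocol version, the same two clauses would serve whichever transport the dispatcher uses to collect the answer.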

Adds Phantom.Elicit.to_input_requests/1, renames the Tool.input_required
option from `:request_state` to `:state` for symmetry with the new
`session.state` field, and adds protocol-agnostic stateless_core_test
coverage.

Phantom.Request.with_cache/2 annotates any JSON-RPC result with
top-level ttlMs and cacheScope per the RC spec. These are advisory
hints for the client and any intermediate CDN/gateway; the server
does not cache. Works on tool, resource, prompt, list, and
completion results alike.
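
As a rough illustration of the annotation step, the sketch below merges the two hints into a result map and drops any that are unset. `CacheHintSketch` and the `:ttl_ms` / `:cache_scope` option names are assumptions made for this example; the actual option names accepted by Phantom.Request.with_cache/2 are not shown in this PR text.

```elixir
defmodule CacheHintSketch do
  # Hypothetical stand-in for Phantom.Request.with_cache/2.
  # Merges advisory cache hints into a JSON-RPC result map.
  def with_cache(result, opts) when is_map(result) do
    hints =
      %{ttlMs: Keyword.get(opts, :ttl_ms), cacheScope: Keyword.get(opts, :cache_scope)}
      # Unset hints are dropped so they never appear as nil in the result.
      |> Enum.reject(fn {_key, value} -> is_nil(value) end)
      |> Map.new()

    Map.merge(result, hints)
  end
end

# Usage: only the hint that was provided ends up on the result.
CacheHintSketch.with_cache(%{content: []}, ttl_ms: 60_000)
```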

Phantom.Request.trace_context/1 — extracts W3C traceparent /
tracestate / baggage from _meta and exposes them as a normalized
map. The [:phantom, :dispatch] telemetry span carries this on its
metadata so OpenTelemetry consumers can continue an upstream trace
through Phantom without extra wiring.
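
A minimal sketch of the extraction, assuming `_meta` arrives as a string-keyed map. `TraceContextSketch` is a hypothetical module; `parse_traceparent/1` is an extra helper added here to show the W3C `version-traceid-parentid-flags` layout and is not claimed to exist in Phantom.

```elixir
defmodule TraceContextSketch do
  @fields [{"traceparent", :traceparent}, {"tracestate", :tracestate}, {"baggage", :baggage}]

  # Keeps only the three W3C fields that are present as strings in _meta.
  def trace_context(meta) when is_map(meta) do
    for {key, name} <- @fields, value = meta[key], is_binary(value), into: %{} do
      {name, value}
    end
  end

  # Splits a traceparent header into its four fixed-width parts.
  def parse_traceparent(
        <<version::binary-size(2), "-", trace_id::binary-size(32), "-",
          parent_id::binary-size(16), "-", flags::binary-size(2)>>
      ) do
    {:ok, %{version: version, trace_id: trace_id, parent_id: parent_id, flags: flags}}
  end

  def parse_traceparent(_other), do: :error
end
```

A telemetry consumer would typically read the resulting map from the span metadata rather than call such a helper directly.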
Two Bandit nodes share only :secret_key_base. Node 1 receives a
tools/call and returns inputRequired with the encrypted requestState
blob; node 2 (no shared memory, no Tracker entry, no session header)
decodes the blob and runs the same tool's resume clause to
completion. This locks in the distributed-by-default claim for the
new stateless core.

Adds a resume_tool to Test.MCP.Router that demonstrates the protocol-
agnostic Session.elicit(..., state: ...) pattern, plus configures
:secret_key_base for the test fixture.

CHANGELOG gets an Unreleased section enumerating the new public
surface (Phantom.ProtocolVersion, Phantom.RequestState, the
:secret_key_base option, Session.elicit/3 with :state,
Tool.input_required/1, Request.with_cache/2,
Request.trace_context/1, Plug.read_transport_headers/1).

README gains an "Eliciting input" section with the protocol-
agnostic Session.elicit/3 pattern and pattern-matched resume
clause. The existing "Persistent Streams" section now flags that
it applies to legacy protocols only, and the supported-methods
list notes that elicitation/create's SSE round-trip is replaced
by inputRequired + encrypted requestState under 2026-07-28.

Phantom.Router, Phantom.Session.elicit/3, Phantom.Elicit, and
Phantom.Plug moduledocs surface the dual-mode behavior and the
new options/return shapes.

Existing Session.elicit/3 reverts to its original inline-blocking
behavior. Under MCP 2026-07-28 (where there is no persistent stream
to block on) the function raises with a clear pointer to the new
helper.

Phantom.Session.request_input/3 is the explicit re-entry helper.
Returns {:input_required, elicit, state, session}; the dispatcher
converts that to a Tool.input_required result under stateless or
performs the SSE round-trip + handler re-invocation under legacy.

Updates README and module docs to show both patterns side-by-side
and explain when to pick each. Tests, fixture router, and CHANGELOG
follow the new naming.

A follow-up Phase B will add Session.elicit(..., await: true) — a
protocol-agnostic inline form backed by Task-based continuation
that survives across HTTP requests under stateless.

Tools/call handlers now always run in a spawned Task, using the
existing {:noreply, session} + Session.respond async pattern.
Lays the foundation for Phase B-2, where Session.elicit(...,
await: true) will use a message protocol between the task and
its current HTTP-handler adopter for protocol-agnostic inline
awaiting.

Behavior changes:

- Tool crashes are isolated to the Task. The HTTP/session process
  no longer terminates; the client receives a JSON-RPC -32603
  error and the session keeps serving. The crash is surfaced via
  the [:phantom, :dispatch, :exception] telemetry event with
  metadata {kind, reason, stacktrace, method, params, request,
  session}.

- Session.respond_error/3 is a new public API to send a JSON-RPC
  error response from an async task. Internally it casts the same
  {:respond, ...} message used by Session.respond/2; the payload
  shape distinguishes success (result key) from error (error key).

- Phantom.Plug's existing legacy elicit closure now serves the
  task via the cross-process pattern. Existing async_elicit_tool-
  style handlers continue to work via GenServer.call to the
  idle session GenServer.
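
To make the "result key vs error key" distinction concrete, here is a small sketch of the two JSON-RPC 2.0 payload shapes. `JsonRpcPayloadSketch` and its function names are hypothetical; only the envelope fields and the -32603 (Internal error) code come from the JSON-RPC specification and the text above.

```elixir
defmodule JsonRpcPayloadSketch do
  # Success responses carry a :result key.
  def success(id, result), do: %{jsonrpc: "2.0", id: id, result: result}

  # Error responses carry an :error key with a code and message.
  def error(id, code, message) do
    %{jsonrpc: "2.0", id: id, error: %{code: code, message: message}}
  end

  # A receiver can tell the two apart by which key is present.
  def kind(%{error: _}), do: :error
  def kind(%{result: _}), do: :success
end

# A crashed tool would surface to the client roughly like this:
JsonRpcPayloadSketch.error(7, -32603, "Internal error")
```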

Tests that previously asserted {:EXIT, pid, exception} for crashed
tools now attach a telemetry handler and assert the
[:phantom, :dispatch, :exception] event. The assert_exception_response
helper is unchanged for callers that still want the old contract,
but the project's own tests no longer rely on it.

The stateless-core unit tests use a small assert_responded/1
helper that pattern-matches on the {:"$gen_cast", {:respond, ...}}
message the Task sends to session.pid (the test process).

Under MCP 2026-07-28 (stateless core), Session.elicit/3 with `await:
true` suspends the tool's Task and returns an inputRequired result
to the client. The follow-up tools/call request — possibly on a
different node behind a plain round-robin LB — decodes an opaque
ref_id from the encrypted requestState, looks up the suspended Task
via Phantom.Tracker (works cross-node via Erlang's distributed pids),
forwards the client's response, and becomes the Task's new adopter.
The Task receives the response inline and execution continues; if it
calls await again the cycle repeats with the new adopter.

Under legacy protocols (and stdio), `await: true` is a no-op — the
existing inline blocking path already works because the SSE stream
holds open. Same source code under both protocols.

Architecture:
- Task uses :phantom_adopter and :phantom_tool_request_id in process
  dictionary as its mutable per-process state (current HTTP-handler
  to send next event to, current request id to respond on).
- Session.elicit(..., await: true) on stateless sends
  {:phantom_await_elicit, ref_id, elicit, task_pid, request_id} to
  the adopter, blocks for {:phantom_elicit_response, ref_id,
  response, new_adopter, new_request_id}.
- Phantom.Session.handle_info catches :phantom_await_elicit,
  encrypts {:__phantom_await__, ref_id} as the requestState, registers
  the task in Phantom.Tracker, and responds with inputRequired.
- Phantom.Router.maybe_decode_state detects the await tagged tuple
  and routes through adopt_pending_task which sends the response and
  hands the task off to the new session GenServer.

finalize_tool_result now reads adopter and request_id from process
dictionary on every send so the Task always replies to its current
HTTP handler, not the one that spawned it.
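
The adopter hand-off can be simulated with plain processes. This is only a sketch under stated assumptions: `AwaitProtocolSketch` is hypothetical, `make_ref/0` stands in for the opaque ref_id, there is no encryption or Phantom.Tracker lookup, and the final `{:respond, request_id, answer}` message is a simplified placeholder for the real Session.respond cast. The message tuples and process-dictionary keys mirror the ones named above.

```elixir
defmodule AwaitProtocolSketch do
  # Runs inside the tool's process: asks the current adopter to emit
  # inputRequired, then blocks until some adopter delivers the answer.
  def await_elicit(elicit) do
    ref_id = make_ref()
    adopter = Process.get(:phantom_adopter)
    request_id = Process.get(:phantom_tool_request_id)
    send(adopter, {:phantom_await_elicit, ref_id, elicit, self(), request_id})

    receive do
      {:phantom_elicit_response, ^ref_id, response, new_adopter, new_request_id} ->
        # Rebind so the final result goes to whoever resumed the task.
        Process.put(:phantom_adopter, new_adopter)
        Process.put(:phantom_tool_request_id, new_request_id)
        {:ok, response}
    after
      5_000 -> :timeout
    end
  end

  def run_tool(adopter, request_id) do
    spawn(fn ->
      Process.put(:phantom_adopter, adopter)
      Process.put(:phantom_tool_request_id, request_id)
      {:ok, %{"confirm" => answer}} = await_elicit(%{message: "Delete?"})
      # Reads the process dictionary again, so the reply uses the new request id.
      send(Process.get(:phantom_adopter), {:respond, Process.get(:phantom_tool_request_id), answer})
    end)
  end

  # The calling process plays both the first adopter (request 1)
  # and the follow-up adopter (request 2).
  def demo do
    run_tool(self(), 1)

    receive do
      {:phantom_await_elicit, ref_id, _elicit, task_pid, 1} ->
        send(task_pid, {:phantom_elicit_response, ref_id, %{"confirm" => true}, self(), 2})
    after
      1_000 -> raise "tool never asked for input"
    end

    receive do
      {:respond, 2, answer} -> {:respond, 2, answer}
    after
      1_000 -> :timeout
    end
  end
end
```

The point of the demo is the last step: the tool replies on request id 2, not on the id it was spawned with.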

README and CHANGELOG cover the three elicitation patterns: inline
(elicit + await: true), explicit re-entry (request_input), and
direct (Tool.input_required).

Phantom.Session.elicit/3 now has two call patterns, both working on
both protocols:

  * `Session.elicit(session, elicit, await: true)` — inline blocking.
    Returns {:ok, response} | :not_supported | :timeout | :error.
    Under stateless, suspends the tool's Task and resumes inline when
    the follow-up tools/call arrives.

  * `Session.elicit(session, elicit, state: x)` (or no opts) — re-entry.
    Returns {:input_required, elicit, state, session}; the dispatcher
    converts to an inputRequired result (stateless) or runs the SSE
    elicit round-trip + handler re-invocation (legacy). Handler is
    re-entered with session.state populated.

Phantom.Session.request_input/3 is removed. Its behavior is now the
no-:await branch of elicit/3.

Breaking change: Phantom.Session.elicit/3 without :await previously
blocked inline under legacy. Existing callers expecting {:ok, response}
must add `await: true`. The test fixture's elicit_tool, async_elicit_tool,
and url_elicit_tool have been migrated; same with the router's internal
legacy elicit path used by the re-entry branch.

README, elicit.ex moduledoc, and CHANGELOG updated.

POST tools/call to node 1 invokes await_tool, which calls
Session.elicit(..., await: true) inline. The Task suspends on node 1;
node 1 returns inputRequired with an encrypted requestState pointing
at the suspended Task via a ref_id registered in Phantom.Tracker.

The follow-up POST to node 2 carries the requestState and the elicit
response data. Node 2's dispatcher decodes the ref_id, retries the
Tracker lookup until CRDT replication delivers the entry (via the
existing await_request_meta/3 helper used by the cross-node
elicitation routing), then sends the response message to the Task pid
— which Erlang's distributed runtime routes back to node 1. The Task
resumes inline, runs to completion, and casts its final result via
Session.respond to node 2's session GenServer, which writes the
result on the new HTTP response.

End-to-end proof that the inline-await pattern preserves the
distributed-by-default property — Phoenix.PubSub stays optional for
the encrypted-state path, and the Tracker-replicated path handles
the live-Task pointer cross-node automatically.

Adds await_tool to Test.MCP.Router exercising the new pattern.

Gap 1: session.client_info / client_capabilities under stateless

Under MCP 2026-07-28 there is no initialize call to populate
session.client_info or session.client_capabilities. The request's
_meta.clientInfo and _meta.capabilities now hydrate these fields
in Phantom.Plug.prepare_for_dispatch on every request, so devs
reading session.client_info["name"] or session.client_capabilities[:elicitation]
see the same shape they would on a legacy session. Under legacy the
session is already populated by inherit_session_meta — _meta on
subsequent requests is empty, so this is a no-op.
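
A simplified sketch of the hydration rule, assuming a string-keyed `_meta` map. `MetaHydrationSketch` is a hypothetical stand-in for the session struct, and it skips any key normalization the real Phantom.Plug.prepare_for_dispatch may perform; it only shows why an empty `_meta` leaves a legacy session untouched.

```elixir
defmodule MetaHydrationSketch do
  # Stand-in for the two session fields discussed above.
  defstruct client_info: nil, client_capabilities: nil

  # Values present in _meta win; absent values keep what the session already has.
  def hydrate(%__MODULE__{} = session, meta) when is_map(meta) do
    %{
      session
      | client_info: meta["clientInfo"] || session.client_info,
        client_capabilities: meta["capabilities"] || session.client_capabilities
    }
  end
end

# Stateless request: fields come from _meta on every call.
MetaHydrationSketch.hydrate(%MetaHydrationSketch{}, %{"clientInfo" => %{"name" => "demo-client"}})
```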

Gap 2: prompts/get didn't support the new elicit patterns

get_prompt ran the handler inline and didn't recognize {:input_required,
elicit, state, session} from Session.elicit. Refactored to spawn a Task
mirroring run_tool's pattern, with finalize_prompt_result handling all
the same shapes: {:reply, _, _}, {:input_required, _, _, _}, {:error,
_, _}, {:noreply, _}, {:elicitation_required, _}, plus the fallback.
maybe_decode_state and adopt_pending_task are now shared between
get_tool and get_prompt — same encrypted state contract under
2026-07-28, same legacy SSE round-trip under earlier protocols, same
re-invocation pattern after resume.

resources/read uses a sub-router architecture (router.call(fake_conn))
that's structurally different; that path remains inline-only and is a
documented limitation for future work.

Tests cover both gaps end-to-end: session.client_info populates from
_meta through a real Phantom.Plug pipeline, and prompts/get can elicit
input via session.state re-entry under stateless.

Per the elixir-reviewer feedback on the branch, collapse the
near-duplicate tool/prompt finalization machinery and centralize
the process-dictionary plumbing.

- Phantom.Router.run_handler/5 + finalize_result/5 replace
  run_tool/4, finalize_tool_result/4, run_prompt/4, and
  finalize_prompt_result/4. Kind-specific behavior lives in two
  tiny helpers: telemetry_method/1 and format_response/3. The
  byte-identical legacy-elicit re-invocation block now exists once.
- respond_to_caller/1 and respond_error_to_caller/1 contain all
  reads of :phantom_adopter / :phantom_tool_request_id. Documented
  why the process dictionary stays: stateless_await/3 must update
  these mid-Task across cross-node resume, so threading purely
  through args would force a Session.elicit return-type change.
- Phantom.Tool.input_required/2 (new arity) is the canonical
  builder for the inputRequired result map. Three former call
  sites (router stateless tool, router stateless prompt, Session
  handle_info :phantom_await_elicit) all use it now.
- Phantom.RequestState.encode/2 guards `byte_size >= 64` so the
  doc claim matches behavior; Phoenix-equivalent enforcement.

374 tests, 0 failures (incl. 8 clustered). 117 lines net removed
from router.ex.

Restore backward compatibility for existing callers of `Session.elicit/3`
that pattern-match on `{:ok, response}`. The protocol-aware default is:

  * stateless (2026-07-28) → re-entry (returns the tagged tuple)
  * legacy (≤ 2025-11-25)  → inline blocking (existing behavior)

Explicit `:await` or `:state` opts override the default and force a
specific mode regardless of protocol.

This removes the breaking-change footnote: legacy code that called
`Session.elicit(session, elicit)` (no opts) keeps working unchanged.
Code that needs to support stateless can either add `await: true` for
inline ergonomics or use `state:` for the re-entry pattern; either
works on both protocols.
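
The default-selection rule above fits in one `cond`. The sketch below is an illustration only: `ElicitModeSketch` is hypothetical, and comparing ISO-formatted version strings lexicographically is an assumption made for brevity, not necessarily how Phantom.Session.stateless?/1 decides.

```elixir
defmodule ElicitModeSketch do
  @stateless_since "2026-07-28"

  # ISO dates sort lexicographically, so a string comparison is enough here.
  def stateless?(version) when is_binary(version), do: version >= @stateless_since

  # Explicit opts win; otherwise the protocol version picks the default.
  def mode(version, opts \\ []) do
    cond do
      Keyword.get(opts, :await) == true -> :inline
      Keyword.has_key?(opts, :state) -> :reentry
      stateless?(version) -> :reentry
      true -> :inline
    end
  end
end

# No opts: legacy keeps blocking inline, stateless switches to re-entry.
ElicitModeSketch.mode("2025-11-25")
ElicitModeSketch.mode("2026-07-28")
```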

Cleans up the test fixture by dropping the now-redundant `await: true`
opts that were threaded through elicit_tool, async_elicit_tool, and
url_elicit_tool during the earlier migration.

Fold:

- Delete lib/phantom/protocol_version.ex and its test. The supported
  versions list inlines on Phantom.Router as a module attribute; the
  `mode/1` ternary collapses to a `Phantom.Session.stateless?/1`
  predicate now exposed publicly (was private).
- Two callers updated: Phantom.Router.finalize_result and the
  Session.elicit cond.

Secret-key-base safety:

- Phantom.Router.__after_verify__ now raises if :secret_key_base is
  present but < 64 bytes (matches Plug.Crypto's expectation; a short
  key degrades the security guarantee of the requestState blob).
- IO.warn at compile time if :secret_key_base is missing AND the
  router has tools or prompts — devs see a build-time pointer to
  the missing config before a 2026-07-28 client hits the server and
  gets a runtime error. The warning is silenced when Mix.env() == :test
  so Phantom's own test fixtures (and downstream tests) don't spam.
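
The length rule can be sketched as a plain function. `SecretCheckSketch.validate!/1` is hypothetical and runs at call time; the real check described above happens in Phantom.Router.__after_verify__ at compile time, and the missing-key case there is a warning rather than a return value.

```elixir
defmodule SecretCheckSketch do
  # Returns :ok for a usable secret, :missing when none is configured,
  # and raises when a secret is present but too short.
  def validate!(opts) do
    case Keyword.get(opts, :secret_key_base) do
      nil ->
        :missing

      secret when is_binary(secret) and byte_size(secret) >= 64 ->
        :ok

      _too_short ->
        raise ArgumentError, ":secret_key_base must be a binary of at least 64 bytes"
    end
  end
end

# 48 random bytes encode to a 64-character Base64 string.
secret = Base.encode64(:crypto.strong_rand_bytes(48))
SecretCheckSketch.validate!(secret_key_base: secret)
```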

Upgrade guide added to CHANGELOG covering both the new :secret_key_base
config and the protocol-aware default behavior of Session.elicit/3.

Phantom.Request.trace_context/1 was framed as a public helper, but devs
don't actually call it — Phantom uses it to populate metadata.trace_context
on the [:phantom, :dispatch] telemetry span, and tracer libraries
(OpenTelemetry et al.) attach to that event and read the field. Marked
the function @doc false to reflect its real role.

Added a "Distributed tracing" section to the README with the OpenTelemetry
wiring snippet, and updated the CHANGELOG entry to say "W3C trace context
is automatically surfaced" rather than implying a function devs should call.

Phantom.Plug.read_transport_headers/1 parsed the three MCP 2026-07-28 routing headers
(mcp-protocol-version, mcp-method, mcp-name) and stored them on
conn.private.phantom.transport, but nothing in Phantom ever read
that field — pure plumbing without a consumer.

These headers exist for *upstream* infrastructure (load balancers,
WAFs, API gateways) to route on without inspecting the JSON-RPC body.
That routing happens before the request reaches Phantom; the server
itself doesn't need to do anything with the headers.

If we ever want to surface them in telemetry or cross-check against
the parsed body, that can land as a deliberate feature.

Split into two clear audiences:

* For existing users — what's unchanged (legacy-only deployments need
  nothing) and what to add for modern clients (:secret_key_base plus
  `await: true` on existing Session.elicit calls). Concrete before/after
  snippets.

* Recommendation for new PhantomMCP users — use the re-entry pattern
  (`state:` opt + function-head clauses matching on `%Session{state:
  ...}`) instead of `await: true`. It's stateless on the wire (no
  suspended Task, no Tracker for cross-node), pattern-matches cleanly,
  and reads better for multi-step flows. Worked example with a
  delete_file tool.

Covers the deployment shapes for the two protocol versions:

* Legacy clients (≤ 2025-11-25) — sticky-session by `mcp-session-id`
  header, or run Phantom.Tracker + Phoenix.PubSub for cross-node
  internal routing. Nginx and HAProxy snippets for the LB.

* Stateless clients (2026-07-28) — plain round-robin, no sticky
  session. The encrypted requestState blob carries continuation state
  across requests; any node that shares :secret_key_base can serve any
  request.

L7 routing patterns using the new MCP 2026-07-28 routing headers
(mcp-protocol-version / mcp-method / mcp-name):

* Split legacy and stateless traffic onto separate pools
* Per-method rate limiting
* Per-tool routing to specialised workers

Response caching notes: Phantom emits ttlMs/cacheScope in the JSON-RPC
result body for MCP-aware clients but doesn't translate to HTTP
Cache-Control headers automatically. Includes the public-vs-private
risk table and the warning about marking user-specific responses as
public.

Plus origin-validation and health-check sections.

The three nginx patterns under 'Useful patterns' (split traffic by
protocol version, per-method rate limiting, per-tool routing) were
speculative — not validated against a real deployment. Keep the
header list and the fact that Phantom doesn't read them server-side;
drop the unverified examples.

with_cache/2 builds a small two-key map with possibly-nil values and
rejects nils before merging — same outcome as the put_if_set pipeline
but reuses the existing helper that's already in scope (request.ex
already imports Phantom.Utils).

Phantom.RequestState no longer hides the HKDF salt behind a fixed
'phantom_request_state' constant. Both arguments are required and
passed explicitly:

  Phantom.RequestState.encode(term, secret_key_base, salt)
  Phantom.RequestState.decode(token, secret_key_base, salt, opts)

Phantom.Router accepts :request_state_salt alongside :secret_key_base
and surfaces it on __phantom__(:info). The compile-time validation
now refuses to compile when one is set without the other (both have to
be configured together for stateless support).

Per the configuration story: secret_key_base must be ≥ 64 bytes of
high-entropy material; the salt is a stable string of the dev's choice
that scopes the derived key to requestState use. Rotating the salt
invalidates all in-flight blobs.
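
To show why two nodes sharing only the secret and salt can exchange the blob, here is a self-contained sketch using Erlang's :crypto module. It is not Phantom.RequestState: the real module is backed by Plug.Crypto, whereas this stand-in derives the key with PBKDF2 and encrypts with AES-256-GCM directly. `RequestStateSketch`, the iteration count, and the AAD string are all assumptions for illustration.

```elixir
defmodule RequestStateSketch do
  @aad "request_state_sketch"

  # Serializes and encrypts a term into a URL-safe token.
  def encode(term, secret_key_base, salt) do
    key = derive_key(secret_key_base, salt)
    iv = :crypto.strong_rand_bytes(12)
    plaintext = :erlang.term_to_binary(term)
    {ciphertext, tag} = :crypto.crypto_one_time_aead(:aes_256_gcm, key, iv, plaintext, @aad, true)
    Base.url_encode64(iv <> tag <> ciphertext, padding: false)
  end

  # Returns {:ok, term} or :error if the token is malformed or fails authentication.
  def decode(token, secret_key_base, salt) do
    key = derive_key(secret_key_base, salt)

    with {:ok, <<iv::binary-size(12), tag::binary-size(16), ciphertext::binary>>} <-
           Base.url_decode64(token, padding: false),
         plaintext when is_binary(plaintext) <-
           :crypto.crypto_one_time_aead(:aes_256_gcm, key, iv, ciphertext, @aad, tag, false) do
      # :safe refuses to create new atoms from untrusted input.
      {:ok, :erlang.binary_to_term(plaintext, [:safe])}
    else
      _ -> :error
    end
  end

  # The salt scopes the derived key; a different salt yields a different key.
  defp derive_key(secret_key_base, salt) when byte_size(secret_key_base) >= 64 do
    :crypto.pbkdf2_hmac(:sha256, secret_key_base, salt, 1_000, 32)
  end
end
```

Decoding with a different salt fails authentication, which is the same effect the text describes for salt rotation: blobs issued under the old salt stop decoding.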

Callsites in Phantom.Router.encode_request_state/maybe_decode_state and
Phantom.Session.handle_info(:phantom_await_elicit) now read both values
from __phantom__(:info) and pass them through. CHANGELOG and README
upgrade guides explain both knobs.

Phantom.Session.elicit/3 (no :await) now returns the session struct
with a :pending_elicit annotation. The handler wraps it in the
standard {:noreply, session} reply shape:

    def my_tool(params, session) do
      {:noreply, Session.elicit(session, elicit, state: %{step: :got_x})}
    end

    def my_tool(%{...}, %Session{state: %{step: :got_x}} = session) do
      {:reply, Tool.text("..."), session}
    end

Under the hood Phantom routes through the same Task-suspension
mechanism that powers `await: true`:

* The handler returns {:noreply, %Session{pending_elicit: {elicit, state}}}
* `process_handler_result/5` in the Task calls Session.elicit(..., await: true)
* The Task suspends on the originating node (registered in Phantom.Tracker)
* On resume, the response merges into params, session.state is updated,
  and the handler is re-applied — state accumulates across multiple
  elicit steps.

Trade-off: gives up the "any node can serve any call" property under
stateless. The Task lives on the originating node; cross-node follow-ups
route to it via Phantom.Tracker (Erlang's distributed-pid runtime
delivers the message). The motivation here is that MCP conversation
state is naturally a LiveView-like stateful process, not a stateless
controller — accumulating state on the live session is the dev-facing
mental model we want.

The encrypted requestState blob carries only the task pointer now, not
user state. Phantom.Tool.input_required/2 remains as the lower-level
escape hatch for devs who want to construct an inputRequired result
without Task suspension (returning {:reply, Tool.input_required(...),
session}).

* Drop the dead `is_function(session.elicit) and self() == session.pid`
  fast path in Session.do_elicit/3. Under always-Task-mode the dispatcher
  always spawns, so self() inside the Task never matches session.pid.
  The cross-process GenServer.call path is the only reachable one now.
* Suppress the missing-:secret_key_base warning only when Phantom itself
  is being tested (`Mix.Project.config()[:app] == :phantom_mcp`) rather
  than when any `:test` env is active — downstream consumers running
  their own test suites now see the warning.
* Document the side effect in Session.stateless_await/3: the receive
  rebinds :phantom_adopter and :phantom_tool_request_id in the process
  dictionary on cross-node resume. Tighten its raise message to mention
  prompt handlers too.
* Expand the doc comment on Phantom.Router.process_handler_result/5 to
  explain the stateless-core mechanic — Session.elicit(await: true)
  suspends the Task while the adopter emits the inputRequired response;
  resume happens via Phantom.Tracker lookup on the follow-up call.
* Split the catchall in process_handler_result/5 into an explicit
  `:error` clause and an `other` clause that surfaces unexpected returns
  via inspect/1 for debuggability.
* Add a direct unit test for `{:noreply, Session.elicit(..., state: x)}`
  going through process_handler_result/5 — asserts the dispatcher sends
  :phantom_await_elicit, manually adopts the task via Phantom.Tracker,
  dispatches the follow-up, and verifies session.state and merged params
  reach the re-applied handler.

@dbernheisel force-pushed the dbern/mcp-2026-07-28-stateless-core branch from e62db51 to 60c0208 on June 5, 2026 20:59