feat: MCP 2026-07-28 stateless-core support#23
Draft
dbernheisel wants to merge 27 commits into
Additive groundwork for the new stateless-core dispatch mode; no existing behavior is changed yet. Subsequent work wires the inputRequired flow through the dispatcher.
- Phantom.ProtocolVersion: single source of truth for supported versions plus mode/1 (:legacy | :stateless_core | :unsupported)
- Phantom.Request gains a :meta field carrying params._meta (protocolVersion, clientInfo, capabilities, traceparent, tracestate, baggage, requestState)
- Phantom.Plug.read_transport_headers/1 captures mcp-protocol-version / mcp-method / mcp-name onto conn.private.phantom.transport
- Phantom.RequestState: Plug.Crypto-backed encode/decode for the opaque requestState continuation blob (no Phoenix dep)
- Phantom.Router accepts :secret_key_base, surfaced on __phantom__(:info)
- Phantom.Tool.input_required/1 builder + Tool.response/1 passthrough for the inputRequired shape
Session.elicit/3 now accepts a `state:` option. When provided, the call
returns `{:input_required, elicit, state, session}` instead of blocking.
Phantom.Router catches that tuple in run_tool/4 and dispatches per
protocol:
- 2026-07-28: builds a Tool.input_required result, encrypts the state
via Phantom.RequestState, returns it on the same POST. The next call
decodes the blob onto `session.state` so the handler resumes via a
pattern match in its function head.
- Legacy: performs the existing SSE elicitation/create round-trip,
merges the response into params, sets `session.state`, and
re-invokes the handler. Multi-step elicits recurse naturally.
Without `state:`, Session.elicit/3 keeps its existing blocking
behavior — no change for legacy callers. The dev's handler is
identical across protocols; no `if protocol_version == ...` and no
`throw`/`catch` for control flow.
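A minimal sketch of the re-entry shape described above, using only the standard library. `ReentrySketch`, its nested `Session` struct, and the toy `dispatch/3` are hypothetical stand-ins, not Phantom's modules; the real dispatcher also encrypts the state or performs the SSE round-trip between the two handler invocations.

```elixir
defmodule ReentrySketch do
  # Hypothetical stand-in for the session struct; only :state matters here.
  defmodule Session do
    defstruct state: nil
  end

  # Resume clause: the state restored onto the session selects this head.
  def delete_file(%{"path" => path, "confirm" => true}, %Session{state: %{step: :confirm}} = session) do
    {:reply, "deleted #{path}", session}
  end

  # First call: no state yet, so ask the client to confirm and hand back
  # the state the next invocation should see.
  def delete_file(%{"path" => path}, %Session{state: nil} = session) do
    elicit = %{message: "Really delete #{path}?"}
    {:input_required, elicit, %{step: :confirm}, session}
  end

  # Toy dispatcher: merges the client's answer into params, sets
  # session.state, and re-invokes the handler.
  def dispatch(params, %Session{} = session, answer) do
    case delete_file(params, session) do
      {:input_required, _elicit, state, session} ->
        delete_file(Map.merge(params, answer), %{session | state: state})

      other ->
        other
    end
  end

  def new_session, do: %Session{}
end
```

Calling `ReentrySketch.dispatch(%{"path" => "/tmp/x"}, ReentrySketch.new_session(), %{"confirm" => true})` walks both clauses in order.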
Adds Phantom.Elicit.to_input_requests/1, renames the Tool.input_required
option from `:request_state` to `:state` for symmetry with the new
`session.state` field, and adds protocol-agnostic stateless_core_test
coverage.
Phantom.Request.with_cache/2: annotates any JSON-RPC result with top-level ttlMs and cacheScope per the RC spec. These are advisory hints for the client and any intermediate CDN/gateway; the server does not cache. Works on tool, resource, prompt, list, and completion results alike.

Phantom.Request.trace_context/1: extracts W3C traceparent / tracestate / baggage from _meta and exposes them as a normalized map. The [:phantom, :dispatch] telemetry span carries this on its metadata so OpenTelemetry consumers can continue an upstream trace through Phantom without extra wiring.
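As an illustration of the trace-context extraction, here is a small sketch under stated assumptions: `TraceContextSketch` is a hypothetical module, and the atom-keyed output is only one plausible "normalized map" shape, not necessarily the one Phantom emits.

```elixir
defmodule TraceContextSketch do
  # W3C field names as they would appear in params._meta, mapped to the
  # (assumed) atom keys of the normalized output.
  @fields %{"traceparent" => :traceparent, "tracestate" => :tracestate, "baggage" => :baggage}

  # Keeps only the W3C fields that are present and are strings.
  def trace_context(meta) when is_map(meta) do
    meta
    |> Map.take(Map.keys(@fields))
    |> Enum.filter(fn {_key, value} -> is_binary(value) end)
    |> Map.new(fn {key, value} -> {Map.fetch!(@fields, key), value} end)
  end

  # Anything that is not a map (for example a missing _meta) yields no context.
  def trace_context(_other), do: %{}
end
```

Unrelated `_meta` keys such as `clientInfo` are ignored, so the result can be attached to telemetry metadata as-is.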
Two Bandit nodes share only :secret_key_base. Node 1 receives a tools/call and returns inputRequired with the encrypted requestState blob; node 2 (no shared memory, no Tracker entry, no session header) decodes the blob and runs the same tool's resume clause to completion. This locks in the distributed-by-default claim for the new stateless core.

Adds a resume_tool to Test.MCP.Router that demonstrates the protocol-agnostic Session.elicit(..., state: ...) pattern, plus configures :secret_key_base for the test fixture.
CHANGELOG gets an Unreleased section enumerating the new public surface (Phantom.ProtocolVersion, Phantom.RequestState, the :secret_key_base option, Session.elicit/3 with :state, Tool.input_required/1, Request.with_cache/2, Request.trace_context/1, Plug.read_transport_headers/1).

README gains an "Eliciting input" section with the protocol-agnostic Session.elicit/3 pattern and pattern-matched resume clause. The existing "Persistent Streams" section now flags that it applies to legacy protocols only, and the supported-methods list notes that elicitation/create's SSE round-trip is replaced by inputRequired + encrypted requestState under 2026-07-28.

Phantom.Router, Phantom.Session.elicit/3, Phantom.Elicit, and Phantom.Plug moduledocs surface the dual-mode behavior and the new options/return shapes.
Existing Session.elicit/3 reverts to its original inline-blocking
behavior. Under MCP 2026-07-28 (where there is no persistent stream
to block on) the function raises with a clear pointer to the new
helper.
Phantom.Session.request_input/3 is the explicit re-entry helper.
Returns {:input_required, elicit, state, session}; the dispatcher
converts that to a Tool.input_required result under stateless or
performs the SSE round-trip + handler re-invocation under legacy.
Updates README and module docs to show both patterns side-by-side
and explain when to pick each. Tests, fixture router, and CHANGELOG
follow the new naming.
A follow-up Phase B will add Session.elicit(..., await: true) — a
protocol-agnostic inline form backed by Task-based continuation
that survives across HTTP requests under stateless.
Tools/call handlers now always run in a spawned Task, using the
existing {:noreply, session} + Session.respond async pattern.
Lays the foundation for Phase B-2, where Session.elicit(...,
await: true) will use a message protocol between the task and
its current HTTP-handler adopter for protocol-agnostic inline
awaiting.
Behavior changes:
- Tool crashes are isolated to the Task. The HTTP/session process
no longer terminates; the client receives a JSON-RPC -32603
error and the session keeps serving. The crash is surfaced via
the [:phantom, :dispatch, :exception] telemetry event with
metadata {kind, reason, stacktrace, method, params, request,
session}.
- Session.respond_error/3 is a new public API to send a JSON-RPC
error response from an async task. Internally it casts the same
{:respond, ...} message used by Session.respond/2; the payload
shape distinguishes success (result key) from error (error key).
- Phantom.Plug's existing legacy elicit closure now serves the
task via the cross-process pattern. Existing async_elicit_tool-
style handlers continue to work via GenServer.call to the
idle session GenServer.
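The crash-isolation behavior can be sketched roughly as follows. This is not Phantom's dispatcher: `CrashIsolationSketch` is a hypothetical module, the `{:respond, payload}` message is a plain `send/2` rather than a GenServer cast, and telemetry emission is left out.

```elixir
defmodule CrashIsolationSketch do
  # Runs `handler` in an unlinked Task so a crash cannot take down the caller.
  # The caller always receives a JSON-RPC-shaped payload: a `result` key on
  # success, an `error` key with code -32603 (internal error) on a crash.
  def run(handler, request_id, caller \\ self()) when is_function(handler, 0) do
    Task.start(fn ->
      payload =
        try do
          %{jsonrpc: "2.0", id: request_id, result: handler.()}
        catch
          kind, reason ->
            # kind/reason/stacktrace are what an exception telemetry event
            # would carry; this sketch just drops them.
            _metadata = %{kind: kind, reason: reason, stacktrace: __STACKTRACE__}
            %{jsonrpc: "2.0", id: request_id, error: %{code: -32603, message: "Internal error"}}
        end

      send(caller, {:respond, payload})
    end)
  end
end
```

The caller distinguishes the two outcomes by which key is present in the payload, mirroring the result/error split described above.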
Tests previously asserting {:EXIT, pid, exception} for crashed
tools updated to attach a telemetry handler and assert the
[:phantom, :dispatch, :exception] event. assert_exception_response
helper is unchanged for callers that still want the old contract,
but the project's own tests no longer rely on it.
The stateless-core unit tests use a small assert_responded/1
helper that pattern-matches on the {:"$gen_cast", {:respond, ...}}
message the Task sends to session.pid (the test process).
Under MCP 2026-07-28 (stateless core), Session.elicit/3 with `await:
true` suspends the tool's Task and returns an inputRequired result
to the client. The follow-up tools/call request — possibly on a
different node behind a plain round-robin LB — decodes an opaque
ref_id from the encrypted requestState, looks up the suspended Task
via Phantom.Tracker (works cross-node via Erlang's distributed pids),
forwards the client's response, and becomes the Task's new adopter.
The Task receives the response inline and execution continues; if it
calls await again the cycle repeats with the new adopter.
Under legacy protocols (and stdio), `await: true` is a no-op — the
existing inline blocking path already works because the SSE stream
holds open. Same source code under both protocols.
Architecture:
- Task uses :phantom_adopter and :phantom_tool_request_id in process
dictionary as its mutable per-process state (current HTTP-handler
to send next event to, current request id to respond on).
- Session.elicit(..., await: true) on stateless sends
{:phantom_await_elicit, ref_id, elicit, task_pid, request_id} to
the adopter, blocks for {:phantom_elicit_response, ref_id,
response, new_adopter, new_request_id}.
- Phantom.Session.handle_info catches :phantom_await_elicit,
encrypts {:__phantom_await__, ref_id} as the requestState, registers
the task in Phantom.Tracker, and responds with inputRequired.
- Phantom.Router.maybe_decode_state detects the await tagged tuple
and routes through adopt_pending_task which sends the response and
hands the task off to the new session GenServer.
finalize_tool_result now reads adopter and request_id from process
dictionary on every send so the Task always replies to its current
HTTP handler, not the one that spawned it.
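The adopter handoff can be illustrated with plain processes. In this sketch `AwaitHandoffSketch` is hypothetical, `make_ref/0` stands in for the opaque ref_id, and the Tracker registration and requestState encryption are omitted; only the message shapes and process-dictionary keys follow the description above.

```elixir
defmodule AwaitHandoffSketch do
  # Runs inside the tool's Task: ask the current adopter to emit inputRequired,
  # then block until some adopter (possibly a new one) forwards the answer.
  def await_elicit(elicit, timeout \\ 5_000) do
    ref_id = make_ref()
    adopter = Process.get(:phantom_adopter)
    request_id = Process.get(:phantom_tool_request_id)
    send(adopter, {:phantom_await_elicit, ref_id, elicit, self(), request_id})

    receive do
      {:phantom_elicit_response, ^ref_id, response, new_adopter, new_request_id} ->
        # Rebind so the final reply goes to the handler that resumed us.
        Process.put(:phantom_adopter, new_adopter)
        Process.put(:phantom_tool_request_id, new_request_id)
        {:ok, response}
    after
      timeout -> :timeout
    end
  end

  # Spawns the tool Task and replies to whichever adopter is current at the end.
  def start_tool(adopter, request_id, fun) when is_function(fun, 0) do
    Task.start(fn ->
      Process.put(:phantom_adopter, adopter)
      Process.put(:phantom_tool_request_id, request_id)
      result = fun.()
      send(Process.get(:phantom_adopter), {:respond, Process.get(:phantom_tool_request_id), result})
    end)
  end
end
```

Because the adopter and request id are re-read from the process dictionary at reply time, the result lands on the second request rather than the one that spawned the Task.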
README and CHANGELOG cover the three elicitation patterns: inline
(elicit + await: true), explicit re-entry (request_input), and
direct (Tool.input_required).
Phantom.Session.elicit/3 now has two call patterns, both working on
both protocols:
* `Session.elicit(session, elicit, await: true)` — inline blocking.
Returns {:ok, response} | :not_supported | :timeout | :error.
Under stateless, suspends the tool's Task and resumes inline when
the follow-up tools/call arrives.
* `Session.elicit(session, elicit, state: x)` (or no opts) — re-entry.
Returns {:input_required, elicit, state, session}; the dispatcher
converts to an inputRequired result (stateless) or runs the SSE
elicit round-trip + handler re-invocation (legacy). Handler is
re-entered with session.state populated.
Phantom.Session.request_input/3 is removed. Its behavior is now the
no-:await branch of elicit/3.
Breaking change: Phantom.Session.elicit/3 without :await previously
blocked inline under legacy. Existing callers expecting {:ok, response}
must add `await: true`. The test fixture's elicit_tool, async_elicit_tool,
and url_elicit_tool have been migrated; same with the router's internal
legacy elicit path used by the re-entry branch.
README, elicit.ex moduledoc, and CHANGELOG updated.
POST tools/call to node 1 invokes await_tool, which calls Session.elicit(..., await: true) inline. The Task suspends on node 1; node 1 returns inputRequired with an encrypted requestState pointing at the suspended Task via a ref_id registered in Phantom.Tracker.

The follow-up POST to node 2 carries the requestState and the elicit response data. Node 2's dispatcher decodes the ref_id, retries the Tracker lookup until CRDT replication delivers the entry (via the existing await_request_meta/3 helper used by the cross-node elicitation routing), then sends the response message to the Task pid, which Erlang's distributed runtime routes back to node 1. The Task resumes inline, runs to completion, and casts its final result via Session.respond to node 2's session GenServer, which writes the result on the new HTTP response.

End-to-end proof that the inline-await pattern preserves the distributed-by-default property: Phoenix.PubSub stays optional for the encrypted-state path, and the Tracker-replicated path handles the live-Task pointer cross-node automatically.

Adds await_tool to Test.MCP.Router exercising the new pattern.
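The "retry the lookup until replication delivers the entry" step might look roughly like the sketch below. `RetryLookupSketch` and its arguments are hypothetical; it is not the `await_request_meta/3` helper mentioned above, whose signature is not shown in this PR text.

```elixir
defmodule RetryLookupSketch do
  # Calls `lookup` until it returns a non-nil value or the attempts run out,
  # sleeping between tries to give replication time to catch up.
  def await_lookup(lookup, attempts \\ 10, delay_ms \\ 50)

  def await_lookup(_lookup, 0, _delay_ms), do: {:error, :not_found}

  def await_lookup(lookup, attempts, delay_ms) when is_function(lookup, 0) do
    case lookup.() do
      nil ->
        Process.sleep(delay_ms)
        await_lookup(lookup, attempts - 1, delay_ms)

      found ->
        {:ok, found}
    end
  end
end
```

A bounded retry keeps the follow-up request from hanging forever if the suspended Task has already exited or the entry never replicates.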
Gap 1: session.client_info / client_capabilities under stateless
Under MCP 2026-07-28 there is no initialize call to populate
session.client_info or session.client_capabilities. The request's
_meta.clientInfo and _meta.capabilities now hydrate these fields
in Phantom.Plug.prepare_for_dispatch on every request, so devs
reading session.client_info["name"] or session.client_capabilities[:elicitation]
see the same shape they would on a legacy session. Under legacy the
session is already populated by inherit_session_meta — _meta on
subsequent requests is empty, so this is a no-op.
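A minimal sketch of the per-request hydration, assuming the session is a plain map and `_meta` uses string keys. `HydrateSketch` is hypothetical, and any key normalization Phantom applies to capabilities is not reproduced here.

```elixir
defmodule HydrateSketch do
  # Copies clientInfo / capabilities from the request's _meta onto the session.
  # Missing keys leave the session untouched, which is why an empty _meta on a
  # legacy request is a no-op.
  def hydrate(session, meta) when is_map(session) and is_map(meta) do
    session
    |> put_if_present(:client_info, meta["clientInfo"])
    |> put_if_present(:client_capabilities, meta["capabilities"])
  end

  def hydrate(session, _no_meta), do: session

  defp put_if_present(session, _key, nil), do: session
  defp put_if_present(session, key, value), do: Map.put(session, key, value)
end
```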
Gap 2: prompts/get didn't support the new elicit patterns
get_prompt ran the handler inline and didn't recognize {:input_required,
elicit, state, session} from Session.elicit. Refactored to spawn a Task
mirroring run_tool's pattern, with finalize_prompt_result handling all
the same shapes: {:reply, _, _}, {:input_required, _, _, _}, {:error,
_, _}, {:noreply, _}, {:elicitation_required, _}, plus the fallback.
maybe_decode_state and adopt_pending_task are now shared between
get_tool and get_prompt — same encrypted state contract under
2026-07-28, same legacy SSE round-trip under earlier protocols, same
re-invocation pattern after resume.
resources/read uses a sub-router architecture (router.call(fake_conn))
that's structurally different; that path remains inline-only and is a
documented limitation for future work.
Tests cover both gaps end-to-end: session.client_info populates from
_meta through a real Phantom.Plug pipeline, and prompts/get can elicit
input via session.state re-entry under stateless.
Per the elixir-reviewer feedback on the branch, collapse the near-duplicate tool/prompt finalization machinery and centralize the process-dictionary plumbing.
- Phantom.Router.run_handler/5 + finalize_result/5 replace run_tool/4, finalize_tool_result/4, run_prompt/4, and finalize_prompt_result/4. Kind-specific behavior lives in two tiny helpers: telemetry_method/1 and format_response/3. The byte-identical legacy-elicit re-invocation block now exists once.
- respond_to_caller/1 and respond_error_to_caller/1 contain all reads of :phantom_adopter / :phantom_tool_request_id. Documented why the process dictionary stays: stateless_await/3 must update these mid-Task across cross-node resume, so threading purely through args would force a Session.elicit return-type change.
- Phantom.Tool.input_required/2 (new arity) is the canonical builder for the inputRequired result map. Three former call sites (router stateless tool, router stateless prompt, Session handle_info :phantom_await_elicit) all use it now.
- Phantom.RequestState.encode/2 guards `byte_size >= 64` so the doc claim matches behavior; Phoenix-equivalent enforcement.
374 tests, 0 failures (incl. 8 clustered). 117 lines net removed from router.ex.
Restore backward compatibility for existing callers of `Session.elicit/3`
that pattern-match on `{:ok, response}`. The protocol-aware default is:
* stateless (2026-07-28) → re-entry (returns the tagged tuple)
* legacy (≤ 2025-11-25) → inline blocking (existing behavior)
Explicit `:await` or `:state` opts override the default and force a
specific mode regardless of protocol.
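The selection rule can be written down as a small pure function. This is a sketch, not the actual `cond` in Session.elicit: `ElicitModeSketch` is hypothetical, and comparing date-formatted version strings lexicographically is an assumption about how the stateless check could be done.

```elixir
defmodule ElicitModeSketch do
  # First protocol version that uses the stateless core.
  @stateless_since "2026-07-28"

  # ISO-style dates of equal length sort lexicographically, so plain string
  # comparison is enough for this sketch.
  def stateless?(version) when is_binary(version), do: version >= @stateless_since

  # Explicit opts win; otherwise the protocol picks the default.
  def mode(version, opts \\ []) do
    cond do
      Keyword.get(opts, :await) == true -> :inline
      Keyword.has_key?(opts, :state) -> :re_entry
      stateless?(version) -> :re_entry
      true -> :inline
    end
  end
end
```

For example, `ElicitModeSketch.mode("2025-11-25")` is `:inline`, while the same call with `state: %{}` is `:re_entry`.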
This removes the breaking-change footnote: legacy code that called
`Session.elicit(session, elicit)` (no opts) keeps working unchanged.
Code that needs to support stateless can either add `await: true` for
inline ergonomics or use `state:` for the re-entry pattern; either
works on both protocols.
Cleans up the test fixture by dropping the now-redundant `await: true`
opts that were threaded through elicit_tool, async_elicit_tool, and
url_elicit_tool during the earlier migration.
Fold:
- Delete lib/phantom/protocol_version.ex and its test. The supported versions list inlines on Phantom.Router as a module attribute; the `mode/1` ternary collapses to a `Phantom.Session.stateless?/1` predicate now exposed publicly (was private).
- Two callers updated: Phantom.Router.finalize_result and the Session.elicit cond.
Secret-key-base safety:
- Phantom.Router.__after_verify__ now raises if :secret_key_base is present but < 64 bytes (matches Plug.Crypto's expectation; a short key degrades the security guarantee of the requestState blob).
- IO.warn at compile time if :secret_key_base is missing AND the router has tools or prompts, so devs see a build-time pointer to the missing config before a 2026-07-28 client hits the server and gets a runtime error. The warning is silenced when Mix.env() == :test so Phantom's own test fixtures (and downstream tests) don't spam.
Upgrade guide added to CHANGELOG covering both the new :secret_key_base config and the protocol-aware default behavior of Session.elicit/3.
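The two safety checks amount to a small decision table, sketched here as a pure function. `SecretCheckSketch.check/2` is hypothetical: the real checks run at compile time and raise or call IO.warn rather than returning tuples.

```elixir
defmodule SecretCheckSketch do
  @min_bytes 64

  # Returns :ok, {:warn, message} or {:error, message} for a router's options.
  # `has_handlers?` says whether the router defines any tools or prompts.
  def check(opts, has_handlers?) when is_list(opts) and is_boolean(has_handlers?) do
    case Keyword.get(opts, :secret_key_base) do
      nil when has_handlers? ->
        {:warn, ":secret_key_base is missing; 2026-07-28 clients will fail at runtime"}

      nil ->
        :ok

      key when is_binary(key) and byte_size(key) >= @min_bytes ->
        :ok

      _too_short_or_invalid ->
        {:error, ":secret_key_base must be a binary of at least #{@min_bytes} bytes"}
    end
  end
end
```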
Phantom.Request.trace_context/1 was framed as a public helper, but devs don't actually call it: Phantom uses it to populate metadata.trace_context on the [:phantom, :dispatch] telemetry span, and tracer libraries (OpenTelemetry et al.) attach to that event and read the field. Marked the function @doc false to reflect its real role. Added a "Distributed tracing" section to the README with the OpenTelemetry wiring snippet, and updated the CHANGELOG entry to say "W3C trace context is automatically surfaced" rather than implying a function devs should call.
Phantom.Plug.read_transport_headers/1 parsed the three MCP 2026-07-28 routing headers (mcp-protocol-version, mcp-method, mcp-name) and stored them on conn.private.phantom.transport, but nothing in Phantom ever read that field: pure plumbing without a consumer. These headers exist for *upstream* infrastructure (load balancers, WAFs, API gateways) to route on without inspecting the JSON-RPC body. That routing happens before the request reaches Phantom; the server itself doesn't need to do anything with the headers. If we ever want to surface them in telemetry or cross-check against the parsed body, that can land as a deliberate feature.
Split into two clear audiences:
* For existing users — what's unchanged (legacy-only deployments need
nothing) and what to add for modern clients (:secret_key_base plus
`await: true` on existing Session.elicit calls). Concrete before/after
snippets.
* Recommendation for new PhantomMCP users — use the re-entry pattern
(`state:` opt + function-head clauses matching on `%Session{state:
...}`) instead of `await: true`. It's stateless on the wire (no
suspended Task, no Tracker for cross-node), pattern-matches cleanly,
and reads better for multi-step flows. Worked example with a
delete_file tool.
Covers the deployment shapes for the two protocol versions:
* Legacy clients (≤ 2025-11-25): sticky-session by `mcp-session-id` header, or run Phantom.Tracker + Phoenix.PubSub for cross-node internal routing. Nginx and HAProxy snippets for the LB.
* Stateless clients (2026-07-28): plain round-robin, no sticky session. The encrypted requestState blob carries continuation state across requests; any node that shares :secret_key_base can serve any request.
L7 routing patterns using the new MCP 2026-07-28 routing headers (mcp-protocol-version / mcp-method / mcp-name):
* Split legacy and stateless traffic onto separate pools
* Per-method rate limiting
* Per-tool routing to specialised workers
Response caching notes: Phantom emits ttlMs/cacheScope in the JSON-RPC result body for MCP-aware clients but doesn't translate to HTTP Cache-Control headers automatically. Includes the public-vs-private risk table and the warning about marking user-specific responses as public.
Plus origin-validation and health-check sections.
The three nginx patterns under 'Useful patterns' (split traffic by protocol version, per-method rate limiting, per-tool routing) were speculative — not validated against a real deployment. Keep the header list and the fact that Phantom doesn't read them server-side; drop the unverified examples.
with_cache/2 builds a small two-key map with possibly-nil values and rejects nils before merging — same outcome as the put_if_set pipeline but reuses the existing helper that's already in scope (request.ex already imports Phantom.Utils).
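The "build a two-key map, reject nils, merge" approach could look like this sketch. `WithCacheSketch` and the `:ttl_ms` / `:cache_scope` option names are assumptions for illustration; the real function's argument shape and the helper it reuses from Phantom.Utils are not shown in this PR text.

```elixir
defmodule WithCacheSketch do
  # Annotates a JSON-RPC result map with advisory cache hints.
  # Hints that are nil are dropped instead of being written as null.
  def with_cache(result, opts) when is_map(result) and is_list(opts) do
    hints =
      %{ttlMs: Keyword.get(opts, :ttl_ms), cacheScope: Keyword.get(opts, :cache_scope)}
      |> Enum.reject(fn {_key, value} -> is_nil(value) end)
      |> Map.new()

    Map.merge(result, hints)
  end
end
```

With only `ttl_ms: 60_000` given, the result gains `ttlMs` and no `cacheScope` key at all.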
Phantom.RequestState no longer hides the HKDF salt behind a fixed 'phantom_request_state' constant. Both arguments are required and passed explicitly:
    Phantom.RequestState.encode(term, secret_key_base, salt)
    Phantom.RequestState.decode(token, secret_key_base, salt, opts)
Phantom.Router accepts :request_state_salt alongside :secret_key_base and surfaces it on __phantom__(:info). The compile-time validation now refuses to compile when one is set without the other (both have to be configured together for stateless support).

Per the configuration story: secret_key_base must be ≥ 64 bytes of high-entropy material; the salt is a stable string of the dev's choice that scopes the derived key to requestState use. Rotating the salt invalidates all in-flight blobs.

Callsites in Phantom.Router.encode_request_state/maybe_decode_state and Phantom.Session.handle_info(:phantom_await_elicit) now read both values from __phantom__(:info) and pass them through. CHANGELOG and README upgrade guides explain both knobs.
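To make the key-plus-salt contract concrete, here is a standard-library-only sketch of an encrypted, salt-scoped blob. It is not Phantom.RequestState and does not use Plug.Crypto: `RequestStateSketch` is hypothetical, the HMAC-based key derivation is a simplified stand-in for a real KDF, and expiry options are omitted.

```elixir
defmodule RequestStateSketch do
  @aad "request_state_sketch"

  # Serializes and encrypts `term` with AES-256-GCM under a salt-scoped key.
  def encode(term, secret_key_base, salt)
      when is_binary(salt) and byte_size(secret_key_base) >= 64 do
    key = derive_key(secret_key_base, salt)
    iv = :crypto.strong_rand_bytes(12)
    plaintext = :erlang.term_to_binary(term)

    {ciphertext, tag} =
      :crypto.crypto_one_time_aead(:aes_256_gcm, key, iv, plaintext, @aad, true)

    Base.url_encode64(iv <> tag <> ciphertext, padding: false)
  end

  # Returns {:ok, term} or {:error, :invalid} (wrong key, wrong salt, tampering).
  def decode(token, secret_key_base, salt) when is_binary(token) and is_binary(salt) do
    key = derive_key(secret_key_base, salt)

    with {:ok, <<iv::binary-12, tag::binary-16, ciphertext::binary>>} <-
           Base.url_decode64(token, padding: false),
         plaintext when is_binary(plaintext) <-
           :crypto.crypto_one_time_aead(:aes_256_gcm, key, iv, ciphertext, @aad, tag, false) do
      {:ok, :erlang.binary_to_term(plaintext, [:safe])}
    else
      _ -> {:error, :invalid}
    end
  end

  # Simplified derivation: a 32-byte key that changes whenever the salt does.
  defp derive_key(secret_key_base, salt), do: :crypto.mac(:hmac, :sha256, secret_key_base, salt)
end
```

Decoding with a different salt fails authentication, which is the sketch-level analogue of "rotating the salt invalidates all in-flight blobs".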
Phantom.Session.elicit/3 (no :await) now returns the session struct
with a :pending_elicit annotation. The handler wraps it in the
standard {:noreply, session} reply shape:
    def my_tool(params, session) do
      {:noreply, Session.elicit(session, elicit, state: %{step: :got_x})}
    end

    def my_tool(%{...}, %Session{state: %{step: :got_x}} = session) do
      {:reply, Tool.text("..."), session}
    end
Under the hood Phantom routes through the same Task-suspension
mechanism that powers `await: true`:
* The handler returns {:noreply, %Session{pending_elicit: {elicit, state}}}
* `process_handler_result/5` in the Task calls Session.elicit(..., await: true)
* The Task suspends on the originating node (registered in Phantom.Tracker)
* On resume, the response merges into params, session.state is updated,
and the handler is re-applied — state accumulates across multiple
elicit steps.
Trade-off: gives up the "any node can serve any call" property under
stateless. The Task lives on the originating node; cross-node follow-ups
route to it via Phantom.Tracker (Erlang's distributed-pid runtime
delivers the message). The motivation here is that MCP conversation
state is naturally a LiveView-like stateful process, not a stateless
controller — accumulating state on the live session is the dev-facing
mental model we want.
The encrypted requestState blob carries only the task pointer now, not
user state. Phantom.Tool.input_required/2 remains as the lower-level
escape hatch for devs who want to construct an inputRequired result
without Task suspension (returning {:reply, Tool.input_required(...),
session}).
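The annotate-and-re-apply loop can be sketched without Task suspension. Everything here is hypothetical (`PendingElicitSketch`, its `Session` struct, and the `ask` callback that plays the client); it only illustrates how a `:pending_elicit` annotation drives re-invocation with updated state and merged params.

```elixir
defmodule PendingElicitSketch do
  # Hypothetical stand-in for the session struct.
  defmodule Session do
    defstruct state: nil, pending_elicit: nil
  end

  def new_session, do: %Session{}

  # Stand-in for elicit without :await: annotate the session and return it.
  def elicit(%Session{} = session, elicit, opts) do
    %{session | pending_elicit: {elicit, Keyword.get(opts, :state)}}
  end

  # Toy dispatcher: while the handler returns an annotated session, obtain the
  # client's answer via `ask`, merge it into params, set session.state, and
  # re-apply the handler. Plain async {:noreply, session} is not modelled.
  def run(handler, params, %Session{} = session, ask) do
    case handler.(params, session) do
      {:noreply, %Session{pending_elicit: {elicit, state}} = session} ->
        answer = ask.(elicit)
        session = %{session | state: state, pending_elicit: nil}
        run(handler, Map.merge(params, answer), session, ask)

      {:reply, result, _session} ->
        result
    end
  end
end
```

Because each round merges the answer into params and replaces `session.state`, a handler with several clauses can step through multiple elicitations in sequence.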
* Drop the dead `is_function(session.elicit) and self() == session.pid`
fast path in Session.do_elicit/3. Under always-Task-mode the dispatcher
always spawns, so self() inside the Task never matches session.pid.
The cross-process GenServer.call path is the only reachable one now.
* Suppress the missing-:secret_key_base warning only when Phantom itself
is being tested (`Mix.Project.config()[:app] == :phantom_mcp`) rather
than when any `:test` env is active — downstream consumers running
their own test suites now see the warning.
* Document the side effect in Session.stateless_await/3: the receive
rebinds :phantom_adopter and :phantom_tool_request_id in the process
dictionary on cross-node resume. Tighten its raise message to mention
prompt handlers too.
* Expand the doc comment on Phantom.Router.process_handler_result/5 to
explain the stateless-core mechanic — Session.elicit(await: true)
suspends the Task while the adopter emits the inputRequired response;
resume happens via Phantom.Tracker lookup on the follow-up call.
* Split the catchall in process_handler_result/5 into an explicit
`:error` clause and an `other` clause that surfaces unexpected returns
via inspect/1 for debuggability.
* Add a direct unit test for `{:noreply, Session.elicit(..., state: x)}`
going through process_handler_result/5 — asserts the dispatcher sends
:phantom_await_elicit, manually adopts the task via Phantom.Tracker,
dispatches the follow-up, and verifies session.state and merged params
reach the re-applied handler.
Initial support for the MCP `2026-07-28` "stateless core" protocol while keeping all earlier protocol versions working unchanged. The same handler source code runs on both: no `if protocol_version` branches anywhere in user-land, and existing legacy callers don't need to change anything.

The dev's only adoption step for supporting the new protocol is adding `:secret_key_base` to their router, used by `Phantom.RequestState` to encrypt the multi-round-trip `requestState` blob with `Plug.Crypto`.

What's new

- `Phantom.RequestState`: `Plug.Crypto`-backed encode/decode for the opaque `requestState` continuation
- `Phantom.Tool.input_required/1/2`: result builder for the new `inputRequired` shape
- `Phantom.Request.with_cache/2`: `ttlMs` / `cacheScope` cache hints per the RC
- `Phantom.Request.trace_context/1`: W3C `traceparent` / `tracestate` / `baggage` extracted from `_meta`; surfaces on the `[:phantom, :dispatch]` telemetry span
- `Phantom.Plug.read_transport_headers/1`: `mcp-protocol-version` / `mcp-method` / `mcp-name` for L7 routing

Session.elicit/3: protocol-aware defaults

`Phantom.Session.elicit/3` has two modes, picked by opts with protocol-aware defaults:

- For earlier MCP clients, the default is `await: true`, and `await: false` is ignored.
- For newer MCP clients, the default is `await: false`.

The call forms:

- `Session.elicit(session, elicitation)`: no opts (`state: nil`); the mode follows the protocol default above.
- `Session.elicit(session, elicitation, await: true)`: returns `{:ok, response}`. Under legacy, blocks via the open SSE stream. Under stateless, suspends the tool's Task and resumes inline when the follow-up `tools/call` arrives (possibly on a different node).
- `Session.elicit(session, elicitation, state: x)`: returns `{:input_required, elicit, state, session}`. The dispatcher converts to an `inputRequired` result (stateless) or runs the SSE elicit round-trip + handler re-invocation (legacy). The handler is re-entered with `session.state` populated.

Existing legacy code that pattern-matches `{:ok, response}` against `Session.elicit(session, elicit)` keeps working unchanged because legacy's no-opts default is still inline blocking. Code that wants to support stateless inline either picks the mode explicitly with `await: true` or migrates to the re-entry pattern with `state:`.

Other dev-facing changes

- `Phantom.Router` accepts `:secret_key_base` (required for `2026-07-28` support)
- `session.client_info` and `session.client_capabilities` populate from `_meta` per-request (no `initialize` handshake exists under the new protocol)
- Tool crashes surface via the `[:phantom, :dispatch, :exception]` telemetry event with `{kind, reason, stacktrace, method, params, request, session}` metadata

Distribution

Both elicitation patterns preserve "any node can serve any call":

- Re-entry (`state:`): the continuation travels in the encrypted `requestState`; no cross-node state at all
- Inline (`await: true`): the suspended Task is found via `Phantom.Tracker` and Erlang's distributed-pid runtime delivers the message cross-node.