perf(api): take member email from the WorkOS token claim, skip the user fetch (AP-256) #403

Merged
isuttell merged 1 commit into main from ap-256-email-claim-fast-path
Jun 6, 2026

Conversation

@isuttell isuttell commented Jun 6, 2026

Contributor

What

The dashboard WorkOS client's JWT Template now emits a zaks-io:email claim. When a verified token carries it, resolveWorkOsIdentity takes the email straight from the claim (the verified sub is the authoritative user id) and skips the per-request GET /user_management/users/{id} WorkOS call (~100ms p50 on every authed request).

CLI and MCP tokens come from separate WorkOS clients without the template, so they fall back to the existing user fetch — which still guards user_id_mismatch. The existing tests for that path are unchanged and green.
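
The two paths described above can be sketched roughly as follows. This is a minimal illustration, not the repository's resolveWorkOsIdentity: the name `resolveIdentitySketch`, the `VerifiedClaims` and `Identity` shapes, and the injected `fetchUser` stand-in are all assumptions, and the claims are assumed to have already passed verifyWorkOsAccessToken.

```typescript
// Hypothetical shapes; the real types in the repository may differ.
interface VerifiedClaims {
  sub: string; // verified WorkOS user id
  "zaks-io:email"?: unknown; // only emitted by the dashboard client's JWT Template
}

interface Identity {
  workosUserId: string;
  email: string;
}

// Stand-in for the existing GET /user_management/users/{id} call.
type FetchUser = (userId: string) => Promise<{ id: string; email: string }>;

async function resolveIdentitySketch(
  claims: VerifiedClaims,
  fetchUser: FetchUser,
): Promise<Identity> {
  const claimed = claims["zaks-io:email"];

  // Fast path: a non-empty string claim on an already-verified token.
  if (typeof claimed === "string" && claimed.trim() !== "") {
    return { workosUserId: claims.sub, email: claimed };
  }

  // Fallback (CLI/MCP tokens, or an empty claim): fetch the user and keep
  // the user_id_mismatch guard.
  const user = await fetchUser(claims.sub);
  if (user.id !== claims.sub) {
    throw new Error("user_id_mismatch");
  }
  return { workosUserId: claims.sub, email: user.email };
}
```

Injecting `fetchUser` keeps the sketch testable without a network call; in both branches the user id comes from the verified `sub`, never from the email.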

Why

AP-256 began as "authed dashboard navigation feels slow (1–3s)." Measured against Axiom (cloudflare otel.traces): the ~1–2.7s is cold isolate + cold Hyperdrive→Postgres connection warmup, identical on preview and production, because neither has continuous traffic pre-launch (warm isolates serve the same routes in 1–171ms). That resolves with real traffic and has no infra knob (Neon autosuspend already off, Hyperdrive caching already on). The one real, always-present, code-fixable cost was the ~100ms WorkOS user-fetch — which this removes for dashboard requests.

Scope (deliberately minimal)

  • Auth correctness: the claim is read strictly after verifyWorkOsAccessToken (RS256 sig + issuer + exp + client_id). The dropped user_id_mismatch check is safe on the claim path — email and sub come from the same verified token, and the downstream authorization join key is workos_user_id (= sub), not email.
  • Authorization unchanged: member scopes stay in our DB, never the token. Operator status stays the WorkOS role claim. We do not use WorkOS permissions or organizations.
  • Documents the roles/scopes/permissions distinction at the Scope enum and in ADR 0082 (Accepted), which also records the cold-start latency finding so the next person doesn't re-chase it.
  • We also considered minting workspace_id into WorkOS metadata to skip the authz DB lookup, and dropped it: the saving is noise against the warmup floor, and it would add the first WorkOS write call plus a backfill. See the ADR's "Considered and dropped".

Verification

  • pnpm verify green (96 turbo tasks). pnpm test:coverage green, above the ratcheted floors (88/82/88/88).
  • New auth tests: claim present → no user fetch; claim absent → fetch fallback; claim present-but-empty → fetch fallback.
  • Reviewed by the code-reviewer agent: no auth-correctness or bypass findings.
  • Post-merge manual check: confirm via Axiom that dashboard requests no longer show a client span to api.workos.com GET /user_management/users, while CLI/MCP still do.

🤖 Generated with Claude Code

…er fetch (AP-256)

The dashboard WorkOS client's JWT Template now emits a `zaks-io:email` claim.
When a verified token carries it, resolveWorkOsIdentity takes the email straight
from the claim (the verified `sub` is the authoritative user id) and skips the
per-request `GET /user_management/users/{id}` WorkOS call (~100ms p50 on every
authed request). CLI and MCP tokens have no such template and fall back to the
existing user fetch, which still guards user_id_mismatch.

Authorization (member scopes) is unchanged: it stays in our database, never the
token. Operator status stays the WorkOS role claim. Documents the roles vs scopes
vs permissions distinction at the Scope enum and in ADR 0082, which also records
the AP-256 latency finding: the ~1-2.7s on authed routes is cold isolate + cold
Hyperdrive connection warmup (identical preview/prod, resolves with traffic), not
the auth path -- Neon autosuspend is off and Hyperdrive caching is on, so there
is no infra knob to turn.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

linear-code Bot commented Jun 6, 2026

AP-256 Cut authed route loader latency (~1s blank wait on sidebar navigation)

Problem

Clicking a sidebar page in apps/web blocks for ~1s before the new page renders, because each authed route's TanStack loader awaits one or more API calls before the route is allowed to paint. AP-255 adds a loading indicator so the wait feels responsive; this ticket reduces the actual latency so the wait is short or gone.

All file:line references below are from a read-only pass over apps/web/src.

Candidate levers (ranked by expected impact)

1. Every authed navigation does a blocking POST to /v1/auth/web/callback — investigate first

apps/web/src/routes/_authed.tsx:14-19 runs loadAuthedSessionFn() on every child route. That loader (apps/web/src/server/web-loaders.ts:50-64) does await apiFetchOrEmpty("/v1/auth/web/callback", { method: "POST", ... }) on every entry — so dashboard → artifacts → keys each re-POSTs. This is the layout loader that gates all authed pages, so its latency is paid on every single navigation. Action: determine whether this callback is idempotent/read-only. If yes, move it to ensureQueryData(authedSessionQuery()) with a multi-minute staleTime so it's skipped on revisit. If it's genuinely side-effectful (session refresh), document why and explore making it non-blocking. This is likely the largest single win — verify with a network trace before assuming.

2. Dashboard over-fetches 100 artifacts to render 6

apps/web/src/server/web-loaders.ts:66-78 fetches /v1/web/artifacts?limit=${COUNT_LIMIT} with COUNT_LIMIT = 100 (:27), but the dashboard renders only RECENT_LIMIT = 6 (apps/web/src/routes/_authed.dashboard.tsx:14, .slice(0, RECENT_LIMIT)). The /artifacts list page fetches its own full list separately, so lowering the dashboard limit to ~6–10 doesn't break pagination. Action: lower the dashboard loader's artifact limit to match what's shown above the fold.

3. No per-query staleTime → repeat navigations refetch

Global default is staleTime: 10_000 (apps/web/src/router.tsx:12) and defaultPreloadStaleTime: 0 (:27). Query factories in apps/web/src/lib/queries.ts:34-93 (dashboardQuery, artifactsQuery, auditQuery, keysQuery, settingsQuery, billingQuery, …) set no staleTime, so revisiting a page inside the window still triggers a background refetch and a cold revisit blocks. Action: add per-query staleTime overrides for stable data (audit/keys/settings/billing on the order of minutes; artifacts shorter). Tune to acceptable staleness; window-focus refetch still catches updates.
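
As a rough illustration of per-query overrides, the sketch below builds plain option objects with an explicit staleTime. The `queryKey`, `queryFn`, and `staleTime` field names follow TanStack Query's option names, but the helper `stableQuery`, the `QuerySketch` type, the query names, and the specific windows (5 minutes, 30 seconds) are assumptions for illustration; the ticket leaves the real values as a product call.

```typescript
// Hypothetical staleness windows; not the values used in apps/web.
const MINUTE_MS = 60_000;

interface QuerySketch<T> {
  queryKey: readonly string[];
  queryFn: () => Promise<T>;
  staleTime: number; // ms during which cached data is treated as fresh
}

// Builds an options object with an explicit staleTime instead of
// inheriting the global default.
function stableQuery<T>(
  key: string,
  queryFn: () => Promise<T>,
  staleTime: number = 5 * MINUTE_MS,
): QuerySketch<T> {
  return { queryKey: [key], queryFn, staleTime };
}

// Stable data gets a multi-minute window; artifacts get a shorter one.
const keysQuerySketch = stableQuery("keys", async () => [] as string[]);
const artifactsQuerySketch = stableQuery("artifacts", async () => [] as string[], 30_000);
```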

4. access-links route bypasses the query cache entirely

apps/web/src/routes/_authed.access-links.tsx:11-12 calls listAccessLinksFn() directly in the loader (not ensureQueryData) and reads via Route.useLoaderData(), so every visit refetches regardless of staleTime. Action: migrate to ensureQueryData(accessLinksQuery()) + useSuspenseQuery for cache + invalidation parity with the other pages.

5. activateBillingReturn awaits serially

apps/web/src/server/web-loaders.ts:164-176: status is awaited before invoices starts. The normal loadBilling() (:150) already uses Promise.all. Action: Promise.all the two fetches here too — saves ~500ms–1s on the Checkout return redirect.
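
The difference between the serial and parallel shapes can be sketched as below. The function names and the generic `Fetcher` type are hypothetical stand-ins for the status and invoices requests, not the code in web-loaders.ts.

```typescript
// Hypothetical fetchers standing in for the status and invoices requests.
type Fetcher<T> = () => Promise<T>;

// Serial: invoices does not start until status has resolved.
async function loadSerial<S, I>(
  getStatus: Fetcher<S>,
  getInvoices: Fetcher<I>,
): Promise<{ status: S; invoices: I }> {
  const status = await getStatus();
  const invoices = await getInvoices();
  return { status, invoices };
}

// Parallel: both requests start immediately; total wait is the slower one.
async function loadParallel<S, I>(
  getStatus: Fetcher<S>,
  getInvoices: Fetcher<I>,
): Promise<{ status: S; invoices: I }> {
  const [status, invoices] = await Promise.all([getStatus(), getInvoices()]);
  return { status, invoices };
}
```

Note that Promise.all rejects as soon as either request fails, so this only fits when the two fetches are independent and the caller does not need a partial result.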

6. Defer the artifact-detail revisions query

apps/web/src/routes/_authed.artifacts.$artifactId.tsx:23-32 blocks the route on three parallel queries including artifactRevisionsQuery, but revisions are a secondary dropdown interaction, not above-the-fold. Action: defer() the revisions query and wrap that section in <Suspense>/<Await> so the viewer paints sooner.

7. (Profile-gated) per-request auth memoization

getServerAuth() (apps/web/src/server/authkit.ts:19-35) is re-invoked by each loader in a request. Likely <10ms each but compounds across 3 loaders. Action: only if a profile shows >~50ms total, memoize within request scope. Don't add the seam speculatively.
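
If a profile does justify it, request-scoped memoization could look something like the sketch below. `memoizePerRequest` is a hypothetical helper, and keying on the request object via a WeakMap is an assumption about how the seam might be built, not how getServerAuth() works today.

```typescript
// Hypothetical helper: cache one in-flight lookup per key object (for
// example, the incoming request), so repeated calls within that request
// share a single promise.
function memoizePerRequest<K extends object, V>(
  load: (key: K) => Promise<V>,
): (key: K) => Promise<V> {
  // WeakMap entries are collectable once the request object is unreachable.
  const cache = new WeakMap<K, Promise<V>>();
  return (key: K): Promise<V> => {
    let pending = cache.get(key);
    if (pending === undefined) {
      pending = load(key);
      cache.set(key, pending);
    }
    return pending;
  };
}
```

Because the promise itself is cached, a rejected lookup is also shared for the rest of that request; a real implementation would need to decide whether that is acceptable.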

Approach

Start with a network trace of a single sidebar navigation (DevTools or the worker logs) to confirm where the ~1s actually goes before changing code — levers are ranked by expected impact, not measured. Lever #1 (the per-nav auth callback POST) is the prime suspect; confirm it. Land the safe, obviously-correct cuts (#2, #5) regardless. Treat #3 as a product call on acceptable data staleness.

Done

  • A captured before/after network trace (or worker timing log) of one sidebar navigation showing the loader critical-path time dropped meaningfully (target: cold authed navigation well under the current ~1s; name the measured number).
  • /v1/auth/web/callback per-navigation cost is either eliminated on revisit (cached) or documented as necessarily blocking with the reason.
  • Dashboard loader fetches only what it renders.
  • No regression in data freshness that matters (mutations still reflect immediately via invalidation; verify create-key / publish flows still show new rows).
  • pnpm test + web component tests green.

Relationship

Companion to AP-255 (the loading-indicator affordance). This ticket is the underlying speed; AP-255 is the perceived responsiveness. They can land independently.

coderabbitai Bot commented Jun 6, 2026

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

@isuttell isuttell enabled auto-merge (squash) June 6, 2026 16:23
@isuttell isuttell merged commit e96f4c4 into main Jun 6, 2026
10 checks passed
@isuttell isuttell deleted the ap-256-email-claim-fast-path branch June 6, 2026 16:23

github-actions Bot commented Jun 6, 2026

agent-paste PR preview resources were cleaned up. The shared Preview GitHub Environment is retained for future preview deploys.
