diff --git a/.context/app/planning/development-plan.md b/.context/app/planning/development-plan.md
index dab2980fa..0979c7dd1 100644
--- a/.context/app/planning/development-plan.md
+++ b/.context/app/planning/development-plan.md
@@ -688,7 +688,7 @@ _Indicative tasks:_
 
 ### F9.1 — Production hardening
 
-_Status:_ not started · _Size:_ ~1–2 PRs · _Owner:_ TBD · _Deps:_ everything (final pass)
+_Status:_ in flight ([tracker](./features/f9.1.md)) · _Size:_ ~1–2 PRs · _Owner:_ TBD · _Deps:_ everything (final pass)
 
 The pre-ship technical hardening: concurrent-session sanity, master + sub-flag inventory, verification that every flag and sub-flag controls the right surfaces independently.
 
diff --git a/.context/app/planning/features/f9.1.md b/.context/app/planning/features/f9.1.md
new file mode 100644
index 000000000..c64063e6f
--- /dev/null
+++ b/.context/app/planning/features/f9.1.md
@@ -0,0 +1,106 @@
+---
+feature: F9.1
+title: Production hardening
+phase: P9 — Hardening + forking docs
+status: in flight
+owner: TBD
+deps: everything (final pass — P0–P8 complete)
+opened: 2026-06-09
+plan: .context/app/planning/development-plan.md#f91--production-hardening
+docs: .context/app/questionnaire/feature-flags.md
+---
+
+# F9.1 — Production hardening
+
+> Committable tracker for **F9.1**. The pre-ship technical pass: prove the questionnaire
+> surface holds up under concurrency, document the master + sub-flag matrix, and verify each
+> flag gates its own surface independently. A **verification + documentation** feature — it
+> adds no new runtime capability. Gated by `APP_QUESTIONNAIRES_ENABLED` like everything else.
+
+## Intent
+
+F9.1 is the first feature of P9 and the final hardening pass before ConQuest is demo-grade
+and fork-ready. Its job is not to build — it is to **prove** what P0–P8 built: that 20+
+concurrent respondent sessions don't deadlock, orphan turns, or drop audit writes; that the
+eleven feature flags gate exactly the surfaces they claim and nothing else; and that the
+full respondent happy path still composes end-to-end. The deliverables are a smoke harness, a
+flag-inventory doc, a flag-verification test suite, and a green integration pass.
+
+## Decisions (confirmed with the user)
+
+- **Plan-spec-only scope.** Stay strictly within the plan's four indicative tasks
+  (concurrency smoke, flag inventory, per-flag verification, happy-path pass). Gaps that
+  exploration surfaced — no IP-keyed rate limit on anonymous session creation, no data
+  retention, no per-version spend cap — are **out of scope**, documented as follow-ups (see
+  below), not built. Keeps F9.1 a clean gate, not a grab-bag.
+- **Concurrency smoke against the real DB.** The "no deadlocks / orphan turns / missed audit
+  writes" invariants are a real-Postgres concern; the house integration tests mock Prisma and
+  can't catch a transaction race. So the concurrency check is a **smoke script** (`scripts/smoke/`)
+  against the dev DB, not a vitest test.
+- **LLM stubbed by construction, not via a fake provider.** The orchestrator's paid compute
+  only _produces_ the `AnswerSlotIntent`s that `persistTurn` writes; the smoke feeds those
+  intents directly, so there is no LLM in the loop to stub. Cleaner and more deterministic
+  than wiring a multi-schema fake provider through `registerProviderInstance` (the mechanism
+  the other smoke scripts use). The smoke drives the **real** persistence seams
+  (`createAnonymousSession` → `persistTurn` → `markSessionCompleted`), which is where the
+  concurrency risk actually lives.
+
+## Build shape (branch `feat/F9.1-production-hardening`)
+
+- **Concurrency + happy-path smoke** — `scripts/smoke/concurrent-sessions.ts`,
+  `npm run smoke:concurrent-sessions`. Seeds one launched, anonymous-mode version (4 free-text
+  slots), then creates **24 sessions concurrently**, runs 4 turns each through the live
+  persistence seams, and completes them. Asserts: no rejected promise / deadlock (40P01); each
+  session has exactly its 4 turns with contiguous ordinals (no orphan/dropped turns); every
+  answered slot is back-stamped with a real turn id; each session has exactly one `created` and
+  one `completed` `AppQuestionnaireSessionEvent` (no missed audit writes). `--single` runs one
+  verbose happy-path session plus an F8.2 results-export read — the plan's "final happy-path
+  integration pass" stitched journey. Everything hangs off one `smoke-test-f91` questionnaire;
+  cleanup cascades the graph. Idempotent. Registered in `scripts/smoke/README.md`.
+- **Flag inventory doc** — `.context/app/questionnaire/feature-flags.md`. The authoritative
+  matrix: master + 10 sub-flags, each flag's `feature_flag` name, resolver, required parents,
+  gated surface, and off-behaviour. States the two design rules (disabled surface **404s** not
+  401s; a sub-flag requires its parents) and the **three off-behaviour shapes** (route-404 /
+  degrade / behaviour-inside-route). Calls out that the flags are **DB rows, not env vars**.
+  Linked from the namespace `README.md` index.
+- **Per-sub-flag verification suite** — `tests/unit/lib/app/questionnaire/feature-flag.test.ts`
+  (extended from master-only to all eleven resolvers). Data-driven truth tables: each resolver
+  is true only when all required flags are on, false when any one (master / live-sessions /
+  its own sub-flag) is off. Plus **independence** (one sub-flag off suppresses only its own
+  resolver, every sibling stays true), the **live-sessions cascade** (parent off closes the
+  voice/attachment/cost-cap trio), the **master transitive close**, and the `ensure*` route-gate
+  404-envelope contract. 54 tests. Per-route gating stays covered by each route's own
+  `route.test.ts`; this suite is the consolidated matrix check.
+
+## Feature-flag matrix
+
+The canonical inventory — every flag, its dependency chain, and its off-behaviour — lives in
+[`../../questionnaire/feature-flags.md`](../../questionnaire/feature-flags.md).
+
+## Verification
+
+- `npm run smoke:concurrent-sessions` — 24 sessions · 96 turns · invariants reconciled; runs
+  twice in a row clean (idempotent); leaves no `smoke-test-f91` rows.
+- `npm run smoke:concurrent-sessions -- --single` — happy path reaches a completed session +
+  a non-empty results export.
+- `npx vitest run tests/unit/lib/app/questionnaire/feature-flag.test.ts` — 54 pass.
+- `npm run validate` clean; the app suites (`tests/unit|integration/.../app/**`) green (1780
+  tests) — the final integration pass.
+
+## Out of scope (documented follow-ups, not built)
+
+Surfaced during F9.1 exploration; deliberately deferred to keep F9.1 verification-only. Each
+is a real feature in its own right:
+
+- **IP-keyed rate limit on anonymous session creation** — the no-login
+  `questionnaire-sessions/anonymous` path is rate-limited per-session only; a global `anon:IP`
+  cap on session _creation_ would harden it against session-minting abuse.
+- **Data retention / purge** — completed sessions, turns (respondent PII), and answer slots are
+  kept indefinitely; there is no time-based purge.
+- **Per-version / per-admin spend cap** — only a per-session `costBudgetUsd` exists; an
+  expensive launched version can accrue unbounded spend across many sessions.
+
+## No CHANGELOG entry
+
+F9.1 touches only app-owned smoke/test/docs — no Sunrise platform surface. Per the repo's
+platform-scoped CHANGELOG policy, it adds no `CHANGELOG.md` bullet.
diff --git a/.context/app/questionnaire/README.md b/.context/app/questionnaire/README.md
index 7cfaa2106..8a27cbf67 100644
--- a/.context/app/questionnaire/README.md
+++ b/.context/app/questionnaire/README.md
@@ -28,6 +28,7 @@ plan and feature trackers, see [`../planning/`](../planning/); for the platform
 | [`cost-cap-enforcement.md`](./cost-cap-enforcement.md)       | Per-session USD budget at the turn boundary — soft wrap-up nudge at 90%, hard 402 + auto-pause at 100%, summed turn cost, dark-launch flag (F6.3)                                  |
 | [`answer-slot-panel.md`](./answer-slot-panel.md)             | The live respondent answer panel beside the chat — `GET …/answers` read endpoint, scope config, confidence language, Revisit wiring (F7.2)                                         |
 | [`anonymous-mode.md`](./anonymous-mode.md)                   | The cross-surface PII contract — per-surface gates, the profile snapshot rule, k-anonymity suppression, erasure cascade (F8.3)                                                     |
+| [`feature-flags.md`](./feature-flags.md)                     | The master + 10 sub-flag gate matrix — what each flag gates, its dependency chain, and the three off-behaviour shapes (404 / degrade / behaviour-inside-route) (F9.1)              |
 
 ## Where the code lives
 
diff --git a/.context/app/questionnaire/feature-flags.md b/.context/app/questionnaire/feature-flags.md
new file mode 100644
index 000000000..88cf03921
--- /dev/null
+++ b/.context/app/questionnaire/feature-flags.md
@@ -0,0 +1,80 @@
+# Feature-flag inventory — the questionnaire gate matrix (F9.1)
+
+The questionnaire product dark-launches behind **one master flag and ten sub-flags**. This
+is the authoritative inventory: every flag, what it gates, what it depends on, and exactly
+what a respondent or admin sees when it is **off**. It is the reference the F9.1 hardening
+pass verifies (`tests/unit/lib/app/questionnaire/feature-flag.test.ts`) and the runbook
+(F9.2) toggles against.
+
+The flag resolvers live in [`lib/app/questionnaire/feature-flag.ts`](../../../lib/app/questionnaire/feature-flag.ts);
+the canonical flag-name constants live in the dependency-light
+[`constants.ts`](../../../lib/app/questionnaire/constants.ts) (so the seed can import a name
+without the resolver's HTTP/DB deps).
+
+## They are DB rows, not env vars
+
+> ⚠️ **`APP_QUESTIONNAIRES_*_ENABLED` are `feature_flag` table rows, not environment
+> variables.** The name _looks_ like an env var; it is not. Every resolver is a thin
+> wrapper over Sunrise's `isFeatureEnabled(name)`, which reads the `feature_flag` table.
+
+Toggle a flag by writing its row (admin feature-flag surface / seed / a direct DB update),
+**not** by setting a shell variable. A flag with no row resolves to its seeded default. This
+matters for the runbook and for any "turn X off and confirm the surface disappears" check —
+you are flipping a row, and the change is live without a redeploy.
+
+## The two design rules
+
+1. **A disabled surface 404s — it does not 401.** Every route-level gate runs **before**
+   auth (`withQuestionnairesEnabled` / `withLiveSessionsEnabled` / `withVoiceInputEnabled`
+   wrap the handler so the gate fires first). A switched-off feature is therefore
+   indistinguishable from a route that was never built — no information leaks about a
+   feature that exists but is dark. Never place a gate after `withAdminAuth`/`withAuth`.
+
+2. **A sub-flag requires its parents.** Every sub-flag resolver `AND`s the master flag (and,
+   for the live-dependent trio, the live-sessions flag) — so turning a parent off
+   transitively closes every child, and no child can run with its parent dark.
+
+## The matrix
+
+`is*Enabled()` returns `true` only when **all** the flags in its "Requires" column are on.
+
+| #   | Flag (`feature_flag` name)                           | Resolver                                                  | Requires                   | Gates                                                                                                | Off-behaviour                                                                                                                                                           |
+| --- | ---------------------------------------------------- | --------------------------------------------------------- | -------------------------- | ---------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| 0   | `APP_QUESTIONNAIRES_ENABLED`                         | `isQuestionnairesEnabled` / `ensureQuestionnairesEnabled` | — (master)                 | **the entire app** — every `/api/v1/app/**` route, every admin + respondent surface                  | every questionnaire route **404s**; the whole product is invisible                                                                                                      |
+| 1   | `APP_QUESTIONNAIRES_ADAPTIVE_STRATEGY_ENABLED`       | `isAdaptiveSelectionEnabled`                              | master                     | F4.1 adaptive (embedding + LLM) next-question selection                                              | a version set to `adaptive` **degrades to `weighted`** (no 404 — selection still runs, just cheaper)                                                                    |
+| 2   | `APP_QUESTIONNAIRES_ANSWER_EXTRACTION_ENABLED`       | `isAnswerExtractionEnabled`                               | master                     | F4.2 answer-extraction preview route                                                                 | route **404s**                                                                                                                                                          |
+| 3   | `APP_QUESTIONNAIRES_CONTRADICTION_DETECTION_ENABLED` | `isContradictionDetectionEnabled`                         | master                     | F4.3 contradiction-detection preview route                                                           | route **404s**                                                                                                                                                          |
+| 4   | `APP_QUESTIONNAIRES_ANSWER_REFINEMENT_ENABLED`       | `isAnswerRefinementEnabled`                               | master                     | F4.4 answer-refinement preview route                                                                 | route **404s**                                                                                                                                                          |
+| 5   | `APP_QUESTIONNAIRES_COMPLETION_ENABLED`              | `isCompletionEnabled`                                     | master                     | F4.5 completion-offer **phrasing** (the LLM prose)                                                   | the completion-status route returns the deterministic **assessment with no composed offer** (no 404 — the free assessment is always available under the master flag)    |
+| 6   | `APP_QUESTIONNAIRES_DESIGN_EVALUATION_ENABLED`       | `isDesignEvaluationEnabled`                               | master                     | F5.1 seven-judge design-evaluation preview route                                                     | route **404s** (the whole route is paid LLM work — no free fallback)                                                                                                    |
+| 7   | `APP_QUESTIONNAIRES_LIVE_SESSIONS_ENABLED`           | `isLiveSessionsEnabled` / `ensureLiveSessionsEnabled`     | master                     | F6.1 respondent surface — session-create + `/messages` turn loop (incl. the no-login anonymous path) | session-create and messages routes **404**; the respondent surface disappears                                                                                           |
+| 8   | `APP_QUESTIONNAIRES_VOICE_INPUT_ENABLED`             | `isVoiceInputEnabled` / `ensureVoiceInputEnabled`         | master **+ live-sessions** | F6.2 voice transcribe route                                                                          | route **404s** (a transcript is useless without the live turn loop, so voice is gated behind live-sessions, not merely beside it)                                       |
+| 9   | `APP_QUESTIONNAIRES_ATTACHMENT_INPUT_ENABLED`        | `isAttachmentInputEnabled`                                | master **+ live-sessions** | respondent image/document attachments on a `/messages` turn                                          | the chat hides the attach affordance and the `/messages` route **ignores any attachments** a client sends (no 404 — it gates a behaviour inside an already-gated route) |
+| 10  | `APP_QUESTIONNAIRES_COST_CAP_ENABLED`                | `isCostCapEnforcementEnabled`                             | master **+ live-sessions** | F6.3 per-session USD budget check at the turn boundary                                               | turns run with **no budget check** even when a version sets `costBudgetUsd` (no 404 — it gates a behaviour inside the messages route)                                   |
+
+## The three off-behaviour shapes
+
+Reading the table, every sub-flag falls into one of three shapes — know which one you are
+verifying:
+
+- **Route 404** (flags 2, 3, 4, 6, 7, 8) — the gated route is paid LLM work or a whole
+  surface; off ⇒ the route returns 404 via its `ensure*`/`with*` wrapper.
+- **Degrade** (flags 1, 5) — a cheaper deterministic result stands in: adaptive → weighted;
+  composed offer → bare assessment. The route still responds.
+- **Behaviour-inside-route** (flags 9, 10) — there is no route to 404; the flag toggles a
+  branch inside an already-gated route (attachments ignored; budget check skipped).
+
+When verifying "with each off, the gated surface is suppressed and the rest is unaffected",
+assert against the shape: a 404 for the first group, the fallback result for the second, the
+absent side-effect for the third.
+
+## Verification
+
+- **Resolver truth tables** — `tests/unit/lib/app/questionnaire/feature-flag.test.ts` pins,
+  for every resolver, that it is `true` only when all required flags are on and `false` when
+  the master, the sub-flag, or (for the live trio) live-sessions is off.
+- **Independence** — the same suite asserts that switching one sub-flag off suppresses only
+  its own resolver while every sibling resolver stays `true`; per-route suppression stays
+  covered by each route's own `route.test.ts`.
+- **Concurrency / happy path** — `npm run smoke:concurrent-sessions` exercises the live
+  respondent surface (flag 7) end-to-end against the real DB.
diff --git a/package.json b/package.json
index f4fb4d885..539406d79 100644
--- a/package.json
+++ b/package.json
@@ -35,6 +35,7 @@
     "db:reset": "prisma migrate reset --force",
     "db:drift-check": "tsx --env-file=.env.local scripts/db/check-drift.ts",
     "smoke:chat": "tsx --env-file=.env.local scripts/smoke/chat.ts",
+    "smoke:concurrent-sessions": "tsx --env-file=.env.local scripts/smoke/concurrent-sessions.ts",
     "smoke:orchestration": "tsx --env-file=.env.local scripts/smoke/orchestration.ts",
     "smoke:hybrid-search": "tsx --env-file=.env.local scripts/smoke/knowledge-hybrid-search.ts",
     "smoke:transcribe": "tsx --env-file=.env.local scripts/smoke/transcribe.ts",
diff --git a/scripts/smoke/README.md b/scripts/smoke/README.md
index 0fc18bc89..e724df09c 100644
--- a/scripts/smoke/README.md
+++ b/scripts/smoke/README.md
@@ -90,11 +90,12 @@ Prefer numbered `[n] description` stdout markers over ad-hoc logging — it make
 
 ## Current scripts
 
-| Script             | Exercises                                                                                                                                                                                          | Stubs                                                                                                        | Notes                                                                                                                                                                 |
-| ------------------ | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------ | --------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| `chat.ts`          | `streamChat` → tool loop → persistence                                                                                                                                                             | `LlmProvider` via `registerProviderInstance`                                                                 | Verifies event sequence, `AiMessage` + `AiCostLog` writes, and budget check. Doesn't exercise the tool loop live (needs seeded capability rows).                      |
-| `orchestration.ts` | Phase 3 admin HTTP surface: providers/agents/capabilities/workflows CRUD + validate + execute stub, chat SSE, knowledge upload + search, evaluations complete, conversations clear, costs + budget | In-process Node HTTP server stubs OpenAI-compatible `/v1/chat/completions` (JSON + SSE) and `/v1/embeddings` | Requires the dev server running (`npm run dev`, default `PORT=3001`). Hits real Postgres. Successive runs within 60s may hit admin rate limit — wait out the window.  |
-| `transcribe.ts`    | `getAudioProvider()` resolution + `provider.transcribe()` round-trip with a silent WAV                                                                                                             | Fake audio `LlmProvider` via `registerProviderInstance` (returns a scripted transcript)                      | Seeds a scoped `smoke-test-audio` `AiProviderModel` row with `capabilities: ['audio']`. Proves the audio plumbing wires up end-to-end without a real Whisper API key. |
+| Script                   | Exercises                                                                                                                                                                                                                                                                                              | Stubs                                                                                                                                       | Notes                                                                                                                                                                 |
+| ------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `chat.ts`                | `streamChat` → tool loop → persistence                                                                                                                                                                                                                                                                 | `LlmProvider` via `registerProviderInstance`                                                                                                | Verifies event sequence, `AiMessage` + `AiCostLog` writes, and budget check. Doesn't exercise the tool loop live (needs seeded capability rows).                      |
+| `orchestration.ts`       | Phase 3 admin HTTP surface: providers/agents/capabilities/workflows CRUD + validate + execute stub, chat SSE, knowledge upload + search, evaluations complete, conversations clear, costs + budget                                                                                                     | In-process Node HTTP server stubs OpenAI-compatible `/v1/chat/completions` (JSON + SSE) and `/v1/embeddings`                                | Requires the dev server running (`npm run dev`, default `PORT=3001`). Hits real Postgres. Successive runs within 60s may hit admin rate limit — wait out the window.  |
+| `transcribe.ts`          | `getAudioProvider()` resolution + `provider.transcribe()` round-trip with a silent WAV                                                                                                                                                                                                                 | Fake audio `LlmProvider` via `registerProviderInstance` (returns a scripted transcript)                                                     | Seeds a scoped `smoke-test-audio` `AiProviderModel` row with `capabilities: ['audio']`. Proves the audio plumbing wires up end-to-end without a real Whisper API key. |
+| `concurrent-sessions.ts` | F9.1 hardening: 24 concurrent respondent sessions × 4 turns through the live persistence seams (`createAnonymousSession` → `persistTurn` → `markSessionCompleted`). Asserts no deadlocks / orphan turns / missed audit writes. `--single` runs one verbose happy-path session + a results-export read. | LLM stubbed **by construction** — feeds `persistTurn` the extraction intents the orchestrator would emit, so no provider instance is needed | Everything hangs off one `smoke-test-f91` `AppQuestionnaire`; cleanup cascades the whole graph. Idempotent.                                                           |
 
 ## Adding a new smoke script
 
diff --git a/scripts/smoke/concurrent-sessions.ts b/scripts/smoke/concurrent-sessions.ts
new file mode 100644
index 000000000..2e1b9854e
--- /dev/null
+++ b/scripts/smoke/concurrent-sessions.ts
@@ -0,0 +1,369 @@
+/**
+ * Concurrent-session sanity smoke script (F9.1 production hardening)
+ *
+ * Drives the live respondent **persistence seams** under concurrency against the real
+ * Postgres dev DB, to prove the three invariants F9.1 names:
+ *
+ *   - **no deadlocks** — N sessions create + turn + complete concurrently with no rejected
+ *     promise / Postgres deadlock (40P01) / serialization failure;
+ *   - **no orphan turns** — every persisted turn maps to a live session, every session has
+ *     exactly the turns it drove, and ordinals are contiguous 1..K per session;
+ *   - **no missed audit writes** — every session has its `created` + `completed`
+ *     `AppQuestionnaireSessionEvent`, every answered slot is back-stamped with a real turn
+ *     id, and the answer-slot count reconciles per session.
+ *
+ * Seams exercised (the concurrency-sensitive write paths, all `$transaction`-based):
+ *   - `createAnonymousSession`  → session row + `created` event (one tx)
+ *   - `persistTurn` (turn-run.ts) → `AppQuestionnaireTurn` + `AppAnswerSlot` upsert +
+ *      `lastUpdatedTurnId` back-stamp (one tx), the real live-turn write path
+ *   - `markSessionCompleted`    → status update + `completed` event (one tx)
+ *
+ * **LLM is stubbed by construction, not by a fake provider.** The orchestrator's paid
+ * compute (extraction/refinement/contradiction LLM calls) only *produces* the
+ * `AnswerSlotIntent`s that `persistTurn` writes; this script feeds `persistTurn` those
+ * intents directly — deterministic, free, no network — so the test isolates the DB
+ * concurrency surface the invariants are actually about. (The other smoke scripts stub via
+ * `registerProviderInstance`; here there is no LLM in the loop to stub.)
+ *
+ * Modes:
+ *   - default            → 24 concurrent sessions × 4 turns each (concurrency sanity).
+ *   - `--single` / `-1`  → one session, verbose per-stage logging + a results-export read
+ *      (F8.2) — the F9.1 "final happy-path integration pass" stitched journey.
+ *
+ * Safety:
+ *   - Everything hangs off ONE `smoke-test-f91` AppQuestionnaire; deleting it cascades to
+ *     versions, config, sections, slots, sessions, answers, events, and turns. Stale rows
+ *     from a prior run are removed before seeding and after the run. Never touches any
+ *     other data; no destructive global commands. Read `scripts/smoke/README.md` first.
+ *
+ * Run with:
+ *   npm run smoke:concurrent-sessions
+ *   npm run smoke:concurrent-sessions -- --single
+ *   # or:
+ *   npx tsx --env-file=.env.local scripts/smoke/concurrent-sessions.ts
+ */
+
+import { prisma } from '@/lib/db/client';
+import { createAnonymousSession } from '@/app/api/v1/app/questionnaire-sessions/_lib/create';
+import { persistTurn } from '@/app/api/v1/app/questionnaire-sessions/_lib/turn-run';
+import {
+  markSessionCompleted,
+  loadSessionResumeState,
+} from '@/app/api/v1/app/questionnaires/_lib/sessions';
+import { loadResultsExport } from '@/lib/app/questionnaire/export/results-loader';
+import { toResultsCsv } from '@/lib/app/questionnaire/export/results-serialize';
+import type { AnalyticsScope } from '@/lib/app/questionnaire/analytics';
+import type { AnswerSlotIntent } from '@/lib/app/questionnaire/extraction/types';
+import type { ToolCallRecord } from '@/lib/app/questionnaire/orchestrator';
+
+const MARKER = 'smoke-test-f91';
+const QUESTIONNAIRE_TITLE = `${MARKER} concurrent sessions`;
+
+const SESSIONS = 24; // "20+ concurrent-session sanity test"
+const SLOT_COUNT = 4; // questions per version
+const TURNS_PER_SESSION = SLOT_COUNT; // one answered slot per turn
+
+/** A failure detail collected during the run; a non-empty list fails the smoke. */
+const failures: string[] = [];
+function fail(msg: string): void {
+  failures.push(msg);
+  console.error(`    ✗ ${msg}`);
+}
+
+/** The seeded version graph the run drives against. */
+interface SeededVersion {
+  questionnaireId: string;
+  versionId: string;
+  /** Ordinal-ordered slots ({ key, id }) that back the persistTurn key → slot-id mapping. */
+  slots: { key: string; id: string }[];
+}
+
+/** Delete any AppQuestionnaire(s) left by a previous run — cascade clears the whole graph. */
+async function cleanupStale(): Promise<void> {
+  const stale = await prisma.appQuestionnaire.findMany({
+    where: { title: QUESTIONNAIRE_TITLE },
+    select: { id: true },
+  });
+  if (stale.length === 0) return;
+  await prisma.appQuestionnaire.deleteMany({ where: { title: QUESTIONNAIRE_TITLE } });
+  console.log(`    cleaned up ${stale.length} stale ${MARKER} questionnaire(s)`);
+}
+
+/** Seed a launched, anonymous-mode version with SLOT_COUNT free-text questions. */
+async function seed(): Promise<SeededVersion> {
+  const questionnaire = await prisma.appQuestionnaire.create({
+    data: {
+      title: QUESTIONNAIRE_TITLE,
+      status: 'launched',
+      versions: {
+        create: {
+          versionNumber: 1,
+          status: 'launched',
+          // anonymousMode = true so createAnonymousSession is permitted (no-login surface).
+          config: { create: { anonymousMode: true, selectionStrategy: 'sequential' } },
+          sections: {
+            create: {
+              ordinal: 1,
+              title: `${MARKER} section`,
+              questions: {
+                create: Array.from({ length: SLOT_COUNT }, (_, i) => ({
+                  versionId: '', // set below — denormalised FK needs the version id
+                  ordinal: i + 1,
+                  key: `${MARKER}-q${i + 1}`,
+                  prompt: `Smoke question ${i + 1}?`,
+                  type: 'free_text',
+                  required: true,
+                })),
+              },
+            },
+          },
+        },
+      },
+    },
+    select: {
+      id: true,
+      versions: { select: { id: true } },
+    },
+  });
+
+  const versionId = questionnaire.versions[0].id;
+
+  // AppQuestionSlot.versionId is a denormalised FK Prisma's nested create can't backfill in
+  // one shot — stamp it now so slot lookups by version work.
+  await prisma.appQuestionSlot.updateMany({
+    where: { section: { versionId } },
+    data: { versionId },
+  });
+
+  const slots = await prisma.appQuestionSlot.findMany({
+    where: { versionId },
+    orderBy: { ordinal: 'asc' },
+    select: { id: true, key: true },
+  });
+
+  return { questionnaireId: questionnaire.id, versionId, slots };
+}
+
+/** The deterministic extraction intent the orchestrator would have produced for one slot. */
+function intentForSlot(slotKey: string, turnIndex: number): AnswerSlotIntent {
+  return {
+    slotKey,
+    questionType: 'free_text',
+    value: `answer-${turnIndex}`,
+    confidence: 0.9,
+    provenance: 'direct',
+    rationale: 'smoke deterministic answer',
+    isActiveQuestion: true,
+  };
+}
+
+/** Run TURNS_PER_SESSION turns in sequence on one session (each ordinal depends on the turn count). */
+async function runSession(sessionId: string, seeded: SeededVersion): Promise<void> {
+  const keyToSlotId = new Map(seeded.slots.map((s) => [s.key, s.id]));
+  for (let t = 0; t < TURNS_PER_SESSION; t++) {
+    const slot = seeded.slots[t];
+    const toolCalls: ToolCallRecord[] = [{ slug: 'extract_answer_slots', success: true }];
+    await persistTurn({
+      sessionId,
+      userMessage: `respondent message ${t + 1}`,
+      agentResponse: `agent reply ${t + 1}`,
+      targetedQuestionId: slot.id,
+      toolCalls,
+      costUsd: 0.001,
+      upserts: [intentForSlot(slot.key, t + 1)],
+      refinements: [],
+      keyToSlotId,
+    });
+  }
+  await markSessionCompleted(sessionId);
+}
+
+/** Assert the three invariants over the persisted graph for the given session ids. */
+async function verify(seeded: SeededVersion, sessionIds: string[]): Promise<void> {
+  const sessions = await prisma.appQuestionnaireSession.findMany({
+    where: { versionId: seeded.versionId, isPreview: false },
+    select: {
+      id: true,
+      status: true,
+      turns: { select: { id: true, ordinal: true }, orderBy: { ordinal: 'asc' } },
+      answers: { select: { id: true, lastUpdatedTurnId: true } },
+      events: { select: { eventType: true } },
+    },
+  });
+
+  // No orphan/extra sessions.
+  if (sessions.length !== sessionIds.length) {
+    fail(`expected ${sessionIds.length} sessions, found ${sessions.length}`);
+  }
+
+  const allTurnIds = new Set<string>();
+  for (const s of sessions) {
+    // Completed status (markSessionCompleted ran).
+    if (s.status !== 'completed')
+      fail(`session ${s.id} status is "${s.status}", expected completed`);
+
+    // No orphan turns: exactly TURNS_PER_SESSION, ordinals contiguous 1..K.
+    if (s.turns.length !== TURNS_PER_SESSION) {
+      fail(`session ${s.id} has ${s.turns.length} turns, expected ${TURNS_PER_SESSION}`);
+    }
+    s.turns.forEach((turn, i) => {
+      if (turn.ordinal !== i + 1)
+        fail(`session ${s.id} turn #${i} ordinal=${turn.ordinal}, expected ${i + 1}`);
+      allTurnIds.add(turn.id);
+    });
+
+    // Answer-slot reconciliation: one answer per slot, each back-stamped with a real turn.
+    if (s.answers.length !== SLOT_COUNT) {
+      fail(`session ${s.id} has ${s.answers.length} answers, expected ${SLOT_COUNT}`);
+    }
+    const sessionTurnIds = new Set(s.turns.map((t) => t.id));
+    for (const a of s.answers) {
+      if (!a.lastUpdatedTurnId) {
+        fail(`session ${s.id} answer ${a.id} has no lastUpdatedTurnId (missed turn back-stamp)`);
+      } else if (!sessionTurnIds.has(a.lastUpdatedTurnId)) {
+        fail(
+          `session ${s.id} answer ${a.id} back-stamped with foreign turn ${a.lastUpdatedTurnId}`
+        );
+      }
+    }
+
+    // No missed audit writes: exactly one `created` and one `completed` event.
+    const created = s.events.filter((e) => e.eventType === 'created').length;
+    const completed = s.events.filter((e) => e.eventType === 'completed').length;
+    if (created !== 1) fail(`session ${s.id} has ${created} created events, expected 1`);
+    if (completed !== 1) fail(`session ${s.id} has ${completed} completed events, expected 1`);
+  }
+
+  // No orphan turns globally: every turn belongs to one of our sessions (no extras/dupes).
+  const totalTurns = await prisma.appQuestionnaireTurn.count({
+    where: { session: { versionId: seeded.versionId } },
+  });
+  const expectedTurns = sessionIds.length * TURNS_PER_SESSION;
+  if (totalTurns !== expectedTurns) {
+    fail(`total turns=${totalTurns}, expected ${expectedTurns} (orphan or dropped turns)`);
+  }
+  if (allTurnIds.size !== expectedTurns) {
+    fail(`distinct turn ids=${allTurnIds.size}, expected ${expectedTurns}`);
+  }
+
+  console.log(
+    `    ✓ ${sessions.length} sessions · ${totalTurns} turns · audit events + answer back-stamps reconciled`
+  );
+}
+
+/** The F9.1 happy-path stitched journey: one session, verbose, plus a results-export read. */
+async function runHappyPath(seeded: SeededVersion): Promise<void> {
+  console.log('\n[journey] single happy-path session');
+  const create = await createAnonymousSession(seeded.versionId);
+  if (!create.ok) {
+    fail(`createAnonymousSession failed: ${create.code} ${create.message}`);
+    return;
+  }
+  const sessionId = create.session.id;
+  console.log(`  • created session ${sessionId} (status=${create.session.status})`);
+
+  await runSession(sessionId, seeded);
+  console.log(`  • ran ${TURNS_PER_SESSION} turns + completed`);
+
+  const resume = await loadSessionResumeState(sessionId);
+  console.log(
+    `  • resume state: status=${resume.status}, ${resume.answeredSlots.length} answers captured`
+  );
+  if (resume.status !== 'completed')
+    fail(`journey session status=${resume.status}, expected completed`);
+  if (resume.answeredSlots.length !== SLOT_COUNT) {
+    fail(`journey captured ${resume.answeredSlots.length} answers, expected ${SLOT_COUNT}`);
+  }
+
+  // F8.2 results export — the journey's final stage. Wide window to capture the just-completed session.
+  const scope: AnalyticsScope = {
+    versionId: seeded.versionId,
+    from: new Date('2000-01-01T00:00:00.000Z'),
+    to: new Date('2999-01-01T00:00:00.000Z'),
+    tagIds: [],
+  };
+  const exportModel = await loadResultsExport(scope);
+  if (!exportModel) {
+    fail('loadResultsExport returned null for the seeded version');
+    return;
+  }
+  const csv = toResultsCsv(exportModel);
+  const csvRows = csv.trim().split('\n').length - 1; // minus header
+  console.log(
+    `  • export: ${exportModel.sessions.length} session(s), ${exportModel.questions.length} questions, ${csvRows} CSV row(s)`
+  );
+  if (exportModel.sessions.length < 1) fail('export has no completed sessions');
+}
+
+async function main(): Promise<void> {
+  const single = process.argv.includes('--single') || process.argv.includes('-1');
+
+  console.log(`\n[1] cleanup stale ${MARKER} rows`);
+  await cleanupStale();
+
+  console.log('[2] seed launched anonymous-mode version');
+  const seeded = await seed();
+  console.log(
+    `    questionnaire ${seeded.questionnaireId} · version ${seeded.versionId} · ${seeded.slots.length} slots`
+  );
+
+  if (single) {
+    await runHappyPath(seeded);
+  } else {
+    console.log(`\n[3] create ${SESSIONS} sessions concurrently`);
+    const creates = await Promise.allSettled(
+      Array.from({ length: SESSIONS }, () => createAnonymousSession(seeded.versionId))
+    );
+    const sessionIds: string[] = [];
+    creates.forEach((r, i) => {
+      if (r.status === 'rejected') {
+        fail(`session create #${i} rejected: ${String(r.reason)}`);
+      } else if (!r.value.ok) {
+        fail(`session create #${i} failed: ${r.value.code} ${r.value.message}`);
+      } else {
+        sessionIds.push(r.value.session.id);
+      }
+    });
+    console.log(`    ✓ ${sessionIds.length}/${SESSIONS} sessions created`);
+
+    console.log(`[4] run ${TURNS_PER_SESSION} turns × ${sessionIds.length} sessions concurrently`);
+    const runs = await Promise.allSettled(sessionIds.map((id) => runSession(id, seeded)));
+    runs.forEach((r, i) => {
+      if (r.status === 'rejected') {
+        // A Postgres deadlock (40P01) or serialization failure surfaces here.
+        fail(`session run #${i} (${sessionIds[i]}) rejected: ${String(r.reason)}`);
+      }
+    });
+    console.log(
+      `    ✓ ${runs.filter((r) => r.status === 'fulfilled').length}/${runs.length} session runs fulfilled`
+    );
+
+    console.log('[5] verify invariants (no deadlocks / orphan turns / missed audit writes)');
+    await verify(seeded, sessionIds);
+  }
+
+  console.log('\n[6] cleanup (scoped — cascade from the seeded questionnaire)');
+  const deleted = await prisma.appQuestionnaire.deleteMany({
+    where: { id: seeded.questionnaireId },
+  });
+  console.log(`    deleted ${deleted.count} questionnaire (cascade cleared the graph)`);
+
+  await prisma.$disconnect();
+
+  if (failures.length > 0) {
+    console.error(`\n✗ smoke FAILED with ${failures.length} invariant violation(s)`);
+    process.exit(1);
+  }
+  console.log('\n✓ concurrent-session smoke passed');
+}
+
+main().catch(async (err) => {
+  console.error('\n✗ smoke script failed:', err);
+  try {
+    await prisma.appQuestionnaire.deleteMany({ where: { title: QUESTIONNAIRE_TITLE } });
+  } catch {
+    /* best-effort cleanup; the next run's cleanupStale() sweeps any leftovers */
+  }
+  await prisma.$disconnect().catch(() => undefined);
+  process.exit(1);
+});
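Reviewer note (illustrative, not part of the patch): the "ordinals contiguous 1..K" invariant that `verify()` asserts can be read in isolation as the pure function below. `ordinalViolations` is a hypothetical name, and the sketch assumes the ordinals arrive sorted ascending, as the `orderBy: { ordinal: 'asc' }` query returns them.

```typescript
// Hypothetical helper, not part of the patch: returns one message per problem
// found in a session's turn ordinals, which are expected to be exactly 1..K.
function ordinalViolations(ordinals: readonly number[], expectedCount: number): string[] {
  const problems: string[] = [];
  if (ordinals.length !== expectedCount) {
    problems.push(`has ${ordinals.length} turns, expected ${expectedCount}`);
  }
  ordinals.forEach((ordinal, i) => {
    // Position i (0-based) must hold ordinal i + 1; a duplicate or a gap breaks this.
    if (ordinal !== i + 1) {
      problems.push(`turn #${i} ordinal=${ordinal}, expected ${i + 1}`);
    }
  });
  return problems;
}

// Two turns that both computed ordinal 2 (the kind of race the smoke is looking for).
console.log(ordinalViolations([1, 2, 2, 4], 4));
```

A positional comparison catches duplicates and gaps in one pass, so no separate uniqueness check is needed once the list is sorted.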
diff --git a/tests/unit/lib/app/questionnaire/feature-flag.test.ts b/tests/unit/lib/app/questionnaire/feature-flag.test.ts
index f30c89211..bac2d2cee 100644
--- a/tests/unit/lib/app/questionnaire/feature-flag.test.ts
+++ b/tests/unit/lib/app/questionnaire/feature-flag.test.ts
@@ -1,9 +1,35 @@
+import { NextRequest } from 'next/server';
 import { describe, it, expect, vi, beforeEach } from 'vitest';
 
 import {
   APP_QUESTIONNAIRES_FLAG,
+  APP_QUESTIONNAIRES_ADAPTIVE_FLAG,
+  APP_QUESTIONNAIRES_ANSWER_EXTRACTION_FLAG,
+  APP_QUESTIONNAIRES_CONTRADICTION_DETECTION_FLAG,
+  APP_QUESTIONNAIRES_ANSWER_REFINEMENT_FLAG,
+  APP_QUESTIONNAIRES_COMPLETION_FLAG,
+  APP_QUESTIONNAIRES_DESIGN_EVALUATION_FLAG,
+  APP_QUESTIONNAIRES_LIVE_SESSIONS_FLAG,
+  APP_QUESTIONNAIRES_VOICE_INPUT_FLAG,
+  APP_QUESTIONNAIRES_COST_CAP_FLAG,
+  APP_QUESTIONNAIRES_ATTACHMENT_INPUT_FLAG,
   ensureQuestionnairesEnabled,
+  ensureLiveSessionsEnabled,
+  ensureVoiceInputEnabled,
+  withQuestionnairesEnabled,
+  withLiveSessionsEnabled,
+  withVoiceInputEnabled,
   isQuestionnairesEnabled,
+  isAdaptiveSelectionEnabled,
+  isAnswerExtractionEnabled,
+  isContradictionDetectionEnabled,
+  isAnswerRefinementEnabled,
+  isCompletionEnabled,
+  isDesignEvaluationEnabled,
+  isLiveSessionsEnabled,
+  isVoiceInputEnabled,
+  isAttachmentInputEnabled,
+  isCostCapEnforcementEnabled,
 } from '@/lib/app/questionnaire/feature-flag';
 import { isFeatureEnabled } from '@/lib/feature-flags';
 
@@ -13,51 +39,411 @@ vi.mock('@/lib/feature-flags', () => ({
 
 const mockedIsFeatureEnabled = vi.mocked(isFeatureEnabled);
 
-describe('questionnaire feature flag', () => {
-  beforeEach(() => {
-    vi.clearAllMocks();
+/**
+ * Drive {@link isFeatureEnabled} from a per-flag map: a flag is enabled iff its name maps
+ * to `true`. The resolvers call `isFeatureEnabled(name)` (often in a `Promise.all`), so this
+ * lets each test set exactly which flags are on and assert the resolver's AND logic.
+ */
+function setFlags(enabled: Record<string, boolean>): void {
+  mockedIsFeatureEnabled.mockImplementation((name: string) =>
+    Promise.resolve(enabled[name] === true)
+  );
+}
+
+/** All eleven flag names, used to build "everything on" baselines. */
+const ALL_FLAGS = [
+  APP_QUESTIONNAIRES_FLAG,
+  APP_QUESTIONNAIRES_ADAPTIVE_FLAG,
+  APP_QUESTIONNAIRES_ANSWER_EXTRACTION_FLAG,
+  APP_QUESTIONNAIRES_CONTRADICTION_DETECTION_FLAG,
+  APP_QUESTIONNAIRES_ANSWER_REFINEMENT_FLAG,
+  APP_QUESTIONNAIRES_COMPLETION_FLAG,
+  APP_QUESTIONNAIRES_DESIGN_EVALUATION_FLAG,
+  APP_QUESTIONNAIRES_LIVE_SESSIONS_FLAG,
+  APP_QUESTIONNAIRES_VOICE_INPUT_FLAG,
+  APP_QUESTIONNAIRES_COST_CAP_FLAG,
+  APP_QUESTIONNAIRES_ATTACHMENT_INPUT_FLAG,
+] as const;
+
+/** A map with every flag on (the baseline each truth-table test perturbs from). */
+function allOn(): Record<string, boolean> {
+  return Object.fromEntries(ALL_FLAGS.map((f) => [f, true]));
+}
+
+beforeEach(() => {
+  vi.clearAllMocks();
+});
+
+describe('questionnaire feature flag — flag names are stable', () => {
+  // The seed and any external toggling rely on the exact `feature_flag` row names; guard
+  // them so a rename can't silently dark-launch (or un-gate) a surface.
+  it('master + sub-flag names match their published constants', () => {
+    expect(APP_QUESTIONNAIRES_FLAG).toBe('APP_QUESTIONNAIRES_ENABLED');
+    expect(APP_QUESTIONNAIRES_ADAPTIVE_FLAG).toBe('APP_QUESTIONNAIRES_ADAPTIVE_STRATEGY_ENABLED');
+    expect(APP_QUESTIONNAIRES_ANSWER_EXTRACTION_FLAG).toBe(
+      'APP_QUESTIONNAIRES_ANSWER_EXTRACTION_ENABLED'
+    );
+    expect(APP_QUESTIONNAIRES_CONTRADICTION_DETECTION_FLAG).toBe(
+      'APP_QUESTIONNAIRES_CONTRADICTION_DETECTION_ENABLED'
+    );
+    expect(APP_QUESTIONNAIRES_ANSWER_REFINEMENT_FLAG).toBe(
+      'APP_QUESTIONNAIRES_ANSWER_REFINEMENT_ENABLED'
+    );
+    expect(APP_QUESTIONNAIRES_COMPLETION_FLAG).toBe('APP_QUESTIONNAIRES_COMPLETION_ENABLED');
+    expect(APP_QUESTIONNAIRES_DESIGN_EVALUATION_FLAG).toBe(
+      'APP_QUESTIONNAIRES_DESIGN_EVALUATION_ENABLED'
+    );
+    expect(APP_QUESTIONNAIRES_LIVE_SESSIONS_FLAG).toBe('APP_QUESTIONNAIRES_LIVE_SESSIONS_ENABLED');
+    expect(APP_QUESTIONNAIRES_VOICE_INPUT_FLAG).toBe('APP_QUESTIONNAIRES_VOICE_INPUT_ENABLED');
+    expect(APP_QUESTIONNAIRES_COST_CAP_FLAG).toBe('APP_QUESTIONNAIRES_COST_CAP_ENABLED');
+    expect(APP_QUESTIONNAIRES_ATTACHMENT_INPUT_FLAG).toBe(
+      'APP_QUESTIONNAIRES_ATTACHMENT_INPUT_ENABLED'
+    );
   });
+});
 
-  describe('isQuestionnairesEnabled', () => {
-    it('delegates to isFeatureEnabled with the APP_QUESTIONNAIRES_ENABLED flag', async () => {
-      mockedIsFeatureEnabled.mockResolvedValue(true);
+describe('isQuestionnairesEnabled (master)', () => {
+  it('delegates to isFeatureEnabled with the master flag', async () => {
+    setFlags({ [APP_QUESTIONNAIRES_FLAG]: true });
+    await expect(isQuestionnairesEnabled()).resolves.toBe(true);
+    expect(mockedIsFeatureEnabled).toHaveBeenCalledWith(APP_QUESTIONNAIRES_FLAG);
+  });
+
+  it('returns false when the master flag is disabled', async () => {
+    setFlags({ [APP_QUESTIONNAIRES_FLAG]: false });
+    await expect(isQuestionnairesEnabled()).resolves.toBe(false);
+  });
+});
+
+/**
+ * The data-driven truth table for every sub-flag resolver: each is `true` only when ALL its
+ * required flags are on, and `false` when ANY one of them is off. `requires` lists the flags
+ * the resolver ANDs together (master first, then any parents, then its own sub-flag).
+ */
+const SUB_FLAG_RESOLVERS: ReadonlyArray<{
+  name: string;
+  fn: () => Promise<boolean>;
+  requires: readonly string[];
+}> = [
+  {
+    name: 'isAdaptiveSelectionEnabled',
+    fn: isAdaptiveSelectionEnabled,
+    requires: [APP_QUESTIONNAIRES_FLAG, APP_QUESTIONNAIRES_ADAPTIVE_FLAG],
+  },
+  {
+    name: 'isAnswerExtractionEnabled',
+    fn: isAnswerExtractionEnabled,
+    requires: [APP_QUESTIONNAIRES_FLAG, APP_QUESTIONNAIRES_ANSWER_EXTRACTION_FLAG],
+  },
+  {
+    name: 'isContradictionDetectionEnabled',
+    fn: isContradictionDetectionEnabled,
+    requires: [APP_QUESTIONNAIRES_FLAG, APP_QUESTIONNAIRES_CONTRADICTION_DETECTION_FLAG],
+  },
+  {
+    name: 'isAnswerRefinementEnabled',
+    fn: isAnswerRefinementEnabled,
+    requires: [APP_QUESTIONNAIRES_FLAG, APP_QUESTIONNAIRES_ANSWER_REFINEMENT_FLAG],
+  },
+  {
+    name: 'isCompletionEnabled',
+    fn: isCompletionEnabled,
+    requires: [APP_QUESTIONNAIRES_FLAG, APP_QUESTIONNAIRES_COMPLETION_FLAG],
+  },
+  {
+    name: 'isDesignEvaluationEnabled',
+    fn: isDesignEvaluationEnabled,
+    requires: [APP_QUESTIONNAIRES_FLAG, APP_QUESTIONNAIRES_DESIGN_EVALUATION_FLAG],
+  },
+  {
+    name: 'isLiveSessionsEnabled',
+    fn: isLiveSessionsEnabled,
+    requires: [APP_QUESTIONNAIRES_FLAG, APP_QUESTIONNAIRES_LIVE_SESSIONS_FLAG],
+  },
+  {
+    // Live-dependent: master + live-sessions + its own sub-flag.
+    name: 'isVoiceInputEnabled',
+    fn: isVoiceInputEnabled,
+    requires: [
+      APP_QUESTIONNAIRES_FLAG,
+      APP_QUESTIONNAIRES_LIVE_SESSIONS_FLAG,
+      APP_QUESTIONNAIRES_VOICE_INPUT_FLAG,
+    ],
+  },
+  {
+    name: 'isAttachmentInputEnabled',
+    fn: isAttachmentInputEnabled,
+    requires: [
+      APP_QUESTIONNAIRES_FLAG,
+      APP_QUESTIONNAIRES_LIVE_SESSIONS_FLAG,
+      APP_QUESTIONNAIRES_ATTACHMENT_INPUT_FLAG,
+    ],
+  },
+  {
+    name: 'isCostCapEnforcementEnabled',
+    fn: isCostCapEnforcementEnabled,
+    requires: [
+      APP_QUESTIONNAIRES_FLAG,
+      APP_QUESTIONNAIRES_LIVE_SESSIONS_FLAG,
+      APP_QUESTIONNAIRES_COST_CAP_FLAG,
+    ],
+  },
+];
 
-      const result = await isQuestionnairesEnabled();
+describe('sub-flag resolvers — truth tables', () => {
+  for (const { name, fn, requires } of SUB_FLAG_RESOLVERS) {
+    describe(name, () => {
+      it('is true when all required flags are on', async () => {
+        setFlags(Object.fromEntries(requires.map((f) => [f, true])));
+        await expect(fn()).resolves.toBe(true);
+      });
 
-      expect(result).toBe(true);
-      expect(mockedIsFeatureEnabled).toHaveBeenCalledWith(APP_QUESTIONNAIRES_FLAG);
-      // Guard the exact flag name — the seed and any external toggling rely on it.
-      expect(APP_QUESTIONNAIRES_FLAG).toBe('APP_QUESTIONNAIRES_ENABLED');
+      // One test per required flag: that flag off, every other required flag on → false.
+      for (const off of requires) {
+        const label =
+          off === APP_QUESTIONNAIRES_FLAG
+            ? 'master'
+            : off === APP_QUESTIONNAIRES_LIVE_SESSIONS_FLAG
+              ? 'live-sessions'
+              : 'its own sub-flag';
+        it(`is false when ${label} (${off}) is off`, async () => {
+          const flags = Object.fromEntries(requires.map((f) => [f, true]));
+          flags[off] = false;
+          setFlags(flags);
+          await expect(fn()).resolves.toBe(false);
+        });
+      }
     });
+  }
+});
+
+describe('sub-flag independence — one flag off suppresses only its own surface', () => {
+  // Turning a single sub-flag off must NOT affect any sibling resolver. We flip each
+  // sub-flag off (master + everything else on) and assert exactly that resolver goes false
+  // while the others stay true — the "rest of the platform unaffected" guarantee.
+  const INDEPENDENT_PAIRS: ReadonlyArray<{
+    flag: string;
+    resolver: () => Promise<boolean>;
+  }> = [
+    { flag: APP_QUESTIONNAIRES_ADAPTIVE_FLAG, resolver: isAdaptiveSelectionEnabled },
+    { flag: APP_QUESTIONNAIRES_ANSWER_EXTRACTION_FLAG, resolver: isAnswerExtractionEnabled },
+    {
+      flag: APP_QUESTIONNAIRES_CONTRADICTION_DETECTION_FLAG,
+      resolver: isContradictionDetectionEnabled,
+    },
+    { flag: APP_QUESTIONNAIRES_ANSWER_REFINEMENT_FLAG, resolver: isAnswerRefinementEnabled },
+    { flag: APP_QUESTIONNAIRES_COMPLETION_FLAG, resolver: isCompletionEnabled },
+    { flag: APP_QUESTIONNAIRES_DESIGN_EVALUATION_FLAG, resolver: isDesignEvaluationEnabled },
+    { flag: APP_QUESTIONNAIRES_VOICE_INPUT_FLAG, resolver: isVoiceInputEnabled },
+    { flag: APP_QUESTIONNAIRES_ATTACHMENT_INPUT_FLAG, resolver: isAttachmentInputEnabled },
+    { flag: APP_QUESTIONNAIRES_COST_CAP_FLAG, resolver: isCostCapEnforcementEnabled },
+  ];
 
-    it('returns false when the flag is disabled', async () => {
-      mockedIsFeatureEnabled.mockResolvedValue(false);
+  for (const { flag, resolver } of INDEPENDENT_PAIRS) {
+    it(`${flag} off → that resolver false, every sibling still true`, async () => {
+      const flags = allOn();
+      flags[flag] = false;
+      setFlags(flags);
 
-      await expect(isQuestionnairesEnabled()).resolves.toBe(false);
+      await expect(resolver()).resolves.toBe(false);
+
+      // Every OTHER sub-flag resolver whose required flags are all still on stays true.
+      for (const sibling of SUB_FLAG_RESOLVERS) {
+        if (sibling.requires.includes(flag)) continue;
+        await expect(sibling.fn(), `${sibling.name} should be unaffected`).resolves.toBe(true);
+      }
     });
+  }
+
+  it('adaptive degrades independently of extraction (both are master-only children)', async () => {
+    // Concrete independence example: adaptive off, extraction on.
+    setFlags({
+      [APP_QUESTIONNAIRES_FLAG]: true,
+      [APP_QUESTIONNAIRES_ADAPTIVE_FLAG]: false,
+      [APP_QUESTIONNAIRES_ANSWER_EXTRACTION_FLAG]: true,
+    });
+    await expect(isAdaptiveSelectionEnabled()).resolves.toBe(false);
+    await expect(isAnswerExtractionEnabled()).resolves.toBe(true);
   });
+});
 
-  describe('ensureQuestionnairesEnabled', () => {
-    it('returns null (no gate) when the app is enabled', async () => {
-      mockedIsFeatureEnabled.mockResolvedValue(true);
+describe('live-sessions cascade — turning the parent off closes the live-dependent trio', () => {
+  it('live-sessions off ⇒ voice, attachment, and cost-cap all false even with their sub-flags on', async () => {
+    const flags = allOn();
+    flags[APP_QUESTIONNAIRES_LIVE_SESSIONS_FLAG] = false;
+    setFlags(flags);
+
+    await expect(isLiveSessionsEnabled()).resolves.toBe(false);
+    await expect(isVoiceInputEnabled()).resolves.toBe(false);
+    await expect(isAttachmentInputEnabled()).resolves.toBe(false);
+    await expect(isCostCapEnforcementEnabled()).resolves.toBe(false);
+  });
+
+  it('master off ⇒ every resolver false (transitive close)', async () => {
+    const flags = allOn();
+    flags[APP_QUESTIONNAIRES_FLAG] = false;
+    setFlags(flags);
+
+    await expect(isQuestionnairesEnabled()).resolves.toBe(false);
+    for (const { name, fn } of SUB_FLAG_RESOLVERS) {
+      // Label the assertion so a single regressing resolver is named rather than
+      // hidden behind whichever one the sequential loop reaches first.
+      await expect(fn(), `${name} should be false when master is off`).resolves.toBe(false);
+    }
+  });
+});
 
+/**
+ * Route-level gates: the `ensure*` helpers a route calls first (before auth) so a disabled
+ * surface 404s rather than 401s. Per-route gating is additionally covered by each route's own
+ * `route.test.ts`; these pin the shared gate helpers' contract.
+ */
+describe('route gates — ensure* return a 404 envelope when off, null when on', () => {
+  async function expect404(res: Response | null): Promise<void> {
+    expect(res).not.toBeNull();
+    expect(res).toBeInstanceOf(Response);
+    expect(res?.status).toBe(404);
+    const body = await res?.json();
+    expect(body).toEqual({ success: false, error: { message: 'Not found', code: 'NOT_FOUND' } });
+  }
+
+  describe('ensureQuestionnairesEnabled', () => {
+    it('returns null (no gate) when the master flag is on', async () => {
+      setFlags({ [APP_QUESTIONNAIRES_FLAG]: true });
       await expect(ensureQuestionnairesEnabled()).resolves.toBeNull();
     });
+    it('returns a 404 NOT_FOUND envelope when the master flag is off', async () => {
+      setFlags({ [APP_QUESTIONNAIRES_FLAG]: false });
+      await expect404(await ensureQuestionnairesEnabled());
+    });
+  });
+
+  describe('ensureLiveSessionsEnabled', () => {
+    it('returns null when master + live-sessions are on', async () => {
+      setFlags({
+        [APP_QUESTIONNAIRES_FLAG]: true,
+        [APP_QUESTIONNAIRES_LIVE_SESSIONS_FLAG]: true,
+      });
+      await expect(ensureLiveSessionsEnabled()).resolves.toBeNull();
+    });
+    it('404s when live-sessions is off even though master is on', async () => {
+      setFlags({
+        [APP_QUESTIONNAIRES_FLAG]: true,
+        [APP_QUESTIONNAIRES_LIVE_SESSIONS_FLAG]: false,
+      });
+      await expect404(await ensureLiveSessionsEnabled());
+    });
+  });
+
+  describe('ensureVoiceInputEnabled', () => {
+    it('returns null when master + live-sessions + voice are on', async () => {
+      setFlags({
+        [APP_QUESTIONNAIRES_FLAG]: true,
+        [APP_QUESTIONNAIRES_LIVE_SESSIONS_FLAG]: true,
+        [APP_QUESTIONNAIRES_VOICE_INPUT_FLAG]: true,
+      });
+      await expect(ensureVoiceInputEnabled()).resolves.toBeNull();
+    });
+    it('404s when the voice sub-flag is off even though master + live-sessions are on', async () => {
+      setFlags({
+        [APP_QUESTIONNAIRES_FLAG]: true,
+        [APP_QUESTIONNAIRES_LIVE_SESSIONS_FLAG]: true,
+        [APP_QUESTIONNAIRES_VOICE_INPUT_FLAG]: false,
+      });
+      await expect404(await ensureVoiceInputEnabled());
+    });
+    it('404s when live-sessions is off even though master + voice are on', async () => {
+      // Voice is a three-way AND (master + live-sessions + voice); turning the live-sessions
+      // parent off must close the gate too, not just the voice sub-flag.
+      setFlags({
+        [APP_QUESTIONNAIRES_FLAG]: true,
+        [APP_QUESTIONNAIRES_LIVE_SESSIONS_FLAG]: false,
+        [APP_QUESTIONNAIRES_VOICE_INPUT_FLAG]: true,
+      });
+      await expect404(await ensureVoiceInputEnabled());
+    });
+  });
+});
+
+/**
+ * The `with*Enabled` HOC wrappers compose the flag gate with a route handler so the gate runs
+ * **before** anything else (auth, handler work) — the ordering that makes a disabled surface
+ * look like a missing route (404) rather than a 401. Each wrapper must (a) short-circuit to the
+ * gate's 404 Response without ever calling the handler when the flag is off, and (b) call the
+ * handler with the original `(request, context)` and forward its Response when the flag is on.
+ * These pin both arms; per-route wiring is additionally covered by each route's own test.
+ */
+describe('with* gate wrappers — run the flag gate before the handler', () => {
+  const request = new NextRequest('http://localhost:3000/api/v1/app/test');
+  const context = { params: Promise.resolve({}) };
 
-    it('returns a 404 NOT_FOUND envelope when the app is disabled', async () => {
-      mockedIsFeatureEnabled.mockResolvedValue(false);
+  type GateWrapper = <C>(
+    handler: (request: NextRequest, context: C) => Promise<Response>
+  ) => (request: NextRequest, context: C) => Promise<Response>;
 
-      const res = await ensureQuestionnairesEnabled();
+  const WRAPPERS: ReadonlyArray<{
+    name: string;
+    wrap: GateWrapper;
+    // Flags that must ALL be on for the gate to allow the handler through.
+    enableFlags: readonly string[];
+    // The flag to turn off (others on) to prove the gate blocks before the handler.
+    blockFlag: string;
+  }> = [
+    {
+      name: 'withQuestionnairesEnabled',
+      wrap: withQuestionnairesEnabled,
+      enableFlags: [APP_QUESTIONNAIRES_FLAG],
+      blockFlag: APP_QUESTIONNAIRES_FLAG,
+    },
+    {
+      name: 'withLiveSessionsEnabled',
+      wrap: withLiveSessionsEnabled,
+      enableFlags: [APP_QUESTIONNAIRES_FLAG, APP_QUESTIONNAIRES_LIVE_SESSIONS_FLAG],
+      blockFlag: APP_QUESTIONNAIRES_LIVE_SESSIONS_FLAG,
+    },
+    {
+      name: 'withVoiceInputEnabled',
+      wrap: withVoiceInputEnabled,
+      enableFlags: [
+        APP_QUESTIONNAIRES_FLAG,
+        APP_QUESTIONNAIRES_LIVE_SESSIONS_FLAG,
+        APP_QUESTIONNAIRES_VOICE_INPUT_FLAG,
+      ],
+      blockFlag: APP_QUESTIONNAIRES_VOICE_INPUT_FLAG,
+    },
+  ];
 
-      expect(res).not.toBeNull();
-      expect(res).toBeInstanceOf(Response);
-      expect(res?.status).toBe(404);
+  for (const { name, wrap, enableFlags, blockFlag } of WRAPPERS) {
+    describe(name, () => {
+      it('calls the handler with the original request + context and forwards its Response when enabled', async () => {
+        setFlags(Object.fromEntries(enableFlags.map((f) => [f, true])));
+        const handlerResponse = new Response('ok');
+        const handler = vi.fn(
+          async (_request: NextRequest, _context: { params: Promise<Record<string, string>> }) =>
+            handlerResponse
+        );
 
-      const body = await res?.json();
-      expect(body).toEqual({
-        success: false,
-        error: { message: 'Not found', code: 'NOT_FOUND' },
+        const result = await wrap(handler)(request, context);
+
+        expect(handler).toHaveBeenCalledTimes(1);
+        expect(handler).toHaveBeenCalledWith(request, context);
+        expect(result).toBe(handlerResponse);
+      });
+
+      it('short-circuits to a 404 and never calls the handler when the gate flag is off', async () => {
+        const flags = Object.fromEntries(enableFlags.map((f) => [f, true]));
+        flags[blockFlag] = false;
+        setFlags(flags);
+        const handler = vi.fn(
+          async (_request: NextRequest, _context: { params: Promise<Record<string, string>> }) =>
+            new Response('ok')
+        );
+
+        const result = await wrap(handler)(request, context);
+
+        expect(handler).not.toHaveBeenCalled();
+        expect(result.status).toBe(404);
       });
     });
-  });
+  }
 });
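Reviewer note (illustrative, not part of the patch): the truth-table tests treat each resolver as a plain AND over the flags in its `requires` list. The sketch below models that idea without Vitest or the real module. `resolves`, `truthTable`, and the short flag names (`MASTER`, `LIVE_SESSIONS`, `VOICE_INPUT`) are hypothetical stand-ins, not identifiers from `@/lib/app/questionnaire/feature-flag`.

```typescript
// Hypothetical model: a resolver is just the list of flag names it ANDs together.
type FlagState = Readonly<Record<string, boolean>>;

function resolves(requires: readonly string[], state: FlagState): boolean {
  // A flag missing from the map counts as off, mirroring `enabled[name] === true` in setFlags.
  return requires.every((name) => state[name] === true);
}

// One row per required flag: that flag off, every other required flag on.
function truthTable(requires: readonly string[]): Array<{ off: string; result: boolean }> {
  return requires.map((off) => {
    const state: Record<string, boolean> = {};
    requires.forEach((name) => {
      state[name] = name !== off;
    });
    return { off, result: resolves(requires, state) };
  });
}

// Placeholder names for a three-way AND such as the voice-input resolver.
const VOICE_REQUIRES = ['MASTER', 'LIVE_SESSIONS', 'VOICE_INPUT'] as const;
console.log(truthTable(VOICE_REQUIRES));
```

Every row should report `result: false`, which is the property the per-flag "is false when ... is off" tests pin for each resolver.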