feat: operator platform lockdown endpoints behind requireOperator() #45

Merged
isuttell merged 4 commits into main from agents/web-admin-lockdown
May 24, 2026

@isuttell isuttell commented May 24, 2026

What

Implements backlog item #3 (docs/ops/project-status.md) and the /v1/web/admin/lockdown line in docs/ops/web-app-todo.md: operator-only platform lockdown endpoints behind a new requireOperator() guard, per ADR 0040 / 0046 / 0057.

  • POST /v1/web/admin/lockdowns — set a reversible lockdown ({ scope: "workspace"|"artifact", target_id, reason_code }).
  • DELETE /v1/web/admin/lockdowns/{scope}/{target_id} — lift it.
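The request shapes of the two endpoints can be sketched as plain validators. This is a dependency-free stand-in, not the PR's actual contract code: the function names `parseSetLockdownBody` and `parseLiftPath` are hypothetical, and the non-empty-string checks on `target_id` and `reason_code` are assumptions.

```typescript
type Scope = "workspace" | "artifact";

interface SetLockdownRequest {
  scope: Scope;
  target_id: string;
  reason_code: string;
}

function isScope(value: unknown): value is Scope {
  return value === "workspace" || value === "artifact";
}

// Validates the POST body; returns null for anything malformed.
function parseSetLockdownBody(body: unknown): SetLockdownRequest | null {
  if (typeof body !== "object" || body === null) return null;
  const { scope, target_id, reason_code } = body as Record<string, unknown>;
  if (!isScope(scope)) return null;
  // Assumption: both identifiers must be non-empty strings.
  if (typeof target_id !== "string" || target_id.length === 0) return null;
  if (typeof reason_code !== "string" || reason_code.length === 0) return null;
  return { scope, target_id, reason_code };
}

const LIFT_PATH = /^\/v1\/web\/admin\/lockdowns\/([^/]+)\/([^/]+)$/;

// Extracts {scope, target_id} from the DELETE path; an unknown scope yields
// null, which a caller would map to the generic 404.
function parseLiftPath(pathname: string): { scope: Scope; target_id: string } | null {
  const match = LIFT_PATH.exec(pathname);
  if (!match) return null;
  const [, scope, targetId] = match;
  if (!isScope(scope) || targetId === undefined) return null;
  return { scope, target_id: targetId };
}
```

Returning null rather than throwing keeps the "invalid lift scope → 404" case on the same code path as every other rejection.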

How

  • New operator auth requirement + resolver accepting exactly two identities: (1) a WorkOS session whose verified email is in OPERATOR_EMAILS (case-folded), or (2) a Cloudflare Access service-token JWT (RS256, aud-checked, common_name required so human Access JWTs are rejected). Every auth failure — including API-key bearers and non-operator emails — collapses to a generic not_found (404), keeping the admin surface non-enumerable (ADR 0046). API keys never reach the resolver because the route declares auth: "operator".
  • New platform actor type across packages/contracts, packages/db, and packages/commands. Migration 0008 widens both actor_type CHECK constraints and creates platform_lockdowns (partial unique index enforcing one effective/un-lifted row per (scope, target_id), RLS enabled + forced, platform-scoped policy on app.platform).
  • setLockdown/liftLockdown run through runCommand under platform scope (audit events platform.lockdown.set / platform.lockdown.lifted). KV denylist wsd:/ad: keys are written on set and deleted on lift, after the Postgres commit, best-effort/fail-open (the lockdown is already durable), matching the cleanup path.
  • Operator principals rate-limit by platform:{id} with no workspace dimension.
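The two accepted identities and the collapse to a generic 404 can be sketched as a pure function. This is a minimal illustration under stated assumptions, not the PR's resolver: `resolveOperator`, `Credential`, and `Resolution` are hypothetical names, session and JWT signature verification (RS256) are assumed to have happened upstream, `OPERATOR_EMAILS` is assumed to be comma-separated, and the principal id is assumed to be the email or the `common_name`.

```typescript
// Hypothetical, already-verified credential shapes.
type Credential =
  | { kind: "workos_session"; verifiedEmail: string }
  | { kind: "cf_access_jwt"; claims: { aud: string[]; common_name?: string } }
  | { kind: "api_key"; key: string }
  | { kind: "none" };

type OperatorPrincipal = { type: "platform"; id: string };

type Resolution =
  | { ok: true; principal: OperatorPrincipal }
  | { ok: false; status: 404; code: "not_found" };

const NOT_FOUND: Resolution = { ok: false, status: 404, code: "not_found" };

// Assumption: the allowlist is a comma-separated string; entries are case-folded.
function parseOperatorEmails(raw: string | undefined): Set<string> {
  return new Set(
    (raw ?? "")
      .split(",")
      .map((entry) => entry.trim().toLowerCase())
      .filter((entry) => entry.length > 0),
  );
}

function resolveOperator(
  cred: Credential,
  env: { OPERATOR_EMAILS?: string; CF_ACCESS_AUD?: string },
): Resolution {
  if (cred.kind === "workos_session") {
    const email = cred.verifiedEmail.trim().toLowerCase();
    return parseOperatorEmails(env.OPERATOR_EMAILS).has(email)
      ? { ok: true, principal: { type: "platform", id: email } }
      : NOT_FOUND;
  }
  if (cred.kind === "cf_access_jwt") {
    // With no audience configured, the service-token path is unavailable.
    if (!env.CF_ACCESS_AUD) return NOT_FOUND;
    if (!cred.claims.aud.includes(env.CF_ACCESS_AUD)) return NOT_FOUND;
    // Human Access JWTs carry no common_name, so they are rejected here.
    const commonName = cred.claims.common_name?.trim().toLowerCase();
    return commonName
      ? { ok: true, principal: { type: "platform", id: commonName } }
      : NOT_FOUND;
  }
  // API-key bearers and missing auth receive the same generic 404.
  return NOT_FOUND;
}
```

Because every rejection returns the same `NOT_FOUND` value, a caller cannot distinguish "wrong credential type" from "not on the allowlist", which is the non-enumerability property the description attributes to ADR 0046.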

Scope notes

  • OPERATOR_EMAILS is treated as a secret (mirrors ADMIN_TOKEN, not added as a wrangler.jsonc var). CF_ACCESS_TEAM_DOMAIN / CF_ACCESS_AUD are added as empty vars placeholders; when unset, the service-token path is simply unavailable, so local/preview without Cloudflare Access still works via the WorkOS-operator path.
  • No GET list endpoint in this slice — deferred follow-up tracked in web-app-todo.md.
  • Pre-launch product: no back-compat shims.

Verification

  • pnpm verify — 70/70 green.
  • pnpm --filter @agent-paste/db db:check — snapshot matches schema.
  • pnpm openapi:check — golden regenerated and green.
  • pnpm smoke:local — green.
  • Tests cover both operator sources (allow), reject paths (API key → 404, non-operator email → 404, missing auth → 404, human Access JWT without common_name → 404, invalid lift scope → 404), idempotent replay, KV put/delete assertions, and repository-level lockdown lifecycle (effective-row uniqueness, lift-of-nonexistent → not_found).
  • Reviewed locally with CodeRabbit (two passes); findings fixed or declined with rationale.

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Operator-only admin endpoints to set and lift platform lockdowns (workspace & artifact), with idempotent behavior, auditable records, and operator authentication via WorkOS or Cloudflare Access.
  • Platform & Security

    • New platform actor type and integrated denylist (KV) writes/removals tied to lockdown lifecycle; operator requests subject to actor rate limits.
  • Database

    • Persistent platform lockdowns with uniqueness for active locks and RLS protections.
  • Tests & Docs

    • Added tests for operator auth and lockdown flows; updated OpenAPI and operational docs.

isuttell and others added 2 commits May 24, 2026 12:21
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
POST /v1/web/admin/lockdowns and DELETE /v1/web/admin/lockdowns/{scope}/{target_id}
set and lift reversible platform lockdowns, per ADR 0040/0046/0057.

- New `operator` auth requirement + resolver accepting exactly two identities:
  a WorkOS session whose verified email is in OPERATOR_EMAILS, or a Cloudflare
  Access service-token JWT (RS256, aud-checked, common_name required). Every
  auth failure (incl. API-key bearer, non-operator email) collapses to a
  generic not_found (404) so the surface stays non-enumerable.
- New `platform` actor type across contracts/db/commands; migration 0008 widens
  both actor_type CHECKs and creates `platform_lockdowns` (partial unique index
  on effective rows, forced RLS via app.platform).
- setLockdown/liftLockdown run through runCommand; KV denylist wsd:/ad: keys are
  written on set and deleted on lift, after the Postgres commit (best-effort).
- Operator principals rate-limit by platform:{id} with no workspace dimension.

GET list endpoint deferred (tracked in web-app-todo.md). pnpm verify 70/70,
db:check + openapi:check green, smoke:local green.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@isuttell isuttell temporarily deployed to pr-preview-45 May 24, 2026 20:11 — with GitHub Actions Inactive
coderabbitai Bot commented May 24, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: 1acda19f-f7fe-466b-9315-f33c556183df

📥 Commits

Reviewing files that changed from the base of the PR and between 5e6060c and 2008f0f.

⛔ Files ignored due to path filters (1)
  • .claude/scheduled_tasks.lock is excluded by !**/*.lock
📒 Files selected for processing (1)
  • packages/db/src/repository/core.ts

Walkthrough

This PR implements reversible platform lockdowns for web admins with operator authentication via WorkOS email allowlist or Cloudflare Access service tokens. It adds Zod contracts and OpenAPI endpoints, a new platform_lockdowns table and migration, repository methods (setLockdown, liftLockdown) with idempotency and operation events, Drizzle queries and local in-memory support, API handlers that update a denylist KV (best-effort), operator principal and rate-limit handling, and tests plus wrangler env var and docs updates.

Sequence Diagram

sequenceDiagram
  participant Operator
  participant APIAuth as API Auth Resolver
  participant Handler as Lockdown Handler
  participant Repo as Repository
  participant DB as PostgreSQL
  participant KV as Worker KV

  Operator->>APIAuth: Request (WorkOS session or Cf-Access JWT)
  APIAuth->>APIAuth: getOperatorEmails / verifyCfAccessServiceToken
  APIAuth-->>Handler: OperatorPrincipal or not_found

  Handler->>Handler: Parse and validate request
  Handler->>Repo: setLockdown(actor, idempotencyKey, scope, targetId, reason)
  Repo->>DB: findEffective(scope,targetId)
  alt exists
    Repo-->>Handler: return existing LockdownDetail
  else
    Repo->>DB: insert platform_lockdown row
    Repo->>DB: insert operation_events (actor=platform)
    Repo-->>Handler: return new LockdownDetail
  end

  Handler->>KV: write denylist key (best-effort)
  KV-->>Handler: success or logged failure
  Handler-->>Operator: 200 LockdownDetail
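
The last step of the diagram, the best-effort denylist write after the Postgres commit, might look roughly like the sketch below. It is illustrative only: `DenylistKV` is a minimal structural type covering just `put` and `delete`, the helper names and the stored value `"1"` are assumptions, and mapping `wsd:` to workspaces and `ad:` to artifacts is inferred from the prefixes rather than confirmed by the PR.

```typescript
// Minimal structural type for the KV binding (only the calls used here).
interface DenylistKV {
  put(key: string, value: string): Promise<void>;
  delete(key: string): Promise<void>;
}

type Scope = "workspace" | "artifact";

// Assumption: wsd: keys deny workspaces, ad: keys deny artifacts.
function denylistKey(scope: Scope, targetId: string): string {
  return `${scope === "workspace" ? "wsd:" : "ad:"}${targetId}`;
}

// Best-effort write: a KV failure is logged and swallowed, because the
// lockdown row is already durable in Postgres at this point.
async function writeDenylistEntry(
  kv: DenylistKV | undefined,
  scope: Scope,
  targetId: string,
  log: (message: string) => void = console.warn,
): Promise<boolean> {
  if (!kv) return false;
  const key = denylistKey(scope, targetId);
  try {
    await kv.put(key, "1");
    return true;
  } catch (err) {
    log(`denylist put failed for ${key}: ${String(err)}`);
    return false;
  }
}

// Mirror image for lift: remove the key, again without failing the request.
async function deleteDenylistEntry(
  kv: DenylistKV | undefined,
  scope: Scope,
  targetId: string,
  log: (message: string) => void = console.warn,
): Promise<boolean> {
  if (!kv) return false;
  const key = denylistKey(scope, targetId);
  try {
    await kv.delete(key);
    return true;
  } catch (err) {
    log(`denylist delete failed for ${key}: ${String(err)}`);
    return false;
  }
}
```

The boolean return lets a handler record whether the cache layer is in sync without turning a KV outage into a failed lockdown.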

Possibly related PRs

  • zaks-io/agent-paste#4: Shares the API mutation idempotency/runIdempotent execution path used by the lockdown handlers.
  • zaks-io/agent-paste#41: Related to principal/rate-limit and worker auth infrastructure extended here for operator principals.
  • zaks-io/agent-paste#17: Schema/snapshot changes around platform_lockdowns and schema snapshot comparisons.

"🐰
I nibbled keys in KV night and day,
Operators hop to keep trouble at bay,
Rows set, then lifted, in tidy array,
A rabbit cheers for locks that safely sway."

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 13.51% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title directly and clearly describes the main change: adding operator-only platform lockdown endpoints with the operator authentication requirement. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

ESLint skipped: no ESLint configuration detected in root package.json. To enable, add eslint to devDependencies.


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
packages/contracts/openapi/api.json (1)

1085-1100: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add the new platform lockdown action values to both event enums.

The PR introduces platform.lockdown.set and platform.lockdown.lifted, but both OperationEvent.action enums still exclude them. Any client or schema validation generated from this OpenAPI doc will reject valid audit responses once platform lockdown events are returned.

🧩 Proposed fix
           "action": {
             "type": "string",
             "enum": [
               "workspace.created",
               "api_key.created",
               "api_key.revoked",
               "upload_session.created",
               "upload_session.finalized",
               "upload_session.expired",
               "upload_session.failed",
               "artifact.published",
               "artifact.deleted",
               "artifact.expired",
+              "platform.lockdown.set",
+              "platform.lockdown.lifted",
               "cleanup.run",
               "admin.destructive_operation"
             ]
           },

Apply the same addition in both OperationEvent and OperationEventListResponse.

Also applies to: 1179-1194
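
To illustrate why the missing enum values matter, here is a small sketch of the membership check a generated client might perform. The arrays copy the values shown in the proposed fix; `isAllowedAction` is a hypothetical helper, not code from the repository or from any particular generator.

```typescript
// Enum values as listed in the snippet above, before the fix.
const ACTIONS_BEFORE = [
  "workspace.created",
  "api_key.created",
  "api_key.revoked",
  "upload_session.created",
  "upload_session.finalized",
  "upload_session.expired",
  "upload_session.failed",
  "artifact.published",
  "artifact.deleted",
  "artifact.expired",
  "cleanup.run",
  "admin.destructive_operation",
] as const;

// The same enum with the two lockdown actions added.
const ACTIONS_AFTER = [
  ...ACTIONS_BEFORE,
  "platform.lockdown.set",
  "platform.lockdown.lifted",
] as const;

// Hypothetical stand-in for the enum check a schema validator performs.
function isAllowedAction(enumValues: readonly string[], action: string): boolean {
  return enumValues.includes(action);
}
```

With the old list, an audit event whose action is `platform.lockdown.set` fails the check even though the server emitted it legitimately; with the extended list it passes.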

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/contracts/openapi/api.json` around lines 1085 - 1100, OperationEvent
and OperationEventListResponse currently omit the new actions, causing schema
validation failures; update the "action" enum for both OperationEvent and
OperationEventListResponse to include "platform.lockdown.set" and
"platform.lockdown.lifted" so generated clients accept those audit events—locate
the "action" enum within the OperationEvent definition and the corresponding
"action" enum in OperationEventListResponse and append the two new string
values.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@apps/api/src/index.test.ts`:
- Around line 1426-1428: The test stub for liftLockdown is missing an assertion
for the caller identity; update the expectation in the async liftLockdown(input)
mock to include the actor field (e.g. expect(input).toMatchObject({ scope:
"artifact", targetId: "art_9", idempotencyKey: "lift-1", actor: /* expected
actor object or id */ })), and ensure the mock still returns lockdownDetail(...)
as before so the test verifies the platform actor is passed through to
liftLockdown.

In `@apps/api/src/index.ts`:
- Around line 57-61: The KVNamespace type currently defines delete? as optional
which can prevent DENYLIST rollback; make delete required by removing the
optional marker so the signature is delete(key: string): Promise<void>; update
any other occurrences of the optional delete definition (the other KVNamespace
declarations mentioned) to the same required signature and ensure callers assume
delete exists (or handle absence at type-check time) so denylist entries are
always removable during lockdown lift.
- Around line 979-981: The returned commonName is not normalized, so trim and
lowercase it before returning to ensure stable operator identity; locate the
block that checks the commonName variable (the `if (commonName) { return
commonName; }` branch) and change it to return a normalized value using
commonName.trim().toLowerCase() so Cloudflare operator IDs are consistent across
audit/idempotency/rate-limit paths.

In `@apps/api/src/operator.ts`:
- Around line 48-50: Normalize the extracted payload.common_name before
returning it as the operator identity: when reading payload.common_name into
commonName, trim whitespace and convert to lowercase, then validate that the
trimmed lowercase string has length > 0 and return that normalized value
(otherwise return null); update the logic around the commonName variable in the
function that currently returns commonName (referencing payload.common_name and
commonName) so all callers use the normalized identity.

In `@packages/db/src/index.test.ts`:
- Around line 439-445: Add an assertion that the lifted event contains the same
actor metadata as the set event: after computing liftEvents (from
repo.operationEvents.values() filtered by action ===
"platform.lockdown.lifted"), add an expectation like
expect(liftEvents[0]).toMatchObject({ actor_type: "platform", actor_id:
"operator@example.com" }) to mirror the existing checks on setEvents (ensure you
reference the liftEvents variable and its first element).

In `@packages/db/src/repository/core.ts`:
- Around line 569-584: Concurrent requests can race between the existing check
and insert in setLockdown causing a unique-constraint error; wrap the await
entities.platformLockdowns.insert(...) in a try/catch, detect the
unique-conflict (DB constraint) on insert, and on that error re-query
entities.platformLockdowns.findEffective(input.scope, input.targetId) and return
toLockdownDetail(existing) if found, otherwise rethrow the error; reference
symbols: setLockdown, entities.platformLockdowns.insert,
entities.platformLockdowns.findEffective, toLockdownDetail.
- Around line 620-631: The code always emits an audit event and returns success
after calling entities.platformLockdowns.markLifted even though markLifted may
no-op under a race; change liftLockdown to inspect the result of
entities.platformLockdowns.markLifted (e.g., returned row/count/boolean) and
only call entities.operationEvents.insert and return toLockdownDetail when
markLifted actually updated a row; if markLifted did not update anything, return
an appropriate not-modified/error response (or re-fetch existing state) instead
of emitting the lifted event. Ensure you update the conditional logic around
entities.platformLockdowns.markLifted, entities.operationEvents.insert, and
toLockdownDetail so the audit event and success response are produced only on a
real state change.

---

Outside diff comments:
In `@packages/contracts/openapi/api.json`:
- Around line 1085-1100: OperationEvent and OperationEventListResponse currently
omit the new actions, causing schema validation failures; update the "action"
enum for both OperationEvent and OperationEventListResponse to include
"platform.lockdown.set" and "platform.lockdown.lifted" so generated clients
accept those audit events—locate the "action" enum within the OperationEvent
definition and the corresponding "action" enum in OperationEventListResponse and
append the two new string values.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: b5fba9df-ba62-41de-92b5-fa6a81f45b34

📥 Commits

Reviewing files that changed from the base of the PR and between 0ba041d and e3ded92.

📒 Files selected for processing (35)
  • apps/api/src/index.test.ts
  • apps/api/src/index.ts
  • apps/api/src/operator.test.ts
  • apps/api/src/operator.ts
  • apps/api/wrangler.jsonc
  • docs/ops/project-status.md
  • docs/ops/web-app-todo.md
  • packages/commands/src/index.ts
  • packages/contracts/openapi/api.json
  • packages/contracts/src/enums.ts
  • packages/contracts/src/index.ts
  • packages/contracts/src/lockdown.ts
  • packages/contracts/src/mvp-contracts.test.ts
  • packages/contracts/src/openapi/api.ts
  • packages/contracts/src/openapi/shared.ts
  • packages/contracts/src/routes.ts
  • packages/db/migrations/0008_platform_actor_lockdowns.sql
  • packages/db/snapshot/schema.sql
  • packages/db/src/index.test.ts
  • packages/db/src/index.ts
  • packages/db/src/local-repository.ts
  • packages/db/src/queries/index.ts
  • packages/db/src/queries/operation-events.ts
  • packages/db/src/queries/platform-lockdowns.ts
  • packages/db/src/repository/core.ts
  • packages/db/src/repository/interface.ts
  • packages/db/src/repository/local-entities.ts
  • packages/db/src/repository/local-state.ts
  • packages/db/src/repository/ports.ts
  • packages/db/src/repository/postgres-entities.ts
  • packages/db/src/schema.ts
  • packages/db/src/types.ts
  • packages/worker-runtime/src/index.ts
  • packages/worker-runtime/src/principal.ts
  • packages/worker-runtime/src/rate-limit.ts

Address CodeRabbit review on PR #45:
- Make KVNamespace.delete required so lockdown reversibility cannot be
  silently dropped; deleteDenylistEntry guards on env.DENYLIST presence.
- platformLockdowns.insert returns a boolean via ON CONFLICT DO NOTHING
  RETURNING; setLockdown treats a lost partial-unique race as a replay
  instead of aborting the transaction.
- platformLockdowns.markLifted returns a boolean via RETURNING; liftLockdown
  throws not_found (and emits no audit event) when it loses a lift race.
- Normalize Cf-Access common_name (trim + lowercase) before use.
- Assert platform actor on the lift handler and lifted audit event.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
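
The lift-race handling described in this commit can be sketched with an in-memory stand-in for the Postgres table. Everything here is illustrative: `LiftStore`, `InMemoryLiftStore`, `NotFoundError`, and the event shape are hypothetical, and `markLifted` only imitates an `UPDATE ... RETURNING` that reports whether a row actually changed.

```typescript
type Scope = "workspace" | "artifact";

interface LockdownRow {
  id: string;
  scope: Scope;
  targetId: string;
  liftedAt: Date | null;
}

interface AuditEvent {
  action: "platform.lockdown.lifted";
  actorType: "platform";
  actorId: string;
}

class NotFoundError extends Error {
  readonly code = "not_found";
}

// Hypothetical port; the real repository talks to Postgres.
interface LiftStore {
  findEffective(scope: Scope, targetId: string): LockdownRow | undefined;
  // True only when an un-lifted row was actually updated.
  markLifted(id: string, at: Date): boolean;
}

class InMemoryLiftStore implements LiftStore {
  constructor(private rows: LockdownRow[] = []) {}

  findEffective(scope: Scope, targetId: string): LockdownRow | undefined {
    return this.rows.find(
      (r) => r.scope === scope && r.targetId === targetId && r.liftedAt === null,
    );
  }

  markLifted(id: string, at: Date): boolean {
    const row = this.rows.find((r) => r.id === id && r.liftedAt === null);
    if (!row) return false;
    row.liftedAt = at;
    return true;
  }
}

function liftLockdown(
  store: LiftStore,
  events: AuditEvent[],
  input: { scope: Scope; targetId: string; actorId: string },
): LockdownRow {
  const existing = store.findEffective(input.scope, input.targetId);
  if (!existing) throw new NotFoundError("lockdown not found");
  // A concurrent lift may have won between the read and the update.
  if (!store.markLifted(existing.id, new Date())) {
    throw new NotFoundError("lockdown not found");
  }
  // The audit event is emitted only after a real state change.
  events.push({
    action: "platform.lockdown.lifted",
    actorType: "platform",
    actorId: input.actorId,
  });
  return existing;
}
```

Gating the audit event on the boolean keeps the event log at one `lifted` entry per lockdown even when two operators lift the same target at once.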
@isuttell isuttell temporarily deployed to pr-preview-45 May 24, 2026 20:38 — with GitHub Actions Inactive

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/db/src/repository/core.ts`:
- Around line 585-591: When entities.platformLockdowns.insert(lockdown) returns
false and entities.platformLockdowns.findEffective(input.scope, input.targetId)
also returns no winner, do not proceed to emit the "platform.lockdown.set" event
or return a success; instead surface an error/failed response so we don't report
a lockdown that wasn't persisted. Update the control flow in the block handling
inserted (around entities.platformLockdowns.insert, findEffective and
toLockdownDetail) to explicitly throw or return an error when inserted === false
and winner == null, and only emit the platform.lockdown.set event and a success
response when insert succeeded or a valid winner was found and converted via
toLockdownDetail.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: 4a6eae70-f545-4c23-b059-831b3b706263

📥 Commits

Reviewing files that changed from the base of the PR and between e3ded92 and 5e6060c.

📒 Files selected for processing (8)
  • apps/api/src/index.test.ts
  • apps/api/src/index.ts
  • apps/api/src/operator.ts
  • packages/db/src/index.test.ts
  • packages/db/src/queries/platform-lockdowns.ts
  • packages/db/src/repository/core.ts
  • packages/db/src/repository/local-entities.ts
  • packages/db/src/repository/ports.ts

CodeRabbit re-review: when setLockdown's insert is rejected by the partial
unique index but no effective row can be found, throw instead of emitting a
misleading platform.lockdown.set audit event for a row that was never written.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
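
The resulting `setLockdown` control flow, as described across the two review-fix commits, can be sketched as follows. This is an in-memory illustration under assumptions, not the repository code: `LockdownStore` and `InMemoryLockdownStore` are hypothetical, `insert` merely imitates `INSERT ... ON CONFLICT DO NOTHING RETURNING`, the caller supplies the row id, and the audit log is reduced to a list of action names.

```typescript
type Scope = "workspace" | "artifact";

interface LockdownRow {
  id: string;
  scope: Scope;
  targetId: string;
  reasonCode: string;
  liftedAt: Date | null;
}

// Hypothetical port; the real repository talks to Postgres.
interface LockdownStore {
  findEffective(scope: Scope, targetId: string): LockdownRow | undefined;
  // True when a row was written, false when the unique index rejected it.
  insert(row: LockdownRow): boolean;
}

class InMemoryLockdownStore implements LockdownStore {
  private rows: LockdownRow[] = [];

  findEffective(scope: Scope, targetId: string): LockdownRow | undefined {
    return this.rows.find(
      (r) => r.scope === scope && r.targetId === targetId && r.liftedAt === null,
    );
  }

  insert(row: LockdownRow): boolean {
    if (this.findEffective(row.scope, row.targetId)) return false;
    this.rows.push(row);
    return true;
  }
}

function setLockdown(
  store: LockdownStore,
  auditLog: string[],
  input: { id: string; scope: Scope; targetId: string; reasonCode: string },
): LockdownRow {
  // An existing effective row is returned as-is, with no new audit event.
  const existing = store.findEffective(input.scope, input.targetId);
  if (existing) return existing;

  const row: LockdownRow = { ...input, liftedAt: null };
  if (!store.insert(row)) {
    // Lost the race: treat the winner's row as a replay.
    const winner = store.findEffective(input.scope, input.targetId);
    if (!winner) {
      // Rejected insert with no effective row: fail rather than report success.
      throw new Error("lockdown insert rejected but no effective row was found");
    }
    return winner;
  }

  auditLog.push("platform.lockdown.set");
  return row;
}
```

The audit entry is written on exactly one path, the one where this call's own insert succeeded, so neither a replay nor the rejected-without-winner case can produce a `platform.lockdown.set` event for a row that was never written.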
@isuttell isuttell temporarily deployed to pr-preview-45 May 24, 2026 20:44 — with GitHub Actions Inactive
@isuttell isuttell merged commit dac72d1 into main May 24, 2026
4 checks passed
@isuttell isuttell deleted the agents/web-admin-lockdown branch May 24, 2026 20:47
@github-actions

agent-paste PR preview resources were cleaned up. The pr-preview-${context.issue.number} environment is left in place; remove it from the GitHub UI if desired.

isuttell added a commit that referenced this pull request May 24, 2026
Snapshot now points at ad85175 (#46); record the operator lockdown (#45) and
web loader-wiring (#46) merges in Recently Completed; strike backlog #4 as done;
update the Phase 3 (~55%), web.md, ADR 0055 rows. Remaining Phase 3 code work is
CLI login (#5) and smoke:web (#6), both gated on the WorkOS/Access click-ops in
backlog item #1.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>