AP-104: Ephemeral moderation, 24h auto-deletion, and noindex #155

Merged: isuttell merged 5 commits into main from cursor/ephemeral-moderation-noindex-3595 on Jun 2, 2026

@isuttell isuttell commented Jun 2, 2026


Summary

Implements the ephemeral-tier anti-abuse slice: shortest auto-deletion, stronger async moderation, and crawler blocking for unclaimed workspaces.

Changes

  • Auto deletion (24h): Export EPHEMERAL_AUTO_DELETION_DAYS from packages/config; cap ephemeral upload TTL to one day; on publish, set artifacts.expires_at from workspace.auto_deletion_days so the existing hourly sweep expires ephemeral artifacts on schedule.
  • Safety scanner (ephemeral_tier): Ephemeral publishes enqueue scanner_id=ephemeral_tier (claimed tiers keep builtin_content). The jobs app runs Llama Guard 3 on text, keeps the built-in heuristics, and adds an advisory script_present_unclaimed warning. An async Cloudflare URL Scanner call checks the public agent-view URL; a malicious verdict creates an artifact-scoped Platform Lockdown and a denylist entry.
  • Noindex: Ephemeral content tokens carry noindex: true; the content worker sets X-Robots-Tag: noindex, nofollow and injects a robots meta tag into served HTML.
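
As a rough illustration of the auto-deletion bullet above, the expiry arithmetic might look like the sketch below. The helper names (`computeExpiresAt`, `capEphemeralUploadTtl`, `isExpired`) are hypothetical and not taken from the diff; only `EPHEMERAL_AUTO_DELETION_DAYS` and the one-day value come from this PR, and the constant is redeclared here so the snippet stands alone.

```typescript
// Assumption: one day, matching the "24h" stated in this PR. The real constant
// is exported from packages/config; this local copy is for illustration only.
const EPHEMERAL_AUTO_DELETION_DAYS = 1;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

/** Hypothetical helper: derive artifacts.expires_at from workspace.auto_deletion_days. */
function computeExpiresAt(publishedAt: Date, autoDeletionDays: number): Date {
  if (!Number.isFinite(autoDeletionDays) || autoDeletionDays <= 0) {
    throw new RangeError("autoDeletionDays must be a positive number");
  }
  return new Date(publishedAt.getTime() + autoDeletionDays * MS_PER_DAY);
}

/** Hypothetical helper: clamp a requested upload TTL (in seconds) to the ephemeral cap. */
function capEphemeralUploadTtl(requestedTtlSeconds: number): number {
  const capSeconds = EPHEMERAL_AUTO_DELETION_DAYS * 24 * 60 * 60;
  return Math.min(Math.max(requestedTtlSeconds, 0), capSeconds);
}

/** The comparison an hourly sweep would plausibly make for each artifact. */
function isExpired(expiresAt: Date, now: Date): boolean {
  return expiresAt.getTime() <= now.getTime();
}
```

Because the sweep is hourly, an artifact could remain readable for up to roughly an hour past `expires_at` unless reads also check the timestamp; the PR description does not say which applies.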

Ops / secrets

Configure on jobs (documented in apps/jobs/README.md):

  • Workers AI binding AI (wrangler)
  • URL_SCANNER_API_TOKEN and CLOUDFLARE_ACCOUNT_ID via wrangler secret put
  • API_BASE_URL for URL Scanner target URLs

Verification

  • pnpm verify — pass
  • pnpm test:coverage — pass (branch coverage ≥ 80%)
  • pnpm smoke:local — fails in this environment with database_unavailable during CLI publish (harness provision succeeds but publish cannot reach DB; likely environment/harness, not introduced by this diff)

Linear Issue: AP-104


Summary by CodeRabbit

Release Notes

  • New Features
    • Introduced ephemeral content tier with automatic one-day auto-deletion for temporary artifacts.
    • Added search engine indexing prevention for ephemeral content through robots meta tags and response headers.
    • Enabled enhanced safety scanner for ephemeral-tier content including malicious URL detection.
    • Implemented artifact lockdown and denylist protection for detected malicious URLs.

- Route ephemeral publishes to scanner_id ephemeral_tier with Llama Guard 3,
  script-present advisory warnings, and async Cloudflare URL Scanner verdicts
- Apply artifact Platform Lockdown plus denylist when URL Scanner is malicious
- Cap ephemeral upload TTL and set publish expiry from workspace auto_deletion_days
- Mint content tokens with noindex and emit X-Robots-Tag plus HTML meta on responses
- Export EPHEMERAL_AUTO_DELETION_DAYS from packages/config (24h)

linear-code Bot commented Jun 2, 2026

AP-104 Ephemeral-tier moderation, short Auto Deletion, and noindex

Parent: AP-99. Blocked by AP-100 (tier state). Builds on the existing Safety Scanner seam — no new scanner machinery.

Outcome

Ephemeral content gets the strictest treatment: shortest Auto Deletion, noindex/nofollow, and stronger moderation (Workers AI Llama Guard 3 on text + async Cloudflare URL Scanner verdict on the published URL) plugged in under a new scanner_id. A malicious verdict drives Platform Lockdown.

Context docs

  • docs/specs/ephemeral-publish.md (Anti-Abuse Stack)
  • docs/adr/0075-...md, docs/adr/0051-safety-scanner-lifecycle.md (scanner seam)
  • docs/adr/0056-mvp-usage-policy-defaults-and-platform-caps.md (row 20: ephemeral Auto Deletion)
  • docs/adr/0048-transient-artifacts-by-default.md (Auto Deletion), 0040 (Platform Lockdown), 0032 (jobs topology)

Likely files / packages

  • apps/jobs (safety-scan consumer: ephemeral scanner_id rules; URL Scanner call)
  • packages/config (ephemeral Auto Deletion value)
  • apps/content and/or renderer (noindex/nofollow headers + meta on ephemeral responses)
  • Workers AI binding + Cloudflare URL Scanner API token wiring

In scope

  • Ephemeral scanner_id rule set running Llama Guard 3 (text) at scan time.
  • Async URL Scanner verdict on the published ephemeral URL; malicious -> Platform Lockdown (ADR 0040).
  • Shortest Auto Deletion for ephemeral tier (24h) honored by the existing jobs sweep.
  • noindex/nofollow (header + meta) on ephemeral content responses.
  • Optional: advisory "script present, dormant until claimed" Safety Warning (the bit from AP-102).
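
For the noindex/nofollow item above, a minimal sketch of the header plus meta-tag treatment could look like this. The `ContentTokenClaims` shape and both function names are assumptions for illustration; the thread only states that ephemeral content tokens carry `noindex: true` and that the content worker sets `X-Robots-Tag: noindex, nofollow` and injects a robots meta tag.

```typescript
const ROBOTS_VALUE = "noindex, nofollow";

/** Hypothetical claim shape: only the `noindex` flag is mentioned in the PR. */
interface ContentTokenClaims {
  noindex?: boolean;
}

/** Returns a copy of the headers, adding X-Robots-Tag when the token requests it. */
function withRobotsHeader(
  headers: Record<string, string>,
  claims: ContentTokenClaims,
): Record<string, string> {
  if (claims.noindex !== true) return { ...headers };
  return { ...headers, "X-Robots-Tag": ROBOTS_VALUE };
}

/** Inserts a robots meta tag right after the opening <head> tag, if one exists. */
function injectRobotsMeta(html: string, claims: ContentTokenClaims): string {
  if (claims.noindex !== true) return html;
  const meta = `<meta name="robots" content="${ROBOTS_VALUE}">`;
  // Matches <head> or <head ...attrs>, but not <header>.
  const headOpen = /<head(\s[^>]*)?>/i;
  if (headOpen.test(html)) {
    return html.replace(headOpen, (tag) => `${tag}${meta}`);
  }
  // No <head> element: prepend the tag so lenient parsers still encounter it.
  return `${meta}${html}`;
}
```

Sending both the header and the meta tag is redundant by design: the header covers non-HTML responses, while the meta tag survives if the HTML is saved or proxied without its headers.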

Out of scope

  • Replacing the claimed-tier scanner (AP-33 covers general scanner integration).
  • Read rate limiting (unchanged).
  • Image moderation — see Resolved note below; text-only for this slice.

Acceptance criteria

  • Ephemeral artifacts carry the shortest Auto Deletion (24h) and are swept on schedule.
  • Ephemeral responses carry noindex/nofollow.
  • Llama Guard 3 runs on ephemeral text under its own scanner_id (advisory, REPLACE-on-scan per ADR 0051); existing claimed-tier warnings untouched.
  • A malicious URL Scanner verdict triggers Platform Lockdown and stops link resolution.
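
The acceptance criteria above imply two small decisions: which scanner_id a publish is routed to, and which enforcement actions follow a URL Scanner verdict. The sketch below models them as pure functions. The function names and the `"clean"` verdict label are assumptions; the thread itself only names the `ephemeral_tier` and `builtin_content` scanner ids and the `malicious` and `unknown` verdicts.

```typescript
type ScannerId = "ephemeral_tier" | "builtin_content";
// "clean" is an assumed label for a finished, non-malicious scan.
type UrlVerdict = "malicious" | "clean" | "unknown";
type EnforcementAction = "platform_lockdown" | "denylist";

/** Ephemeral publishes get their own scanner_id; claimed tiers keep the built-in one. */
function resolveScannerId(ephemeralTier: boolean): ScannerId {
  return ephemeralTier ? "ephemeral_tier" : "builtin_content";
}

/**
 * Moderation stays advisory: only a malicious verdict on the ephemeral tier
 * escalates, and an unknown verdict never does.
 */
function enforcementFor(scannerId: ScannerId, verdict: UrlVerdict): EnforcementAction[] {
  if (scannerId !== "ephemeral_tier") return [];
  return verdict === "malicious" ? ["platform_lockdown", "denylist"] : [];
}
```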

Required checks

pnpm verify, pnpm test:coverage, pnpm smoke:local.

Security / operational invariants

  • Moderation is advisory -> lockdown, never a hard publish block beyond the lockdown path.
  • Workers AI / URL Scanner API tokens are runtime secrets (wrangler secret put), never committed.
  • Scanner failures stay quiet per ADR 0051 (DLQ alert), do not block.

Dependencies / blockers

Blocked by AP-100. (Was needs-info — both resolved: ephemeral Auto Deletion = 24h (docs/adr/0056 row 20); text-only moderation this slice — Llama Guard 3 text is confirmed available, image moderation is out of scope and deferred until a Workers AI image classifier is confirmed in the catalog.)

Predicted file footprint / overlap

apps/jobs + packages/config + apps/content headers. Low overlap with the schema/route slices; the packages/config value overlaps slice 5's config edits — coordinate the config touch.


coderabbitai Bot commented Jun 2, 2026


Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 9be5a488-329d-4ca7-ab6d-bcff7ebd1d79

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


@isuttell isuttell marked this pull request as ready for review June 2, 2026 01:56

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (3)
apps/jobs/src/safety/ephemeral-scanner.ts (1)

10-10: ⚡ Quick win

Update Llama Guard parsing: Workers AI returns { response: string } only; safe===false is unsupported

@cf/meta/llama-guard-3-8b is a valid Workers AI model id, and env.AI.run(...) returns generated text in { response: string }. The docs describe Llama Guard as emitting plain text indicating safe vs unsafe (and category labels like S1, S2, … when unsafe), with no documented structured boolean field (so response.safe === false is likely dead unless you enable structured/JSON output). The current string includes("unsafe") path should be the one to rely on (or parse the category labels from the returned text).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/jobs/src/safety/ephemeral-scanner.ts` at line 10, The code currently
assumes the Llama Guard output has a structured boolean (e.g., checking
response.safe), but Workers AI returns plain text in { response: string } for
model LLAMA_GUARD_MODEL; update the logic that calls env.AI.run(...) (the
LLAMA_GUARD_MODEL call) to read the returned object's response string (e.g.,
result.response) and determine safety by checking the text (use
response.includes("unsafe") or parse category tokens like "S1","S2", etc.)
instead of relying on a response.safe boolean; remove or guard any branches that
expect response.safe === false and ensure downstream variables/flags are set
based on the string check.
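
A text-based parse along the lines this comment suggests might look like the sketch below. It operates on the string only and does not call Workers AI; `parseLlamaGuardText` and `GuardVerdict` are hypothetical names. The output format assumed here ("safe" or "unsafe" on the first non-empty line, with category labels such as S1 following when unsafe) is taken from the comment above, not verified against the model.

```typescript
interface GuardVerdict {
  unsafe: boolean;
  categories: string[];
}

/**
 * Parses plain-text Llama Guard output. Empty or unrecognized text is treated
 * as not unsafe, which fits an advisory scanner but is a policy choice.
 */
function parseLlamaGuardText(text: string): GuardVerdict {
  const lines = text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
  const firstLine = (lines[0] ?? "").toLowerCase();
  if (!firstLine.startsWith("unsafe")) {
    return { unsafe: false, categories: [] };
  }
  // Collect category labels (S1, S2, ... S14), de-duplicated in order of appearance.
  const matches = text.toUpperCase().match(/\bS\d{1,2}\b/g) ?? [];
  return { unsafe: true, categories: Array.from(new Set(matches)) };
}
```

Checking the first line rather than using `includes("unsafe")` on the whole response avoids a false positive if the model ever echoes the word inside a longer explanation.
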
apps/jobs/src/handlers/safety-scan.ts (1)

98-105: ⚡ Quick win

Ephemeral enforcement runs synchronously in the consumer and re-runs on any non-quiet failure.

scanPublishedUrlMalicious swallows its own errors, but mintAgentViewUrl, verifyAgentViewToken, and applyMaliciousUrlLockdown are not wrapped. If any throws, it bubbles to the batch catch and triggers message.retry(), which re-executes scanner.scan (the paid Llama Guard text classification) and the full URL-scan/poll on the next delivery even though warnings were already written via REPLACE-on-scan. Combined with the synchronous poll budget, each ephemeral message can block the consumer for several seconds.

Consider isolating the URL-scan/lockdown step in its own try/catch (logging on failure) so enforcement failures are decoupled from the warning-write path, and decide explicitly whether a lockdown failure should retry the whole message or fail-quiet per ADR 0051.

Also applies to: 329-343

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/jobs/src/handlers/safety-scan.ts` around lines 98 - 105, The ephemeral
enforcement path currently allows exceptions from mintAgentViewUrl,
verifyAgentViewToken, applyMaliciousUrlLockdown (called inside
runEphemeralUrlScanner / scanPublishedUrlMalicious) to bubble up and trigger
message.retry() causing unnecessary re-runs of scanner.scan; wrap the entire
URL-scan/lockdown sequence — specifically the calls to
scanPublishedUrlMalicious, mintAgentViewUrl, verifyAgentViewToken, and
applyMaliciousUrlLockdown invoked by runEphemeralUrlScanner — in its own
try/catch so failures are logged (with context) and do not abort the main
warning/write path; decide and implement the retry policy per ADR 0051 (either
fail-quiet or requeue) inside that catch and ensure any non-retryable failures
do not rethrow to the batch consumer.
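
One way to decouple enforcement from the warning-write path, as this comment proposes, is a small wrapper that never rethrows. The sketch below implements the fail-quiet option only; whether a lockdown failure should instead requeue is the open policy question raised above. `runEnforcementQuietly`, `Logger`, and `EnforcementOutcome` are hypothetical names, and the `enforce` callback stands in for the mint/verify/scan/lockdown sequence.

```typescript
type EnforcementOutcome = "applied" | "skipped" | "failed";

interface Logger {
  warn(message: string, context: Record<string, unknown>): void;
}

/**
 * Runs the URL-scan/lockdown step in isolation. Any exception is logged and
 * converted to "failed", so the caller can ack the message instead of retrying
 * the already-completed (and paid) text classification.
 */
async function runEnforcementQuietly(
  enforce: () => Promise<boolean>,
  log: Logger,
  context: Record<string, unknown>,
): Promise<EnforcementOutcome> {
  try {
    const applied = await enforce();
    return applied ? "applied" : "skipped";
  } catch (error) {
    log.warn("Ephemeral URL enforcement failed after warnings were written.", {
      ...context,
      error: error instanceof Error ? error.message : String(error),
    });
    return "failed";
  }
}
```

Returning an outcome rather than `void` leaves room for the consumer to count failures or route them to a separate retry queue later without changing the call site.
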
apps/api/src/routes/revisions.ts (1)

95-113: 💤 Low value

Consider extracting the ephemeral_tier check to avoid duplication.

The same ephemeral tier extraction logic appears twice (lines 95-99 and 109-113). While the defensive type guards are appropriate, extracting this to a local variable would improve maintainability.

♻️ Suggested refactor
       const bundleStatus = bundleStatusFromPublishResult(result);
+      const ephemeralTier =
+        result !== null &&
+        typeof result === "object" &&
+        "ephemeral_tier" in result &&
+        result.ephemeral_tier === true;
       try {
         await enqueuePostPublishJobs(context.env, {
           workspaceId: actor.workspace_id,
           artifactId: params.artifactId ?? "",
           revisionId: params.revisionId ?? "",
           bundleStatus: bundleStatus === "pending" ? "pending" : "disabled",
           requestedAt: now,
-          ephemeralTier:
-            result !== null &&
-            typeof result === "object" &&
-            "ephemeral_tier" in result &&
-            result.ephemeral_tier === true,
+          ephemeralTier,
         });
       } catch (error) {
         console.warn("Post-publish job enqueue failed after publish; revision remains published.", {
           artifactId: params.artifactId ?? "",
           revisionId: params.revisionId ?? "",
           bundleStatus,
           error: error instanceof Error ? error.message : String(error),
         });
       }
       const signed = await signPublishResult(result, context.env, {
         workspaceId: actor.workspace_id,
-        ephemeralTier:
-          result !== null && typeof result === "object" && "ephemeral_tier" in result && result.ephemeral_tier === true,
+        ephemeralTier,
       });
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/api/src/routes/revisions.ts` around lines 95 - 113, The duplicated
defensive check for result.ephemeral_tier should be extracted to a single local
boolean and reused; create a const (e.g., ephemeralTier) computed once with the
existing guards (result !== null && typeof result === "object" &&
"ephemeral_tier" in result && result.ephemeral_tier === true) and replace both
inline checks (the enqueuePostPublishJobs call and the signPublishResult call)
with that variable so the logic is defined in one place and easier to maintain.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@apps/jobs/src/safety/url-scanner.ts`:
- Around line 16-17: The polling window for Cloudflare URL scans is too short
causing scanPublishedUrlMalicious to return "unknown" before a finish and
preventing runEphemeralUrlScanner from applying lockdown; increase
SCAN_POLL_ATTEMPTS and/or SCAN_POLL_DELAY_MS (e.g., poll for a longer total
duration like 10–30s per Cloudflare guidance) and ensure the loop in
scanPublishedUrlMalicious retries until task.status === "Finished" or timeout,
leaving verdict logic in runEphemeralUrlScanner unchanged so it receives a
definitive "malicious" when available; update the constants SCAN_POLL_ATTEMPTS
and SCAN_POLL_DELAY_MS (and any related loop/timeout logic) to extend the
polling window.

---

Nitpick comments:
In `@apps/api/src/routes/revisions.ts`:
- Around line 95-113: The duplicated defensive check for result.ephemeral_tier
should be extracted to a single local boolean and reused; create a const (e.g.,
ephemeralTier) computed once with the existing guards (result !== null && typeof
result === "object" && "ephemeral_tier" in result && result.ephemeral_tier ===
true) and replace both inline checks (the enqueuePostPublishJobs call and the
signPublishResult call) with that variable so the logic is defined in one place
and easier to maintain.

In `@apps/jobs/src/handlers/safety-scan.ts`:
- Around line 98-105: The ephemeral enforcement path currently allows exceptions
from mintAgentViewUrl, verifyAgentViewToken, applyMaliciousUrlLockdown (called
inside runEphemeralUrlScanner / scanPublishedUrlMalicious) to bubble up and
trigger message.retry() causing unnecessary re-runs of scanner.scan; wrap the
entire URL-scan/lockdown sequence — specifically the calls to
scanPublishedUrlMalicious, mintAgentViewUrl, verifyAgentViewToken, and
applyMaliciousUrlLockdown invoked by runEphemeralUrlScanner — in its own
try/catch so failures are logged (with context) and do not abort the main
warning/write path; decide and implement the retry policy per ADR 0051 (either
fail-quiet or requeue) inside that catch and ensure any non-retryable failures
do not rethrow to the batch consumer.

In `@apps/jobs/src/safety/ephemeral-scanner.ts`:
- Line 10: The code currently assumes the Llama Guard output has a structured
boolean (e.g., checking response.safe), but Workers AI returns plain text in {
response: string } for model LLAMA_GUARD_MODEL; update the logic that calls
env.AI.run(...) (the LLAMA_GUARD_MODEL call) to read the returned object's
response string (e.g., result.response) and determine safety by checking the
text (use response.includes("unsafe") or parse category tokens like "S1","S2",
etc.) instead of relying on a response.safe boolean; remove or guard any
branches that expect response.safe === false and ensure downstream
variables/flags are set based on the string check.
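
The inline comment on apps/jobs/src/safety/url-scanner.ts asks for a longer polling window so that a scan can reach "Finished" before the poller gives up. A bounded poll loop with that behavior could be sketched as follows. The constant values, the `ScanTask` shape, the `"clean"` verdict, and the `"InProgress"` status used in the usage note are assumptions; only the constant names, the `"Finished"` status, and the `"unknown"` fallback appear in the review. The fetch and sleep functions are injected so the loop makes no real API calls.

```typescript
// Assumed values: 10 attempts with 2 s between them gives an 18 s window,
// inside the 10-30 s range the review suggests.
const SCAN_POLL_ATTEMPTS = 10;
const SCAN_POLL_DELAY_MS = 2_000;

/** Hypothetical task shape; the real URL Scanner response is richer. */
interface ScanTask {
  status: string;
  malicious?: boolean;
}

type PollVerdict = "malicious" | "clean" | "unknown";

/** Total time spent sleeping: there is no sleep after the final attempt. */
function pollWindowMs(attempts: number, delayMs: number): number {
  return Math.max(attempts - 1, 0) * delayMs;
}

async function pollForVerdict(
  fetchTask: () => Promise<ScanTask>,
  attempts: number = SCAN_POLL_ATTEMPTS,
  delayMs: number = SCAN_POLL_DELAY_MS,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<PollVerdict> {
  for (let attempt = 0; attempt < attempts; attempt += 1) {
    const task = await fetchTask();
    if (task.status === "Finished") {
      return task.malicious === true ? "malicious" : "clean";
    }
    if (attempt < attempts - 1) await sleep(delayMs);
  }
  // Timed out before the scan finished: no definitive verdict, so no lockdown.
  return "unknown";
}
```

A longer window increases the time each ephemeral message holds the consumer, which is the tension the safety-scan.ts nitpick points out; moving the poll to a delayed follow-up job would be one way to get both, though nothing in this thread says that was done.
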

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 2b80a0d4-5db3-4cef-a801-68dedd9c2fd9

📥 Commits

Reviewing files that changed from the base of the PR and between 054e888 and d45925b.

📒 Files selected for processing (37)
  • apps/api/src/agent-view-html.ts
  • apps/api/src/agent-view.ts
  • apps/api/src/index.test.ts
  • apps/api/src/post-publish.test.ts
  • apps/api/src/post-publish.ts
  • apps/api/src/routes/revisions.ts
  • apps/api/test/route-core.test.ts
  • apps/content/src/index.test.ts
  • apps/content/src/index.ts
  • apps/jobs/README.md
  • apps/jobs/src/env.ts
  • apps/jobs/src/handlers/safety-scan.integration.test.ts
  • apps/jobs/src/handlers/safety-scan.ts
  • apps/jobs/src/safety/ephemeral-scanner.test.ts
  • apps/jobs/src/safety/ephemeral-scanner.ts
  • apps/jobs/src/safety/platform-lockdown.test.ts
  • apps/jobs/src/safety/platform-lockdown.ts
  • apps/jobs/src/safety/resolve-scanner.test.ts
  • apps/jobs/src/safety/resolve-scanner.ts
  • apps/jobs/src/safety/url-scanner.test.ts
  • apps/jobs/src/safety/url-scanner.ts
  • apps/jobs/wrangler.jsonc
  • packages/config/src/index.test.ts
  • packages/config/src/index.ts
  • packages/contracts/src/jobs.ts
  • packages/db/src/agent-view.ts
  • packages/db/src/artifact-invalidation.ts
  • packages/db/src/index.ts
  • packages/db/src/policy-ephemeral.test.ts
  • packages/db/src/policy.ts
  • packages/db/src/repository/upload-session-lifecycle.ts
  • packages/db/src/repository/workflows/ephemeral-workflow.test.ts
  • packages/db/src/repository/workflows/ephemeral-workflow.ts
  • packages/db/src/repository/workflows/upload-publish-workflow.ts
  • packages/db/src/resolve-access-link.ts
  • packages/tokens/src/content.test.ts
  • packages/tokens/src/content.ts

Comment thread apps/jobs/src/safety/url-scanner.ts
@isuttell isuttell merged commit 240c4cd into main Jun 2, 2026
5 checks passed
@isuttell isuttell deleted the cursor/ephemeral-moderation-noindex-3595 branch June 2, 2026 03:42

github-actions Bot commented Jun 2, 2026


agent-paste PR preview resources were cleaned up. The shared Preview GitHub Environment is retained for future preview deploys.
