fix(api,upload): replay idempotent mutations before rate-limit accounting#9
Conversation
…ting Adds a SELECT-only peekIdempotentReplay helper to @agent-paste/commands and uses it in both upload mutation routes to short-circuit known-good retries before either rate-limit binding is touched. Hosted smoke now hammers POST /v1/upload-sessions with unique idempotency keys to confirm 429 with the rate_limited_actor envelope on preview targets.
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughThis PR adds a peek-only idempotency query (peekIdempotentReplay) to the commands package, extends UploadDatabase with an optional peekIdempotentReplay, and wires a peekUploadReplay helper that delegates to DB or a Hyperdrive fallback. createUploadSession and finalizeUploadSession now check for cached completed results (by operation and Idempotency-Key) and return replayed payloads before applying rate limits. Tests assert replay returns without incrementing rate-limit counters, smoke scripts exercise actor-rate-limit 429 behavior, and docs are updated. Sequence Diagram(s)sequenceDiagram
participant Client
participant UploadHandler
participant peekUploadReplay
participant peekIdempotentReplay
participant idempotency_records_or_hyperdrive
Client->>UploadHandler: POST /upload-sessions (Idempotency-Key)
UploadHandler->>peekUploadReplay: peek(operation="upload.session.create", key, actor)
peekUploadReplay->>peekIdempotentReplay: delegate to DB peek or hyperdrive lookup
peekIdempotentReplay->>idempotency_records_or_hyperdrive: SELECT or hyperdrive read for idempotency record
alt completed replay exists
idempotency_records_or_hyperdrive-->>peekIdempotentReplay: { result_json }
peekUploadReplay-->>UploadHandler: replayed session payload
UploadHandler-->>Client: 200 { upload_session_id, put_url, ... }
else no replay
idempotency_records_or_hyperdrive-->>peekIdempotentReplay: null
UploadHandler->>UploadHandler: perform rate-limit checks and normal flow
UploadHandler-->>Client: 201 created (or other final response)
end
Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@apps/upload/src/index.test.ts`:
- Around line 107-168: Add a parallel test to index.test.ts that mirrors the
"replays cached idempotent result without consuming rate budget" case but
targets the finalize endpoint: call handleRequest with method POST to
"/v1/upload-sessions/:sessionId/finalize" (use session_id "upl_replay") and the
same env stubs, update DB.peekIdempotentReplay to return { result: session }
when operation === "upload.session.finalize" and idempotencyKey === "replay",
keep ACTOR_RATE_LIMIT and WORKSPACE_BURST_CAP.limit stubs returning { success:
false } and incrementing counters, then assert the response is 200, the returned
finalize payload matches the cached session/finalize result, and that
rateLimitCalls remains { actor: 0, workspace: 0 } so cached replay doesn't
consume rate budget.
In `@apps/upload/src/index.ts`:
- Around line 161-164: The replay branch returns a replayed UploadSessionRecord
from peekUploadReplay and then calls buildCreateSessionResponse which re-signs
upload URLs; change this so replayed responses cannot extend the original
session: read the original session expiry (e.g., expiresAt or ttl) from the
returned UploadSessionRecord in the peekUploadReplay flow and pass that expiry
into buildCreateSessionResponse (or into the URL signing routine such as the
sign/upload URL helper) so the signed URL TTL is computed as max(0,
originalExpiry - now) instead of using a fresh now+ttl; update
buildCreateSessionResponse or the signing helper to accept an optional expiry
cutoff and use it when re-signing in the upload.session.create replay path.
In `@scripts/smoke-hosted.mjs`:
- Around line 63-67: The rate-limit probe assertActorRateLimitFires(key.secret)
is run before subsequent publish checks and can cause 429s to affect later
assertions; reorder the non-production branch so assertActorRateLimitFires runs
last (after assertBytesPurgedAfterDelete(published) and
assertBytesPurgedAfterExpiry(userEnv, published)), ensuring the same key.secret
isn't rate-limited during the publish-based checks; apply the same reorder to
the analogous block around lines 271-297.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro Plus
Run ID: 36ab7a37-1904-4640-a270-7eaa117521e0
📒 Files selected for processing (5)
apps/upload/src/index.test.tsapps/upload/src/index.tspackages/commands/src/index.test.tspackages/commands/src/index.tsscripts/smoke-hosted.mjs
| it("replays cached idempotent result without consuming rate budget", async () => { | ||
| const session: UploadSessionRecord = { | ||
| session_id: "upl_replay", | ||
| artifact_id: "art_replay", | ||
| revision_id: "rev_replay", | ||
| expires_at: "2030-01-01T00:00:00.000Z", | ||
| files: [{ path: "index.html", size_bytes: 12 }], | ||
| }; | ||
| const rateLimitCalls = { actor: 0, workspace: 0 }; | ||
| const env: Env = { | ||
| UPLOAD_SIGNING_SECRET: "secret", | ||
| AUTH: { | ||
| async verifyApiKey() { | ||
| return { type: "api_key", id: "key_1", workspace_id: "w_1" }; | ||
| }, | ||
| }, | ||
| DB: { | ||
| async createUploadSession() { | ||
| throw new Error("replayed requests must not create new sessions"); | ||
| }, | ||
| async getUploadSession() { | ||
| return null; | ||
| }, | ||
| async finalizeUploadSession() { | ||
| return {}; | ||
| }, | ||
| async peekIdempotentReplay({ idempotencyKey, operation }) { | ||
| if (operation === "upload.session.create" && idempotencyKey === "replay") { | ||
| return { result: session }; | ||
| } | ||
| return null; | ||
| }, | ||
| }, | ||
| ACTOR_RATE_LIMIT: { | ||
| async limit() { | ||
| rateLimitCalls.actor += 1; | ||
| return { success: false }; | ||
| }, | ||
| }, | ||
| WORKSPACE_BURST_CAP: { | ||
| async limit() { | ||
| rateLimitCalls.workspace += 1; | ||
| return { success: false }; | ||
| }, | ||
| }, | ||
| }; | ||
|
|
||
| const response = await handleRequest( | ||
| new Request("https://upload.test/v1/upload-sessions", { | ||
| method: "POST", | ||
| headers: { authorization: "Bearer ok", "idempotency-key": "replay", "content-type": "application/json" }, | ||
| body: JSON.stringify({ files: [{ path: "index.html", size_bytes: 12 }] }), | ||
| }), | ||
| env, | ||
| ); | ||
|
|
||
| expect(response.status).toBe(200); | ||
| const body = (await response.json()) as { upload_session_id: string; files: Array<{ put_url: string }> }; | ||
| expect(body.upload_session_id).toBe("upl_replay"); | ||
| expect(body.files[0]?.put_url).toContain("upl_replay"); | ||
| expect(rateLimitCalls).toEqual({ actor: 0, workspace: 0 }); | ||
| }); |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial | ⚡ Quick win
Add a matching replay test for finalize route.
This new test protects POST /v1/upload-sessions, but the PR also changed replay-before-rate-limit behavior for POST /v1/upload-sessions/:sessionId/finalize. Please add the same assertion pattern there (cached 200 response while actor/workspace limit stubs deny).
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@apps/upload/src/index.test.ts` around lines 107 - 168, Add a parallel test to
index.test.ts that mirrors the "replays cached idempotent result without
consuming rate budget" case but targets the finalize endpoint: call
handleRequest with method POST to "/v1/upload-sessions/:sessionId/finalize" (use
session_id "upl_replay") and the same env stubs, update DB.peekIdempotentReplay
to return { result: session } when operation === "upload.session.finalize" and
idempotencyKey === "replay", keep ACTOR_RATE_LIMIT and WORKSPACE_BURST_CAP.limit
stubs returning { success: false } and incrementing counters, then assert the
response is 200, the returned finalize payload matches the cached
session/finalize result, and that rateLimitCalls remains { actor: 0, workspace:
0 } so cached replay doesn't consume rate budget.
| const replay = await peekUploadReplay<UploadSessionRecord>(db, env, actor, "upload.session.create", idempotencyKey); | ||
| if (replay) { | ||
| return jsonResponse(await buildCreateSessionResponse(request, env, replay)); | ||
| } |
There was a problem hiding this comment.
Cap replay-signed upload URLs to session expiry.
Line 163 now serves replayed sessions repeatedly, and Line 219 re-signs URLs each time. Because signing currently uses now + ttl only, repeated replay calls can keep issuing fresh upload tokens even after the original session lifetime intent.
🔧 Proposed fix
async function signUploadUrl(
request: Request,
env: Env,
session: UploadSessionRecord,
file: UploadFileInput & { object_key?: string },
): Promise<string> {
if (!env.UPLOAD_SIGNING_SECRET) {
throw new Error("UPLOAD_SIGNING_SECRET is required");
}
- const exp = Math.floor(Date.now() / 1000) + ttlSeconds(env);
+ const nowSec = Math.floor(Date.now() / 1000);
+ const sessionExpSec = Math.floor(new Date(session.expires_at).getTime() / 1000);
+ const exp = Math.min(nowSec + ttlSeconds(env), sessionExpSec);Also applies to: 211-223
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@apps/upload/src/index.ts` around lines 161 - 164, The replay branch returns a
replayed UploadSessionRecord from peekUploadReplay and then calls
buildCreateSessionResponse which re-signs upload URLs; change this so replayed
responses cannot extend the original session: read the original session expiry
(e.g., expiresAt or ttl) from the returned UploadSessionRecord in the
peekUploadReplay flow and pass that expiry into buildCreateSessionResponse (or
into the URL signing routine such as the sign/upload URL helper) so the signed
URL TTL is computed as max(0, originalExpiry - now) instead of using a fresh
now+ttl; update buildCreateSessionResponse or the signing helper to accept an
optional expiry cutoff and use it when re-signing in the upload.session.create
replay path.
|
/preview |
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
agent-paste PR preview is ready. API: https://agent-paste-api-pr-9.isaac-a46.workers.dev |
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The new assertActorRateLimitFires helper exhausts the actor's 60/min budget by hammering an upload mutation. The next assertion, assertBytesPurgedAfterExpiry, republishes through the same user API key and was hitting rate_limited_actor at the publish step. Reorder so the rate-limit probe runs last; bytes-after-delete and -after-expiry now run on a fresh budget. Also reconcile docs/ops/project-status.md: PR #6 (CSP) and PR #8 (bytes-after-delete/expiry) closed two backlog items without updating this doc. Moved both into Recently Completed and renumbered. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
docs/ops/project-status.md (1)
3-3:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winUpdate the "Last updated" date to match this PR's date.
The date should be
2026-05-22to reflect when these changes were made, not2026-05-21.📅 Proposed fix
-Last updated: 2026-05-21 (idempotent mutations and security baseline landed). +Last updated: 2026-05-22 (idempotent mutations and security baseline landed).🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/ops/project-status.md` at line 3, Update the "Last updated" metadata line in the document by changing the date string "2026-05-21" to "2026-05-22" on the line that currently reads 'Last updated: 2026-05-21 (idempotent mutations and security baseline landed).' so the file's header reflects the PR date.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/ops/project-status.md`:
- Line 254: The status line "Status: Done on 2026-05-21 via PR `#9`." has a date
mismatch with PR `#9` (created 2026-05-22); update that line to either match the
PR creation date ("Status: Done on 2026-05-22 via PR `#9`.") or clarify both
events (e.g., "Work completed 2026-05-21; PR `#9` opened 2026-05-22") by editing
the exact string that contains "Status: Done on 2026-05-21 via PR `#9`." in the
document.
---
Outside diff comments:
In `@docs/ops/project-status.md`:
- Line 3: Update the "Last updated" metadata line in the document by changing
the date string "2026-05-21" to "2026-05-22" on the line that currently reads
'Last updated: 2026-05-21 (idempotent mutations and security baseline landed).'
so the file's header reflects the PR date.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro Plus
Run ID: dec6ca35-8790-4eca-9785-4785f6b4e686
📒 Files selected for processing (2)
docs/ops/project-status.mdscripts/smoke-hosted.mjs
|
|
||
| ### Enforce native rate-limit bindings | ||
|
|
||
| - Status: Done on 2026-05-21 via PR #9. |
There was a problem hiding this comment.
Date mismatch: PR #9 was created on 2026-05-22, not 2026-05-21.
Per the PR metadata, this PR was created at 2026-05-22T01:57:26Z. If the work was completed on 2026-05-21, consider clarifying (e.g., "Work completed 2026-05-21; PR #9 opened 2026-05-22"), otherwise update the date to match the PR creation date.
📅 Proposed fix (if aligning to PR creation date)
-- Status: Done on 2026-05-21 via PR `#9`.
+- Status: Done on 2026-05-22 via PR `#9`.📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| - Status: Done on 2026-05-21 via PR #9. | |
| - Status: Done on 2026-05-22 via PR `#9`. |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@docs/ops/project-status.md` at line 254, The status line "Status: Done on
2026-05-21 via PR `#9`." has a date mismatch with PR `#9` (created 2026-05-22);
update that line to either match the PR creation date ("Status: Done on
2026-05-22 via PR `#9`.") or clarify both events (e.g., "Work completed
2026-05-21; PR `#9` opened 2026-05-22") by editing the exact string that contains
"Status: Done on 2026-05-21 via PR `#9`." in the document.
Sequential 120 attempts at ~2/sec spread across multiple Cloudflare worker isolates means no single isolate ever sees the 60-in-60s threshold. Replace the serial loop with four parallel waves of 80 requests each. The burst forces a single isolate's counter past the limit before the trailing window slides. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
agent-paste PR preview is ready. API: https://agent-paste-api-pr-9.isaac-a46.workers.dev |
PR #2 added `permissions: administration: write` to the cleanup workflow, but `administration` is not a valid GITHUB_TOKEN scope. GitHub rejected the workflow at evaluation time, silently dropping every `pull_request.closed` event for PRs #2--#9 and accumulating eight stale `preview/pr-N` Neon branches that tripped the 10-branch free-tier cap (blocks PR #10/#11/#12 deploys with HTTP 422). Drop the invalid permission and the `deleteAnEnvironment` step that required it; rename the file so GitHub registers a fresh workflow id instead of reusing the wedged record; validate the resolved PR number is a positive integer before deleting anything. Stale Neon branches still need a one-time operator purge -- agent is forbidden from calling `neondatabase/delete-branch-action` autonomously. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
PR #2 added `permissions: administration: write` to the cleanup workflow, but `administration` is not a valid GITHUB_TOKEN scope. GitHub rejected the workflow at evaluation time, silently dropping every `pull_request.closed` event for PRs #2--#9 and accumulating eight stale `preview/pr-N` Neon branches that tripped the 10-branch free-tier cap (blocks PR #10/#11/#12 deploys with HTTP 422). Drop the invalid permission and the `deleteAnEnvironment` step that required it; rename the file so GitHub registers a fresh workflow id instead of reusing the wedged record; validate the resolved PR number is a positive integer before deleting anything. Stale Neon branches still need a one-time operator purge -- agent is forbidden from calling `neondatabase/delete-branch-action` autonomously. Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Summary
Idempotent retries on the upload mutation routes now short-circuit before either rate-limit binding is touched, so a client replaying a known-successful command no longer burns through its per-actor or per-workspace budget. Genuinely new mutations still rate-limit normally.
Changes
peekIdempotentReplay(SELECT-only) to@agent-paste/commandsplus unit coverage for the hit/miss/in-flight cases.apps/upload/src/index.tsforPOST /v1/upload-sessionsandPOST /v1/upload-sessions/:id/finalize. On cache hit the route shapes the cachedUploadSessionRecordor finalize result through the existing signing helpers and returns 200, skippingACTOR_RATE_LIMIT.limit()/WORKSPACE_BURST_CAP.limit().scripts/smoke-hosted.mjs) hammers the upload mutation with unique idempotency keys against preview/PR targets and asserts a 429 witherror.code = rate_limited_actorandRetry-After: 60. Skipped onproduction.Done
rateLimitAuthenticatedRequeston both upload mutation routes.pnpm verify(49 turbo tasks) green.pnpm smoke:localgreen.Files touched
packages/commands/src/index.tspackages/commands/src/index.test.tsapps/upload/src/index.tsapps/upload/src/index.test.tsscripts/smoke-hosted.mjsSummary by CodeRabbit
New Features
Tests
Documentation