Response/cleanup unsubmitted forms#31
Conversation
WalkthroughThis pull request introduces operational documentation for assessment prioritization, incident response analysis, and inbox management processes, alongside a new TypeScript utility function for cleaning up expired unsubmitted form tokens from the database using transactional batching. Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
🧹 Nitpick comments (2)
tests/unsubmitted_forms/submission_cleanup_unsubmitted_forms.ts (2)
8-8: Two minor style nits.
BATCH_SIZE(Line 8) is a fixed configuration constant; hoisting it to module scope makes it easier to tune without opening the function body.- The inline object types on lines 29 and 33 (
{ token: string; entityId: string | null }) are redundant — Prisma'sselectalready typesexpiredTokenscorrectly, so TypeScript infers these without annotation.♻️ Proposed clean-up
+const BATCH_SIZE = 1000; + export const cleanup_unsubmitted_forms = async (job: JobScheduleQueue) => { - const BATCH_SIZE = 1000; // ... - const entityIds = expiredTokens - .map((token: { token: string; entityId: string | null }) => token.entityId) - .filter((id: string | null): id is string => Boolean(id)); + const entityIds = expiredTokens + .map((t) => t.entityId) + .filter((id): id is string => Boolean(id)); - const tokens = expiredTokens.map( - (token: { token: string; entityId: string | null }) => token.token - ); + const tokens = expiredTokens.map((t) => t.token);Also applies to: 29-33
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unsubmitted_forms/submission_cleanup_unsubmitted_forms.ts` at line 8, Hoist the BATCH_SIZE constant out of the function into module scope as a top-level constant (replace the local const BATCH_SIZE = 1000 with a module-level one) and remove the redundant inline type annotations on the select results (remove the `{ token: string; entityId: string | null }` annotations around the `expiredTokens` usages) so TypeScript uses Prisma's inferred types; look for uses of BATCH_SIZE and the variables named expiredTokens in this file to update them accordingly.
70-73:update_job_statusin thecatchblock can itself throw, masking the original error.If
update_job_status(job.id, "failed")throws, the originalerroris replaced. Wrap it, or use afinallyblock with conditional status.♻️ Proposed fix
- } catch (error) { - console.error("[Cleanup Job] Error:", error); - await update_job_status(job.id, "failed"); - throw error; - } + } catch (error) { + console.error("[Cleanup Job] Error:", error); + try { + await update_job_status(job.id, "failed"); + } catch (statusError) { + console.error("[Cleanup Job] Failed to update job status:", statusError); + } + throw error; + }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unsubmitted_forms/submission_cleanup_unsubmitted_forms.ts` around lines 70 - 73, The catch block currently calls update_job_status(job.id, "failed") which may throw and mask the original error; to fix, capture the original error (the caught error variable), then call update_job_status inside its own try/catch (or a finally that only attempts status update) so any failure updating the job is logged but does not overwrite the original error, and finally rethrow the original caught error; look for the catch around console.error("[Cleanup Job] Error:", error) and update_job_status(job.id, "failed") to apply this change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/monday_inbox/RESPONSE.MD`:
- Around line 159-167: In the delegation table fix the typos in the three
affected rows: change "suepect" to "suspect" and capitalize "semir" to "Semir"
in the row containing "Find who owns vula-scraper first"; remove the duplicated
word so "doesn't doesn't" becomes "doesn't" in the row with "Zod discussion";
capitalize the sentence-start "he'll research" to "He'll research" in the
"AutoClaude evaluation | Biruk" row; and capitalize "i'll use" to "I'll use" in
the "AWS cost spike" row so each sentence starts with a capital letter.
In `@tests/unsubmitted_forms/submission_cleanup_unsubmitted_forms.ts`:
- Around line 65-67: The log currently prints input array sizes (tokens.length,
entityIds.length) instead of actual deleted row counts; update the code that
calls Prisma's deleteMany (e.g., the delete calls around deleteMany for tokens
and entities) to await and capture their return values (e.g., tokenDeleteResult
and entityDeleteResult) and log tokenDeleteResult.count and
entityDeleteResult.count (replace the console.log message to use these .count
values) so the cleanup job reports actual rows deleted.
- Line 8: The current job only fetches up to BATCH_SIZE (const BATCH_SIZE =
1000) once using take: BATCH_SIZE and then marks the job "completed", which can
leave expired tokens uncleaned and is non-deterministic because the query lacks
orderBy; modify the cleanup task (the function that queries and deletes expired
tokens using take: BATCH_SIZE) to repeatedly fetch and delete in a loop until a
batch returns fewer than BATCH_SIZE items (i.e., process pages) and add a
deterministic orderBy (e.g., orderBy: { id: "asc" } or createdAt) to the query
so batching is consistent between runs; alternatively, if you prefer Option B,
detect when a fetched batch length === BATCH_SIZE and instead of marking the job
completed, re-schedule or return a status indicating more work remains so the
job runner re-invokes the task.
In `@tests/unsubmitted_forms/SUBMISSION.md`:
- Line 4: The ATX heading line currently has two spaces after the hashes ("##
Issue Breakdown"), which triggers markdownlint MD019; update that heading to use
a single space by changing it to "## Issue Breakdown" so there is exactly one
space between "##" and the heading text.
---
Nitpick comments:
In `@tests/unsubmitted_forms/submission_cleanup_unsubmitted_forms.ts`:
- Line 8: Hoist the BATCH_SIZE constant out of the function into module scope as
a top-level constant (replace the local const BATCH_SIZE = 1000 with a
module-level one) and remove the redundant inline type annotations on the select
results (remove the `{ token: string; entityId: string | null }` annotations
around the `expiredTokens` usages) so TypeScript uses Prisma's inferred types;
look for uses of BATCH_SIZE and the variables named expiredTokens in this file
to update them accordingly.
- Around line 70-73: The catch block currently calls update_job_status(job.id,
"failed") which may throw and mask the original error; to fix, capture the
original error (the caught error variable), then call update_job_status inside
its own try/catch (or a finally that only attempts status update) so any failure
updating the job is logged but does not overwrite the original error, and
finally rethrow the original caught error; look for the catch around
console.error("[Cleanup Job] Error:", error) and update_job_status(job.id,
"failed") to apply this change.
| | Grafana alert | Find who owns vula-scraper first | The actual service owner should investigate, I suepect semir owns it. | | ||
| | Redis decision | Patrick | He works on vula-backend core where this lives. 10k/day is low enough that either choice works - I just need to tell him he's empowered to decide and doesn't need my approval. | | ||
| | Missing AC | Spencer escalates, I step in if needed | Spencer needs to learn to unblock himself by going to product directly. But I'll give him a deadline (1pm) and tell him to ping me if product doesn't respond. | | ||
| | 5MB bug | Identify file upload owner | Need to know who actually owns this before I assign it. | | ||
| | API docs | Junior + Niko (if Niko has time) | Junior needs to learn our patterns anyway and Niko's the backend lead. But Niko's got the entity refactor this week, so I'll let him defer to next week if he's underwater. | | ||
| | Zod discussion | Niko | He proposed it, he should own it. But he should get feedback in Slack standup today so it doesn't doesn't burn meeting time if everyone agrees. | | ||
| | AutoClaude evaluation | Biruk | He's used it and loves it. he'll research cost/security and preps the demo. Team decides together at Showcase, not me or him alone. | | ||
| | Architecture PR review | Me + Patrick both review | I need to understand it (it's core data model), but Patrick works in vula-backend daily and should review for implementation. Two perspectives better than one. | | ||
| | AWS cost spike | Semir gets data, I explain to Finance | Semir knows what scaled on Render. I'll translate that into business terms for Finance/CEO. i'll use his expertise, I handle the communication. | |
There was a problem hiding this comment.
Fix several typographical errors in the delegation table.
Three typos in close proximity:
- Line 159:
suepect→suspect,semir→Semir - Line 164:
doesn't doesn't→doesn't(duplicate word) - Line 165:
he'll research→He'll research(sentence starts lowercase after a period) - Line 167:
i'll use→I'll use
🧰 Tools
🪛 LanguageTool
[style] ~161-~161: You have already used this phrasing in nearby sentences. Consider replacing it to add variety to your writing.
Context: ...scalates, I step in if needed | Spencer needs to learn to unblock himself by going to pr...
(REP_NEED_TO_VB)
[style] ~161-~161: Consider using the more polite verb “ask” (“tell” implies ordering/instructing someone).
Context: ... But I'll give him a deadline (1pm) and tell him to ping me if product doesn't respo...
(TELL_ASK)
[style] ~162-~162: You have already used this phrasing in nearby sentences. Consider replacing it to add variety to your writing.
Context: ...dentify file upload owner | Need to know who actually owns this before I as...
(REP_NEED_TO_VB)
[style] ~163-~163: You have already used this phrasing in nearby sentences. Consider replacing it to add variety to your writing.
Context: ... Niko (if Niko has time) | Junior needs to learn our patterns anyway and Niko's th...
(REP_NEED_TO_VB)
[style] ~164-~164: You have already used this phrasing in nearby sentences. Consider replacing it to add variety to your writing.
Context: ... | He proposed it, he should own it. But he should get feedback in Slack standup...
(REP_SENT_START_BUT)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/monday_inbox/RESPONSE.MD` around lines 159 - 167, In the delegation
table fix the typos in the three affected rows: change "suepect" to "suspect"
and capitalize "semir" to "Semir" in the row containing "Find who owns
vula-scraper first"; remove the duplicated word so "doesn't doesn't" becomes
"doesn't" in the row with "Zod discussion"; capitalize the sentence-start "he'll
research" to "He'll research" in the "AutoClaude evaluation | Biruk" row; and
capitalize "i'll use" to "I'll use" in the "AWS cost spike" row so each sentence
starts with a capital letter.
| export const cleanup_unsubmitted_forms = async ( | ||
| job: JobScheduleQueue | ||
| ) => { | ||
| const BATCH_SIZE = 1000; |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
cat -n tests/unsubmitted_forms/submission_cleanup_unsubmitted_forms.tsRepository: vula-africa/coding_tests
Length of output: 2682
🏁 Script executed:
rg -n "update_job_status|JobScheduleQueue|job.*status" --type ts -A 3 -B 3Repository: vula-africa/coding_tests
Length of output: 4437
🏁 Script executed:
rg -n "pending|completed" --type ts -A 2 -B 2Repository: vula-africa/coding_tests
Length of output: 1465
🏁 Script executed:
find . -name "generic_scheduler*" -type fRepository: vula-africa/coding_tests
Length of output: 50
🏁 Script executed:
rg -l "update_job_status|JobScheduleQueue" --type ts | head -20Repository: vula-africa/coding_tests
Length of output: 184
🏁 Script executed:
rg -n "export.*update_job_status|function update_job_status" --type tsRepository: vula-africa/coding_tests
Length of output: 50
🏁 Script executed:
find . -type f -name "*.ts" | xargs rg -l "scheduler\|cron\|queue" | head -20Repository: vula-africa/coding_tests
Length of output: 50
🏁 Script executed:
rg -n "JobScheduleQueue" --type tsRepository: vula-africa/coding_tests
Length of output: 531
🏁 Script executed:
rg -n "completed\|pending" --type ts -B 2 -A 2 | grep -i "scheduler\|re-run\|re-invoke"Repository: vula-africa/coding_tests
Length of output: 50
🏁 Script executed:
cat -n prisma/schema.prisma 2>/dev/null | grep -A 10 -B 2 "JobScheduleQueue"Repository: vula-africa/coding_tests
Length of output: 50
Fix the single-batch processing that may leave expired tokens uncleaned and add deterministic ordering.
The code processes exactly 1,000 records per invocation via take: BATCH_SIZE without a loop. If more than 1,000 expired tokens exist, the remaining tokens are never deleted because the job is marked "completed" at line 69. Additionally, the query lacks orderBy, making batch selection non-deterministic across runs.
Implement one of the following:
♻️ Option A — loop within a single run (preferred for thoroughness)
- const BATCH_SIZE = 1000;
+const BATCH_SIZE = 1000;
export const cleanup_unsubmitted_forms = async (job: JobScheduleQueue) => {
try {
+ let totalTokens = 0;
+ let totalEntities = 0;
+
+ while (true) {
const sevenDaysAgo = new Date(Date.now() - 7 * 24 * 60 * 60 * 1000);
const expiredTokens = await prisma.publicFormsTokens.findMany({
where: { createdAt: { lt: sevenDaysAgo } },
select: { token: true, entityId: true },
take: BATCH_SIZE,
+ orderBy: { createdAt: "asc" },
});
- if (expiredTokens.length === 0) {
- console.log("[Cleanup Job] No expired tokens found.");
- await update_job_status(job.id, "completed");
- return;
- }
+ if (expiredTokens.length === 0) break;
// ... process batch, accumulate totals ...
+ totalTokens += tokens.length;
+ totalEntities += entityIds.length;
+ if (expiredTokens.length < BATCH_SIZE) break;
+ }
- console.log(`[Cleanup Job] Deleted ${tokens.length} tokens and ${entityIds.length} entities`);
+ console.log(`[Cleanup Job] Deleted ${totalTokens} tokens and ${totalEntities} entities`);
await update_job_status(job.id, "completed");♻️ Option B — single batch but signal for re-invocation when full
+ const batchFull = expiredTokens.length === BATCH_SIZE;
// ... process batch ...
- await update_job_status(job.id, "completed");
+ await update_job_status(job.id, batchFull ? "pending" : "completed");🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/unsubmitted_forms/submission_cleanup_unsubmitted_forms.ts` at line 8,
The current job only fetches up to BATCH_SIZE (const BATCH_SIZE = 1000) once
using take: BATCH_SIZE and then marks the job "completed", which can leave
expired tokens uncleaned and is non-deterministic because the query lacks
orderBy; modify the cleanup task (the function that queries and deletes expired
tokens using take: BATCH_SIZE) to repeatedly fetch and delete in a loop until a
batch returns fewer than BATCH_SIZE items (i.e., process pages) and add a
deterministic orderBy (e.g., orderBy: { id: "asc" } or createdAt) to the query
so batching is consistent between runs; alternatively, if you prefer Option B,
detect when a fetched batch length === BATCH_SIZE and instead of marking the job
completed, re-schedule or return a status indicating more work remains so the
job runner re-invokes the task.
| console.log( | ||
| `[Cleanup Job] Deleted ${tokens.length} tokens and ${entityIds.length} entities` | ||
| ); |
There was a problem hiding this comment.
Log reports queried counts, not actual deleted row counts.
tokens.length and entityIds.length reflect the size of the input arrays, not what Prisma actually removed. deleteMany returns { count: number } — capture and log that instead for accurate observability.
🔭 Proposed fix
- await prisma.$transaction(async (tx: Prisma.TransactionClient) => {
+ const result = await prisma.$transaction(async (tx: Prisma.TransactionClient) => {
+ let deletedRelationships = 0, deletedCorpus = 0, deletedEntities = 0;
if (entityIds.length > 0) {
- await tx.relationship.deleteMany({ where: { entity_id: { in: entityIds }, status: "new" } });
- await tx.new_corpus.deleteMany({ where: { entity_id: { in: entityIds } } });
- await tx.entity.deleteMany({ where: { id: { in: entityIds } } });
+ ({ count: deletedRelationships } = await tx.relationship.deleteMany({ where: { entity_id: { in: entityIds }, status: "new" } }));
+ ({ count: deletedCorpus } = await tx.new_corpus.deleteMany({ where: { entity_id: { in: entityIds } } }));
+ ({ count: deletedEntities } = await tx.entity.deleteMany({ where: { id: { in: entityIds } } }));
}
- await tx.publicFormsTokens.deleteMany({ where: { token: { in: tokens } } });
+ const { count: deletedTokens } = await tx.publicFormsTokens.deleteMany({ where: { token: { in: tokens } } });
+ return { deletedRelationships, deletedCorpus, deletedEntities, deletedTokens };
});
console.log(
- `[Cleanup Job] Deleted ${tokens.length} tokens and ${entityIds.length} entities`
+ `[Cleanup Job] Deleted ${result.deletedTokens} tokens, ${result.deletedEntities} entities, ` +
+ `${result.deletedRelationships} relationships, ${result.deletedCorpus} corpus entries`
);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/unsubmitted_forms/submission_cleanup_unsubmitted_forms.ts` around lines
65 - 67, The log currently prints input array sizes (tokens.length,
entityIds.length) instead of actual deleted row counts; update the code that
calls Prisma's deleteMany (e.g., the delete calls around deleteMany for tokens
and entities) to await and capture their return values (e.g., tokenDeleteResult
and entityDeleteResult) and log tokenDeleteResult.count and
entityDeleteResult.count (replace the console.log message to use these .count
values) so the cleanup job reports actual rows deleted.
|
|
||
| --- | ||
|
|
||
| ## Issue Breakdown |
There was a problem hiding this comment.
Fix double space in ATX heading.
## Issue Breakdown has two spaces after ##, triggering markdownlint MD019.
📝 Proposed fix
-## Issue Breakdown
+## Issue Breakdown📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| ## Issue Breakdown | |
| ## Issue Breakdown |
🧰 Tools
🪛 markdownlint-cli2 (0.21.0)
[warning] 4-4: Multiple spaces after hash on atx style heading
(MD019, no-multiple-space-atx)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/unsubmitted_forms/SUBMISSION.md` at line 4, The ATX heading line
currently has two spaces after the hashes ("## Issue Breakdown"), which
triggers markdownlint MD019; update that heading to use a single space by
changing it to "## Issue Breakdown" so there is exactly one space between "##"
and the heading text.
Implemented cleanup job with 7-day expiry, batching, and transactional deletes.
Added explicit typing plus logging/status updates.
Added unsubmitted forms analysis and Loom video.
Updated assessment summary with cleanup highlights.
Files: unsubmitted_forms/submission_cleanup_unsubmitted_forms.ts, unsubmitted_forms/SUBMISSION.md, ASSESSMENT_SUMMARY.MD.
Summary by CodeRabbit
Documentation
New Features