Fix critical bugs and improve performance in form cleanup job#25
Fix critical bugs and improve performance in form cleanup job#25Abiorh001 wants to merge 1 commit into
Conversation
WalkthroughThis change refactors the unsubmitted forms cleanup task to introduce batching for scalable processing, add comprehensive metrics tracking (tokens found, deleted, errors, skipped), optimize relationship resolution through batch queries, and implement per-token transactional processing with robust error handling and detailed failure reporting. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~22 minutes Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Important Action Needed: IP Allowlist UpdateIf your organization protects your Git platform with IP whitelisting, please add the new CodeRabbit IP address to your allowlist:
Failure to add the new IP will result in interrupted reviews. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In `@tests/unsubmitted_forms/cleanup_unsubmitted_forms.ts`:
- Around line 87-89: The catch block currently logs the raw credential via
token.token; change the log to avoid exposing the full token by logging a
non-sensitive identifier (e.g., token.id) or a masked prefix of token.token
(e.g., first 4 chars + ellipsis) instead, and keep the existing error object in
the message; update the catch handling where stats.errors++ and console.error
are used so it references the safe identifier (token.id or maskedToken) rather
than token.token.
- Around line 71-83: The current loop increments stats.skipped and continues
when !relationship || !token.entityId, which leaves expired public form tokens
undeleted; change the logic so the token deletion always runs (call
tx.publicFormsTokens.delete for token.id inside the transaction) while
conditionally deleting relationship (if relationship exists, call
tx.relationship.delete with relationship.id) and conditionally deleting
corpus/entity only when token.entityId is present (tx.new_corpus.deleteMany and
tx.entity.delete with token.entityId); keep the whole set of deletes inside
prisma.$transaction to ensure atomicity and remove the stats.skipped increment
only for truly skipped entity/relationship cleanup cases, not for leaving tokens
behind.
- Around line 41-53: The loop uses skip-based pagination with
prisma.publicFormsTokens.findMany which can miss records when deletions occur;
change to cursor-based pagination by ordering by a stable field (e.g.,
createdAt, id) via orderBy and use the last returned record as the cursor for
the next findMany call instead of incrementing skip/BATCH_SIZE; update both
occurrences (the loop starting near prisma.publicFormsTokens.findMany and the
similar block around line 93) to pass a cursor object and remove skip logic so
each expired token is processed exactly once.
| const sevenDaysAgo = new Date(Date.now() - 7 * 24 * 60 * 60 * 1000); | ||
| let skip = 0; | ||
|
|
||
| while (true) { | ||
| const expiredTokens = await prisma.publicFormsTokens.findMany({ | ||
| where: { | ||
| createdAt: { lte: sevenDaysAgo }, | ||
| }, | ||
| }, | ||
| }); | ||
| skip: skip, | ||
| take: BATCH_SIZE, | ||
| }); | ||
|
|
||
| if (expiredTokens.length === 0) break; |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
cat -n tests/unsubmitted_forms/cleanup_unsubmitted_forms.tsRepository: vula-africa/coding_tests
Length of output: 4334
🏁 Script executed:
find . -type f \( -name "*.prisma" -o -name "*schema*" \) | head -20Repository: vula-africa/coding_tests
Length of output: 50
🏁 Script executed:
rg "publicFormsTokens" -t ts -t prisma -B 2 -A 2 | head -40Repository: vula-africa/coding_tests
Length of output: 98
🏁 Script executed:
rg "publicFormsTokens" -t ts -B 2 -A 2 | head -40Repository: vula-africa/coding_tests
Length of output: 1114
🏁 Script executed:
rg "model publicFormsTokens" -A 15Repository: vula-africa/coding_tests
Length of output: 50
🏁 Script executed:
find . -name "*.prisma" -o -name "schema.prisma"Repository: vula-africa/coding_tests
Length of output: 50
🏁 Script executed:
find . -type f -name "*.ts" -o -name "*.js" | xargs rg "interface.*publicFormsTokens|type.*publicFormsTokens" | head -20Repository: vula-africa/coding_tests
Length of output: 50
🏁 Script executed:
rg "where.*id.*token.id" tests/unsubmitted_forms/cleanup_unsubmitted_forms.tsRepository: vula-africa/coding_tests
Length of output: 143
Switch to cursor-based pagination with orderBy to prevent skipping tokens during deletion.
Without orderBy, Prisma doesn't guarantee consistent ordering across queries. Combined with skip-based pagination and mid-loop deletions, this causes remaining records to be skipped unpredictably. Use cursor pagination with a stable sort order to ensure each expired token is processed exactly once.
♻️ Suggested cursor-based pagination
- let skip = 0;
+ let cursor: { id: string } | undefined;
...
const expiredTokens = await prisma.publicFormsTokens.findMany({
where: {
createdAt: { lte: sevenDaysAgo },
},
- skip: skip,
- take: BATCH_SIZE,
+ orderBy: { id: "asc" },
+ take: BATCH_SIZE,
+ ...(cursor ? { cursor, skip: 1 } : {}),
});
if (expiredTokens.length === 0) break;
+ cursor = { id: expiredTokens[expiredTokens.length - 1].id };
...
- skip += BATCH_SIZE;Also applies to: line 93
🤖 Prompt for AI Agents
In `@tests/unsubmitted_forms/cleanup_unsubmitted_forms.ts` around lines 41 - 53,
The loop uses skip-based pagination with prisma.publicFormsTokens.findMany which
can miss records when deletions occur; change to cursor-based pagination by
ordering by a stable field (e.g., createdAt, id) via orderBy and use the last
returned record as the cursor for the next findMany call instead of incrementing
skip/BATCH_SIZE; update both occurrences (the loop starting near
prisma.publicFormsTokens.findMany and the similar block around line 93) to pass
a cursor object and remove skip logic so each expired token is processed exactly
once.
| const relationship = relationshipMap.get(token.productId); | ||
|
|
||
| if (!relationship || !token.entityId) { | ||
| stats.skipped++; | ||
| continue; | ||
| } | ||
|
|
||
| await prisma.$transaction(async (tx) => { | ||
| await tx.relationship.delete({ where: { id: relationship.id } }); | ||
| await tx.publicFormsTokens.delete({ where: { id: token.id } }); | ||
| await tx.new_corpus.deleteMany({ where: { entity_id: token.entityId } }); | ||
| await tx.entity.delete({ where: { id: token.entityId } }); | ||
| }); |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Find and inspect schema.prisma for PublicFormsTokens model
fd -a 'schema.prisma' -t f -exec cat {} \;Repository: vula-africa/coding_tests
Length of output: 255
🏁 Script executed:
#!/bin/bash
# Find schema.prisma file
fd 'schema.prisma' -t fRepository: vula-africa/coding_tests
Length of output: 50
🏁 Script executed:
#!/bin/bash
# Search for any prisma files and directories
echo "=== Searching for .prisma files ==="
fd '\\.prisma$' -t f
echo ""
echo "=== Checking for prisma directory ==="
fd '^prisma$' -t d
echo ""
echo "=== Listing repository structure ==="
git ls-files | head -30Repository: vula-africa/coding_tests
Length of output: 408
🏁 Script executed:
#!/bin/bash
# Read the cleanup_unsubmitted_forms.ts file to understand the test
cat -n tests/unsubmitted_forms/cleanup_unsubmitted_forms.tsRepository: vula-africa/coding_tests
Length of output: 4334
Skipping on missing relationship/entityId leaves expired tokens behind.
This code increments skipped and continues without deleting the token when either !relationship or !token.entityId. This causes expired tokens to accumulate in the database indefinitely, directly contradicting the stated goal to "prevent the database from being cluttered with unused tokens."
The token should always be deleted (it's the primary artifact to clean up). Only the entity and related corpus entries should be conditionally skipped when entityId is missing. The relationship, if present, should also always be deleted.
🛠️ Suggested conditional cleanup
- if (!relationship || !token.entityId) {
- stats.skipped++;
- continue;
- }
-
- await prisma.$transaction(async (tx) => {
- await tx.relationship.delete({ where: { id: relationship.id } });
- await tx.publicFormsTokens.delete({ where: { id: token.id } });
- await tx.new_corpus.deleteMany({ where: { entity_id: token.entityId } });
- await tx.entity.delete({ where: { id: token.entityId } });
- });
+ if (!token.entityId) {
+ stats.skipped++;
+ await prisma.$transaction(async (tx) => {
+ if (relationship) {
+ await tx.relationship.delete({ where: { id: relationship.id } });
+ }
+ await tx.publicFormsTokens.delete({ where: { id: token.id } });
+ });
+ continue;
+ }
+
+ await prisma.$transaction(async (tx) => {
+ if (relationship) {
+ await tx.relationship.delete({ where: { id: relationship.id } });
+ }
+ await tx.publicFormsTokens.delete({ where: { id: token.id } });
+ await tx.new_corpus.deleteMany({ where: { entity_id: token.entityId } });
+ await tx.entity.delete({ where: { id: token.entityId } });
+ });📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| const relationship = relationshipMap.get(token.productId); | |
| if (!relationship || !token.entityId) { | |
| stats.skipped++; | |
| continue; | |
| } | |
| await prisma.$transaction(async (tx) => { | |
| await tx.relationship.delete({ where: { id: relationship.id } }); | |
| await tx.publicFormsTokens.delete({ where: { id: token.id } }); | |
| await tx.new_corpus.deleteMany({ where: { entity_id: token.entityId } }); | |
| await tx.entity.delete({ where: { id: token.entityId } }); | |
| }); | |
| const relationship = relationshipMap.get(token.productId); | |
| if (!token.entityId) { | |
| stats.skipped++; | |
| await prisma.$transaction(async (tx) => { | |
| if (relationship) { | |
| await tx.relationship.delete({ where: { id: relationship.id } }); | |
| } | |
| await tx.publicFormsTokens.delete({ where: { id: token.id } }); | |
| }); | |
| continue; | |
| } | |
| await prisma.$transaction(async (tx) => { | |
| if (relationship) { | |
| await tx.relationship.delete({ where: { id: relationship.id } }); | |
| } | |
| await tx.publicFormsTokens.delete({ where: { id: token.id } }); | |
| await tx.new_corpus.deleteMany({ where: { entity_id: token.entityId } }); | |
| await tx.entity.delete({ where: { id: token.entityId } }); | |
| }); |
🤖 Prompt for AI Agents
In `@tests/unsubmitted_forms/cleanup_unsubmitted_forms.ts` around lines 71 - 83,
The current loop increments stats.skipped and continues when !relationship ||
!token.entityId, which leaves expired public form tokens undeleted; change the
logic so the token deletion always runs (call tx.publicFormsTokens.delete for
token.id inside the transaction) while conditionally deleting relationship (if
relationship exists, call tx.relationship.delete with relationship.id) and
conditionally deleting corpus/entity only when token.entityId is present
(tx.new_corpus.deleteMany and tx.entity.delete with token.entityId); keep the
whole set of deletes inside prisma.$transaction to ensure atomicity and remove
the stats.skipped increment only for truly skipped entity/relationship cleanup
cases, not for leaving tokens behind.
| } catch (error) { | ||
| stats.errors++; | ||
| console.error(`Error processing token ${token.token}:`, error); |
There was a problem hiding this comment.
Don’t log raw token strings.
Token values are credentials; logging them risks leakage. Log the token id or a masked prefix instead.
🔒 Mask token in logs
- console.error(`Error processing token ${token.token}:`, error);
+ console.error(`Error processing token ${token.id}:`, error);📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| } catch (error) { | |
| stats.errors++; | |
| console.error(`Error processing token ${token.token}:`, error); | |
| } catch (error) { | |
| stats.errors++; | |
| console.error(`Error processing token ${token.id}:`, error); |
🤖 Prompt for AI Agents
In `@tests/unsubmitted_forms/cleanup_unsubmitted_forms.ts` around lines 87 - 89,
The catch block currently logs the raw credential via token.token; change the
log to avoid exposing the full token by logging a non-sensitive identifier
(e.g., token.id) or a masked prefix of token.token (e.g., first 4 chars +
ellipsis) instead, and keep the existing error object in the message; update the
catch handling where stats.errors++ and console.error are used so it references
the safe identifier (token.id or maskedToken) rather than token.token.
Fixed multiple critical issues in the daily cleanup job that prevented expired tokens from being deleted. The job had date calculation bugs, inefficient queries, and poor error handling.
Issues Fixed
Performance Improvements
Key Changes
Date.now() - 7*24*60*60*1000(was missing*1000)createdAt: { lte: sevenDaysAgo }(wasgte/ltwindow)Trade-offs
loom video link: https://www.loom.com/share/ba4b4b51d94b47cb85c4ac83dd27336c
Summary by CodeRabbit
Chores