fix: Correct date logic and relationship matching in unsubmitted forms cleanup job by hungaikev · Pull Request #27 · vula-africa/coding_tests

hungaikev · 2026-02-18T13:34:58Z

Summary

Fixes critical bugs in the cleanup_unsubmitted_forms scheduled job that prevented it from correctly identifying and deleting expired unsubmitted forms. The job now properly handles date filtering, relationship matching, and includes improved error handling and performance optimizations.

Issues Fixed

🔴 Critical Bugs

Incorrect date logic - Changed gte (greater than or equal) to lte (less than or equal) to correctly find tokens older than 7 days
Missing milliseconds in date calculation - Added * 1000 to properly convert days to milliseconds
Incorrect relationship matching - Relationship query now matches by both product_id AND entity_id to prevent deleting wrong relationships
Unsafe null handling - Replaced token.entityId || "" with proper null checks to avoid deleting records with empty string IDs
Incorrect deletion order - Fixed order to: Corpus → Relationship → Token → Entity to avoid foreign key constraint violations

⚠️ Improvements

Added batch processing - Processes tokens in batches of 100 to handle large datasets efficiently
Improved error handling - Per-token try-catch blocks prevent one failure from stopping the entire job
Enhanced observability - Added comprehensive statistics tracking (tokens found, deleted, skipped, errors)
Better transaction handling - Switched to transaction callback form for improved error handling

Changes

Fixed date query to use lte: sevenDaysAgo instead of gte: sevenDaysAgo
Added entity_id filter to relationship query to ensure correct matching
Implemented batch processing with configurable BATCH_SIZE (100)
Added upfront filtering of tokens without entityId
Fixed deletion order to respect foreign key constraints
Added per-token error handling with detailed logging
Implemented statistics tracking for monitoring and debugging

Testing

Verify date logic correctly identifies tokens older than 7 days
Test relationship matching with multiple tokens sharing same product_id but different entity_id
Confirm deletion order doesn't violate foreign key constraints
Test error handling when individual token processing fails
Verify batch processing handles large datasets correctly

Performance Impact

Before: Sequential processing, N+1 queries for relationships
After: Batch processing, bulk relationship queries, ~10x fewer database queries for large datasets

Trade-offs

Batch size: Set to 100 as a balance between memory usage and query efficiency. Can be adjusted based on production metrics.
Error handling: Individual token failures are logged but don't stop the job, ensuring maximum cleanup coverage even with partial failures.
Relationship query: Now requires both product_id and entity_id, which is more restrictive but ensures correctness.

Summary by CodeRabbit

Chores
- Improved efficiency of cleanup process with batched operations and cursor-based pagination.
- Enhanced error handling and logging to track processing statistics and operations.

coderabbitai · 2026-02-18T13:35:25Z

Walkthrough

Refactors cleanup_unsubmitted_forms to use cursor-based, batched processing (BATCH_SIZE=100) with deterministic sorting, bulk relationship prefetch, per-token transactional deletions, immediate deletion of tokens without entityId, per-batch stats (tokensFound/tokensDeleted/errors/skipped), and final logging.

Changes

Cohort / File(s)	Summary
Token Cleanup Batching & Transaction Workflow `tests/unsubmitted_forms/cleanup_unsubmitted_forms.ts`	Adds cursor-based pagination (lastId) and deterministic ordering (createdAt asc, id asc) with BATCH_SIZE=100. Splits tokens by presence of entityId; deletes tokens without entityId immediately. Prefetches relationships in bulk (status "new"), maps by product_id/entity_id, processes tokens per-batch with per-token error handling and stats tracking. Per-token transactional steps: delete corpus items, delete relationships, delete token, then delete entity. Logs per-batch summary and overall completion; on outer failure logs aggregated stats. Removes previous date-window single-query approach.

Sequence Diagram(s)

sequenceDiagram
    rect rgba(200,230,255,0.5)
    participant Job as JobScheduleQueue
    end
    rect rgba(220,255,220,0.5)
    participant DB as Database
    end
    rect rgba(255,240,200,0.5)
    participant Logger as Logger
    end

    Job->>DB: fetch batch of tokens (createdAt asc, id asc) where createdAt <= sevenDaysAgo, limit BATCH_SIZE using lastId
    DB-->>Job: tokens[]
    Job->>DB: fetch relationships in bulk for tokens' productIds/entityIds (status="new")
    DB-->>Job: relationships[]
    loop per token
      alt token has no entityId
        Job->>DB: delete token
        DB-->>Job: delete result
        Job->>Logger: increment tokensDeleted
      else token has entityId
        Job->>DB: begin transaction
        DB-->>Job: tx started
        Job->>DB: delete corpus items for entity
        Job->>DB: delete matching relationships
        Job->>DB: delete token
        Job->>DB: delete entity
        DB-->>Job: commit / or error
        Job->>Logger: update tokensDeleted or errors
      end
    end
    Job->>Logger: log batch summary (tokensFound, tokensDeleted, skipped, errors)
    opt more tokens (lastId advanced)
      Job->>DB: fetch next batch
    end
    Job->>Logger: log completion summary or failure

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 Batches hop by, a tidy little spree,
I nibble old tokens, set the data free,
Transactions snug, relationships flee,
Counts tallied neat — a clean garden, whee! 🥕

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately describes the main changes: it fixes date logic (lte operator, millisecond conversion) and relationship matching (requiring both product_id and entity_id), which are the primary bug fixes in this changeset.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (2)

tests/unsubmitted_forms/cleanup_unsubmitted_forms.ts (2)

59-66: Expired tokens without entityId are never deleted and accumulate indefinitely.

Tokens failing the entityId filter are skipped but remain in the database. Since they're already past the 7-day threshold, they'll be re-fetched on every subsequent run without ever being cleaned up. Consider deleting these tokens (they have no entity to cascade to, so a simple delete suffices).

Proposed fix

       // Filter out tokens without entityId upfront
       const validTokens = expiredTokens.filter((t) => t.entityId);
-      stats.skipped += expiredTokens.length - validTokens.length;
+      const invalidTokens = expiredTokens.filter((t) => !t.entityId);
+      if (invalidTokens.length > 0) {
+        await prisma.publicFormsTokens.deleteMany({
+          where: { token: { in: invalidTokens.map((t) => t.token) } },
+        });
+        stats.tokensDeleted += invalidTokens.length;
+      }

       if (validTokens.length === 0) {

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@tests/unsubmitted_forms/cleanup_unsubmitted_forms.ts` around lines 59 - 66,
Expired tokens filtered out by the entityId check (expiredTokens.filter((t) =>
t.entityId)) are never removed and will reappear on subsequent runs; when
validTokens.length === 0, delete the expiredTokens that lack an entityId instead
of only skipping them: call the existing delete/remove mechanism for those
tokens, increment the appropriate stats counter (e.g., stats.deleted) by the
number removed, and then continue advancing skip by BATCH_SIZE; reference
validTokens, expiredTokens, stats.skipped, BATCH_SIZE and the skip variable to
locate where to perform the deletion.

76-88: Good batch pre-fetch pattern. One edge case: if multiple relationships exist for the same product_id:entity_id pair with status "new", the Map keeps only the last one, leaving the others orphaned until the next run.

If this is a realistic scenario, consider using a Map<string, Relationship[]> (grouping by key) to process all matches.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@tests/unsubmitted_forms/cleanup_unsubmitted_forms.ts` around lines 76 - 88,
The current grouping uses relationshipMap = new Map(relationships.map((r) =>
[`${r.product_id}:${r.entity_id}`, r])) which drops duplicates for the same
product_id:entity_id; change this to build a Map<string, Relationship[]> by
pushing each relationship into an array keyed by
`${r.product_id}:${r.entity_id}` (use the existing prisma.relationship.findMany
result), then when processing use the array for each key to handle every
matching relationship instead of only the last one—update any downstream code
that reads relationshipMap to iterate the arrays for full processing.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/unsubmitted_forms/cleanup_unsubmitted_forms.ts`:
- Around line 44-53: The query using prisma.publicFormsTokens.findMany with
skip/take and no orderBy causes non-deterministic pagination; change to a
deterministic pagination strategy by adding an explicit orderBy (e.g., {
createdAt: 'asc', id: 'asc' } to break ties) and replace skip-based paging with
cursor-based paging (use the last returned record's id as cursor in subsequent
calls) or, if keeping offset, ensure orderBy uses createdAt and id so pages are
deterministic; update the loop to track the last cursor (or last createdAt+id)
and use prisma's cursor/skip semantics accordingly (reference
prisma.publicFormsTokens.findMany, BATCH_SIZE, skip, sevenDaysAgo).
- Line 42: The current offset-based pagination using skip and BATCH_SIZE in
tests/unsubmitted_forms/cleanup_unsubmitted_forms.ts loses rows when deletes
occur; replace it with cursor-based pagination: stop using skip, introduce a
lastId (or cursor) variable initialized to 0/null, change the query that
currently uses OFFSET skip to instead filter by the primary key (e.g.
publicFormsTokens.id) with "WHERE id > lastId ORDER BY id ASC LIMIT BATCH_SIZE"
(adjust the id field name if your PK is different), and at the end of each loop
set lastId to the max id from the batch (or the last row's id) instead of doing
skip += BATCH_SIZE so deletions won't cause missed rows.

---

Nitpick comments:
In `@tests/unsubmitted_forms/cleanup_unsubmitted_forms.ts`:
- Around line 59-66: Expired tokens filtered out by the entityId check
(expiredTokens.filter((t) => t.entityId)) are never removed and will reappear on
subsequent runs; when validTokens.length === 0, delete the expiredTokens that
lack an entityId instead of only skipping them: call the existing delete/remove
mechanism for those tokens, increment the appropriate stats counter (e.g.,
stats.deleted) by the number removed, and then continue advancing skip by
BATCH_SIZE; reference validTokens, expiredTokens, stats.skipped, BATCH_SIZE and
the skip variable to locate where to perform the deletion.
- Around line 76-88: The current grouping uses relationshipMap = new
Map(relationships.map((r) => [`${r.product_id}:${r.entity_id}`, r])) which drops
duplicates for the same product_id:entity_id; change this to build a Map<string,
Relationship[]> by pushing each relationship into an array keyed by
`${r.product_id}:${r.entity_id}` (use the existing prisma.relationship.findMany
result), then when processing use the array for each key to handle every
matching relationship instead of only the last one—update any downstream code
that reads relationshipMap to iterate the arrays for full processing.

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (3)

tests/unsubmitted_forms/cleanup_unsubmitted_forms.ts (3)

134-159: LGTM — deletion order and transaction usage are correct.

The FK-safe order (corpus → relationship → token → entity) inside prisma.$transaction is correct, and the per-token try/catch ensures one failure doesn't abort the entire batch.

One edge case worth being aware of: if the entity already has non-"new" relationships (e.g., a previously submitted form), tx.entity.delete at line 156 will fail on a FK constraint, rolling back the entire transaction. The token and entity are left intact (correct behavior — don't delete a live entity), but the error message "Error processing token …" gives no hint that this was the intentional guard. Consider adding a pre-check or a more descriptive catch message to distinguish this expected skip from an actual error.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@tests/unsubmitted_forms/cleanup_unsubmitted_forms.ts` around lines 134 - 159,
The current per-token transaction may fail on tx.entity.delete when the entity
has non-"new" relationships, but the catch only logs a generic "Error processing
token…" message; add a pre-check using the relationshipsForToken array or a
query inside the transaction (e.g., count non-"new" relationships for
token.entityId) before attempting tx.entity.delete to detect this expected
condition and skip deletion, and/or update the catch block around
prisma.$transaction to log a distinct, descriptive message when the failure is
due to existing non-"new" relationships (referencing prisma.$transaction,
relationshipsForToken, tx.relationship.delete, and tx.entity.delete to locate
the code).

70-85: Minor perf: replace the per-token loop with a single deleteMany for tokensWithoutEntityId.

Up to BATCH_SIZE individual DELETE statements are issued sequentially when the batch contains tokens without entityId. A single deleteMany achieves the same result in one round-trip. If you still need per-token error visibility, the outer batch error count already provides aggregate signal.

⚡ Proposed refactor

-      if (tokensWithoutEntityId.length > 0) {
-        for (const token of tokensWithoutEntityId) {
-          try {
-            await prisma.publicFormsTokens.delete({
-              where: { token: token.token },
-            });
-            stats.tokensDeleted++;
-          } catch (error) {
-            stats.errors++;
-            console.error(
-              `Error deleting token without entityId ${token.token}:`,
-              error instanceof Error ? error.message : error
-            );
-          }
-        }
-      }
+      if (tokensWithoutEntityId.length > 0) {
+        try {
+          await prisma.publicFormsTokens.deleteMany({
+            where: { token: { in: tokensWithoutEntityId.map((t) => t.token) } },
+          });
+          stats.tokensDeleted += tokensWithoutEntityId.length;
+        } catch (error) {
+          stats.errors += tokensWithoutEntityId.length;
+          console.error(
+            `Error bulk-deleting ${tokensWithoutEntityId.length} tokens without entityId:`,
+            error instanceof Error ? error.message : error
+          );
+        }
+      }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@tests/unsubmitted_forms/cleanup_unsubmitted_forms.ts` around lines 70 - 85,
Replace the per-token delete loop that iterates over tokensWithoutEntityId and
calls prisma.publicFormsTokens.delete for each token with a single bulk call
using prisma.publicFormsTokens.deleteMany; construct the where clause to match
the tokens (e.g. where: { token: { in: tokensWithoutEntityId.map(t => t.token)
}, entityId: null }) and, on success, increment stats.tokensDeleted by the
returned count (result.count); in the catch increment stats.errors and log the
error — remove the individual delete calls and loop around
prisma.publicFormsTokens.delete.

93-99: Tokens with entityId but null productId are silently skipped — add a diagnostic log.

validTokens only filters on t.entityId; tokens with a non-null entityId but null productId are included in the loop but will never find a match in relationshipMap (key null:<entityId> never resolves to a relationship). They get counted as skipped without any indication of the root cause, making future debugging harder.

🔍 Suggested addition

       if (validTokens.length === 0) {
         lastId = expiredTokens[expiredTokens.length - 1].id;
         continue;
       }
+
+      const tokensWithoutProductId = validTokens.filter((t) => !t.productId);
+      if (tokensWithoutProductId.length > 0) {
+        console.warn(
+          `Skipping ${tokensWithoutProductId.length} token(s) with entityId but no productId`
+        );
+      }

Additionally, entityIds.filter(Boolean) on line 98 is redundant — validTokens already guarantees t.entityId is truthy. The filter(Boolean) on productIds (line 95) is necessary and should stay.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@tests/unsubmitted_forms/cleanup_unsubmitted_forms.ts` around lines 93 - 99,
The code silently skips tokens that have entityId but null productId; update the
productIds/entityIds collection so you log (via the existing logger) any token
where t.entityId is truthy but t.productId is falsy before they become
"skipped", and remove the redundant .filter(Boolean) on entityIds (since
validTokens guarantees entityId is present). Concretely: iterate validTokens,
push truthy productIds into productIds (keeping the existing filter for
productId), push t.entityId into entityIds without filtering, and when you
encounter t.entityId && !t.productId emit a diagnostic log mentioning the token
(or its id) so these cases are visible; keep using the same variables
productIds, entityIds and the relationshipMap lookup logic unchanged.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/unsubmitted_forms/cleanup_unsubmitted_forms.ts`:
- Around line 46-59: The current pagination uses a manual filter (id > lastId)
while ordering by [{ createdAt: "asc" }, { id: "asc" }], which can skip rows;
fix by making the ordering and cursor consistent: simplest fix — change the
query in prisma.publicFormsTokens.findMany to order only by id (orderBy: { id:
"asc"" }) and keep the filter ...(lastId ? { id: { gt: lastId } } : {}), or if
you must preserve createdAt ordering implement proper keyset pagination by
tracking the last row's composite (createdAt, id) and use a composite
cursor/where logic that advances by (createdAt > lastCreatedAt) OR (createdAt ==
lastCreatedAt AND id > lastId) while ordering by [{ createdAt: "asc" }, { id:
"asc" }]; update the variables lastId/lastCreatedAt and loop termination to
match the chosen approach.

---

Nitpick comments:
In `@tests/unsubmitted_forms/cleanup_unsubmitted_forms.ts`:
- Around line 134-159: The current per-token transaction may fail on
tx.entity.delete when the entity has non-"new" relationships, but the catch only
logs a generic "Error processing token…" message; add a pre-check using the
relationshipsForToken array or a query inside the transaction (e.g., count
non-"new" relationships for token.entityId) before attempting tx.entity.delete
to detect this expected condition and skip deletion, and/or update the catch
block around prisma.$transaction to log a distinct, descriptive message when the
failure is due to existing non-"new" relationships (referencing
prisma.$transaction, relationshipsForToken, tx.relationship.delete, and
tx.entity.delete to locate the code).
- Around line 70-85: Replace the per-token delete loop that iterates over
tokensWithoutEntityId and calls prisma.publicFormsTokens.delete for each token
with a single bulk call using prisma.publicFormsTokens.deleteMany; construct the
where clause to match the tokens (e.g. where: { token: { in:
tokensWithoutEntityId.map(t => t.token) }, entityId: null }) and, on success,
increment stats.tokensDeleted by the returned count (result.count); in the catch
increment stats.errors and log the error — remove the individual delete calls
and loop around prisma.publicFormsTokens.delete.
- Around line 93-99: The code silently skips tokens that have entityId but null
productId; update the productIds/entityIds collection so you log (via the
existing logger) any token where t.entityId is truthy but t.productId is falsy
before they become "skipped", and remove the redundant .filter(Boolean) on
entityIds (since validTokens guarantees entityId is present). Concretely:
iterate validTokens, push truthy productIds into productIds (keeping the
existing filter for productId), push t.entityId into entityIds without
filtering, and when you encounter t.entityId && !t.productId emit a diagnostic
log mentioning the token (or its id) so these cases are visible; keep using the
same variables productIds, entityIds and the relationshipMap lookup logic
unchanged.

coderabbitai · 2026-02-18T13:55:27Z

+    while (true) {
+      const expiredTokens = await prisma.publicFormsTokens.findMany({
+        where: {
+          createdAt: {
+            lte: sevenDaysAgo, // tokens older than or equal to 7 days
+          },
+          ...(lastId ? { id: { gt: lastId } } : {}), // cursor-based pagination
        },
-      },
-    });
+        orderBy: [
+          { createdAt: "asc" },
+          { id: "asc" }, // deterministic ordering, break ties with id
+        ],
+        take: BATCH_SIZE,
+      });


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

find . -name "cleanup_unsubmitted_forms.ts" -type f

Repository: vula-africa/coding_tests

Length of output: 122

🏁 Script executed:

cat -n tests/unsubmitted_forms/cleanup_unsubmitted_forms.ts | head -80

Repository: vula-africa/coding_tests

Length of output: 3752

🌐 Web query:

Prisma cursor pagination manual cursor filter orderBy multiple fields

💡 Result:

1) orderBy with multiple fields (stable ordering)

Use an orderBy array (or multi-key object) so ties on a non-unique field (like createdAt) are broken by a unique field (like id). This avoids duplicates/missing rows caused by unstable ordering. [1] [2]

orderBy: [{ createdAt: 'asc' }, { id: 'asc' }]

Prisma docs (example) show orderBy: [ { role: 'desc' }, { name: 'desc' } ]. [1]

2) Prisma “cursor” requires a unique selector

cursor is typed as ModelWhereUniqueInput, so it can only be a unique field (or a compound @@unique) — not arbitrary multi-column values. [3]

If you want a compound cursor, define a compound unique and use it as the cursor:

model Post { id String `@id` `@default`(cuid()) createdAt DateTime `@default`(now()) @@unique([createdAt, id]) // enables compound cursor input }

Then:

const page = await prisma.post.findMany({ take: 20, skip: 1, cursor: { createdAt_id: { createdAt: cursorCreatedAt, id: cursorId } }, orderBy: [{ createdAt: 'asc' }, { id: 'asc' }], })

(Prisma has an open request to support cursors containing all sort-column values without requiring a unique constraint.) [4]

3) “Manual cursor” pagination (cursor filter in where)

If you can’t (or don’t want to) add a compound @@unique, do keyset pagination by putting the cursor logic in where, while keeping the same multi-field orderBy:

ASC example (createdAt, then id):

const page = await prisma.post.findMany({ take: 20, where: cursor ? { OR: [ { createdAt: { gt: cursor.createdAt } }, { createdAt: cursor.createdAt, id: { gt: cursor.id } }, ], } : undefined, orderBy: [{ createdAt: 'asc' }, { id: 'asc' }], })

This matches the standard pattern (createdAt, id) > (cursorCreatedAt, cursorId) used to avoid ambiguity when createdAt is not unique. [2]

Why this matters

Using cursor: { id } while sorting by a non-unique field (like createdAt) can produce repeated results because the ordering isn’t stable unless you add a tie-breaker. [5] [2]

Sources: [1] Prisma “Filtering and sorting” docs (orderBy array example); [2] “Stable sorting in Prisma” (why tie-breakers matter); [3] Prisma client delegate reference (cursor is WhereUniqueInput); [4] Prisma issue proposing { createdAt, id } cursors; [5] Prisma issue reporting duplicates with cursor + orderBy createdAt.

🏁 Script executed:

# Check if publicFormsTokens has compound unique constraints rg "publicFormsTokens" tests/unsubmitted_forms/cleanup_unsubmitted_forms.ts -A 5 -B 5

Repository: vula-africa/coding_tests

Length of output: 1246

🏁 Script executed:

# Search for schema definition find . -name "*.prisma" -type f 2>/dev/null

Repository: vula-africa/coding_tests

Length of output: 50

🏁 Script executed:

# Search for schema or model definitions in common locations find . -type f $ -name "*.prisma" -o -name "schema.prisma" -o -name "package.json" $ | head -20

Repository: vula-africa/coding_tests

Length of output: 50

🏁 Script executed:

# Look for any reference to publicFormsTokens schema or type definitions rg "publicFormsTokens|@@unique|@unique" --type typescript --type prisma -B 2 -A 2 | head -100

Repository: vula-africa/coding_tests

Length of output: 102

🏁 Script executed:

# Check the complete pagination loop and how lastId is used cat -n tests/unsubmitted_forms/cleanup_unsubmitted_forms.ts | sed -n '40,120p'

Repository: vula-africa/coding_tests

Length of output: 3521

🏁 Script executed:

# Search for schema or prisma config files more broadly find . -type f -name "schema.*" -o -name "*.prisma" 2>/dev/null | head -20

Repository: vula-africa/coding_tests

Length of output: 50

🏁 Script executed:

# Check package.json or any documentation about the schema find . -type f $ -name "package.json" -o -name "README*" -o -name "SCHEMA*" $ | head -10

Repository: vula-africa/coding_tests

Length of output: 254

🏁 Script executed:

# Check if there are type definitions or any references to publicFormsTokens structure rg "publicFormsTokens" --max-count=20 -B 3 -A 3

Repository: vula-africa/coding_tests

Length of output: 1850

🏁 Script executed:

# Look at the complete function to understand context cat -n tests/unsubmitted_forms/cleanup_unsubmitted_forms.ts | tail -80

Repository: vula-africa/coding_tests

Length of output: 3270

Critical: cursor filter id > lastId is incompatible with composite orderBy [createdAt, id] — tokens will be permanently skipped between batches.

Using a manual cursor filter id > lastId only works when the result set is ordered by id as the primary key. With orderBy: [{ createdAt: "asc" }, { id: "asc" }], the page boundary is determined by createdAt first. Example of the bug:

Batch 1 ends with createdAt id

(lastId) 2025-02-05 z…

Batch 2 should include createdAt id

(skipped by id > 'z') 2025-02-06 a…

The token with createdAt=2025-02-06 and id='a' appears after the previous batch in sorted order, but the filter id > 'z' excludes it since 'a' < 'z' lexicographically. This token is never deleted in the current run.

Fix: Use one of these approaches:

Option A — simplest: order by `id` only

const expiredTokens = await prisma.publicFormsTokens.findMany({ where: { createdAt: { lte: sevenDaysAgo }, ...(lastId ? { id: { gt: lastId } } : {}), }, - orderBy: [ - { createdAt: "asc" }, - { id: "asc" }, - ], + orderBy: { id: "asc" }, take: BATCH_SIZE, });

Option B — keyset pagination with composite cursor logic

+ let lastCreatedAt: Date | null = null; let lastId: string | null = null; while (true) { const expiredTokens = await prisma.publicFormsTokens.findMany({ where: { createdAt: { lte: sevenDaysAgo }, - ...(lastId ? { id: { gt: lastId } } : {}), + ...(lastCreatedAt && lastId ? { + OR: [ + { createdAt: { gt: lastCreatedAt } }, + { createdAt: lastCreatedAt, id: { gt: lastId } }, + ], + } : {}), }, orderBy: [ { createdAt: "asc" }, { id: "asc" }, ], take: BATCH_SIZE, }); if (expiredTokens.length === 0) break; stats.tokensFound += expiredTokens.length; + const lastToken = expiredTokens[expiredTokens.length - 1]; + lastCreatedAt = lastToken.createdAt; - lastId = expiredTokens[expiredTokens.length - 1].id; + lastId = lastToken.id;

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

while (true) {

const expiredTokens = await prisma.publicFormsTokens.findMany({

where: {

createdAt: {

lte: sevenDaysAgo, // tokens older than or equal to 7 days

},

...(lastId ? { id: { gt: lastId } } : {}), // cursor-based pagination

},

},

});

orderBy: [

{ createdAt: "asc" },

{ id: "asc" }, // deterministic ordering, break ties with id

],

take: BATCH_SIZE,

});

while (true) {

const expiredTokens = await prisma.publicFormsTokens.findMany({

where: {

createdAt: {

lte: sevenDaysAgo, // tokens older than or equal to 7 days

},

...(lastId ? { id: { gt: lastId } } : {}), // cursor-based pagination

},

orderBy: { id: "asc" },

take: BATCH_SIZE,

});

Suggested change

while (true) {

const expiredTokens = await prisma.publicFormsTokens.findMany({

where: {

createdAt: {

lte: sevenDaysAgo, // tokens older than or equal to 7 days

},

...(lastId ? { id: { gt: lastId } } : {}), // cursor-based pagination

},

},

});

orderBy: [

{ createdAt: "asc" },

{ id: "asc" }, // deterministic ordering, break ties with id

],

take: BATCH_SIZE,

});

let lastCreatedAt: Date | null = null;

let lastId: string | null = null;

while (true) {

const expiredTokens = await prisma.publicFormsTokens.findMany({

where: {

createdAt: { lte: sevenDaysAgo },

...(lastCreatedAt && lastId ? {

OR: [

{ createdAt: { gt: lastCreatedAt } },

{ createdAt: lastCreatedAt, id: { gt: lastId } },

],

} : {}),

},

orderBy: [

{ createdAt: "asc" },

{ id: "asc" },

],

take: BATCH_SIZE,

});

if (expiredTokens.length === 0) break;

stats.tokensFound += expiredTokens.length;

const lastToken = expiredTokens[expiredTokens.length - 1];

lastCreatedAt = lastToken.createdAt;

lastId = lastToken.id;

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@tests/unsubmitted_forms/cleanup_unsubmitted_forms.ts` around lines 46 - 59, The current pagination uses a manual filter (id > lastId) while ordering by [{ createdAt: "asc" }, { id: "asc" }], which can skip rows; fix by making the ordering and cursor consistent: simplest fix — change the query in prisma.publicFormsTokens.findMany to order only by id (orderBy: { id: "asc"" }) and keep the filter ...(lastId ? { id: { gt: lastId } } : {}), or if you must preserve createdAt ordering implement proper keyset pagination by tracking the last row's composite (createdAt, id) and use a composite cursor/where logic that advances by (createdAt > lastCreatedAt) OR (createdAt == lastCreatedAt AND id > lastId) while ordering by [{ createdAt: "asc" }, { id: "asc" }]; update the variables lastId/lastCreatedAt and loop termination to match the chosen approach.

fix: unsubmitted forms

5803563

hungaikev changed the title ~~fix: unsubmitted forms~~ fix: Correct date logic and relationship matching in unsubmitted forms cleanup job Feb 18, 2026

coderabbitai Bot reviewed Feb 18, 2026

View reviewed changes

Comment thread tests/unsubmitted_forms/cleanup_unsubmitted_forms.ts Outdated

feedback

c31bfd2

coderabbitai Bot reviewed Feb 18, 2026

View reviewed changes

fixes

d98562f

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: Correct date logic and relationship matching in unsubmitted forms cleanup job#27

fix: Correct date logic and relationship matching in unsubmitted forms cleanup job#27
hungaikev wants to merge 3 commits into
vula-africa:mainfrom
hungaikev:fix/forms

hungaikev commented Feb 18, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Feb 18, 2026 •

edited

Loading

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot Feb 18, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

hungaikev commented Feb 18, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Issues Fixed

🔴 Critical Bugs

⚠️ Improvements

Changes

Testing

Performance Impact

Trade-offs

Related

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Feb 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Poem

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Feb 18, 2026

Choose a reason for hiding this comment

1) orderBy with multiple fields (stable ordering)

2) Prisma “cursor” requires a unique selector

3) “Manual cursor” pagination (cursor filter in where)

Why this matters

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

hungaikev commented Feb 18, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Feb 18, 2026 •

edited

Loading

1) `orderBy` with multiple fields (stable ordering)

3) “Manual cursor” pagination (cursor filter in `where`)