fix(github_graphql): paginate reviews beyond 100 and fix incremental sync sort order (DPROD-1260)#93
Conversation
…sync sort order (DPROD-1260) Two bugs prevented complete PR review collection in the github_graphql plugin: 1. Sort order was CREATED_AT but the incremental stop condition used UpdatedAt, causing missed updates for old PRs edited recently. Changed to UPDATED_AT DESC. 2. The reviews(first: 100) query silently truncated PRs with >100 reviews. Added fetchRemainingReviews() to follow PageInfo.hasNextPage cursors, and applied it in both ResponseParsers (new PRs + OPEN PR refresh). Added unit tests covering the pagination logic and stop-condition edge cases, and an e2e test verifying the extractor correctly filters PENDING reviews. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Review Summary by QodoFix PR review pagination and incremental sync sort order
WalkthroughsDescription• Fixed sort order mismatch: changed PR query from CREATED_AT to UPDATED_AT DESC to align incremental sync stop condition with cursor-based early exit • Added fetchRemainingReviews() function to paginate reviews beyond 100 per PR using GraphQL cursor pagination • Applied review pagination in both collection passes: new PRs and OPEN PR refresh • Added 11 unit tests covering pagination logic, stop conditions, and error handling • Added e2e test verifying PENDING reviews are correctly filtered by extractor Diagramflowchart LR
A["PR Query<br/>CREATED_AT DESC"] -->|"Changed to"| B["PR Query<br/>UPDATED_AT DESC"]
B -->|"Aligns with"| C["Incremental Sync<br/>Stop Condition"]
D["Reviews first: 100"] -->|"Truncates data"| E["Missing Reviews"]
D -->|"Fixed by"| F["fetchRemainingReviews<br/>Cursor Pagination"]
F -->|"Applied in"| G["New PRs Pass"]
F -->|"Applied in"| H["OPEN PR Refresh"]
G -->|"Produces"| I["Complete Review Data"]
H -->|"Produces"| I
File Changes1. backend/plugins/github_graphql/tasks/pr_collector.go
|
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #93 +/- ##
=======================================
Coverage 59.41% 59.41%
=======================================
Files 81 81
Lines 6105 6105
=======================================
Hits 3627 3627
Misses 2403 2403
Partials 75 75
Flags with carried forward coverage won't be shown. Click here to find out more. 🚀 New features to boost your workflow:
|
|
this PR should be updated according to #92 to track a divergence from upstream. |
| } | ||
|
|
||
| // fetchRemainingReviews pages through reviews beyond the first 100 for a PR. | ||
| func fetchRemainingReviews(client graphqlQuerier, owner, repo string, prNumber int, startCursor string) ([]GraphqlQueryReview, errors.Error) { |
There was a problem hiding this comment.
Asking for learning purposes, is there any possibility of rate limit?
| assert.Equal(t, 1, mock.call) | ||
| } | ||
|
|
||
| func TestFetchRemainingReviews_TwoPages(t *testing.T) { |
There was a problem hiding this comment.
Should we check more than two pages? just to check multiple iterations
| Prs []GraphqlQueryPr `graphql:"nodes"` | ||
| TotalCount graphql.Int | ||
| } `graphql:"pullRequests(first: $pageSize, after: $skipCursor, orderBy: {field: CREATED_AT, direction: DESC})"` | ||
| } `graphql:"pullRequests(first: $pageSize, after: $skipCursor, orderBy: {field: UPDATED_AT, direction: DESC})"` |
There was a problem hiding this comment.
Do we want to update an exiting test or add a test to ensure this behavior? Or isn't relevant? (asking for learning purposes)

Summary
Fixes two bugs in
github_graphqlPR collection that caused incomplete review data:pullRequestswas ordered byCREATED_ATbut the incremental sync stop condition comparesUpdatedAt. Switched toUPDATED_AT DESCso the cursor-based early exit fires correctly and recently-edited old PRs are not skipped.reviews(first: 100)silently dropped reviews beyond 100 per PR. AddedfetchRemainingReviews()which followsPageInfo.hasNextPagecursor pagination, applied in both collection passes (new PRs and OPEN PR refresh).Tests
pr_collector_test.go): coverfetchRemainingReviewspagination (single page, multi-page, nil PageInfo, query error), the incremental sync stop condition (nil since, stop-immediately, partial, empty list), and the review-merge guard.e2e/pr_extractor_test.go): loads a raw PR fixture with 5 reviews (4 non-PENDING + 1 PENDING), runsExtractPrsMeta, and verifies exactly 4 rows in_tool_github_pull_request_reviews(PENDING filtered by extractor).Test plan
go test ./plugins/github_graphql/tasks/... -vgo test ./plugins/github_graphql/e2e/... -v(requires MySQLlake_test)Closes DPROD-1260
🤖 Generated with Claude Code