feat(mwpw-184989): multi-page agent QA review with PR-aware routing by sanrai · Pull Request #535 · adobecom/caas

sanrai · 2026-06-23T23:36:50Z

Upgrades the agent QA review from a single hardcoded URL to a validated 5-page covering set, each exercising distinct CaaS code paths, with one combined comment.

A customer-success-stories (Left filter, OR semantics, search/sort/deep-link/bookmarks/chips/no-results)
B news.adobe.com (Top filter panel)
C adobe.com/events (event ordering + register/gating banner)
D adobe.com/max community (carousel)
E milo card-styles gallery (visual diff of all card styles)

PR-aware routing: a shared change (Card.jsx / Container / Helpers / any LESS) reviews all pages; a localized change reviews only the pages whose code paths it touches; a .github-only change runs a single smoke page. OR-filter semantics are encoded in the page A/B instructions so the agent does not flag the union-widening as a bug. a11y interaction checks deferred to a later iteration. Per-page agent time-capped; job timeout raised to 30 min.

github-actions · 2026-06-23T23:39:15Z

🤖 Agent QA review — multi-page, interactive + visual diff (advisory, non-blocking)

Reviewed 1 page(s) on the live site with the PR build injected. Overall: PASS.

PASS — A-left-hub · 0.00% pixels changed · https://business.adobe.com/customer-success-stories.html

QA Review Report — PR #535 (feat(mwpw-184989): multi-page agent QA review with PR-aware routing)

Target URL: https://business.adobe.com/customer-success-stories.html

Pre-captured Pixel Diff

0.00% pixels changed between PR build and stable. No magenta regions were present in the diff image — the PR introduces zero visual rendering changes to this page.

PR Code Changes Summary

This PR refactors the agent-review.mjs CI script from single-page to multi-page QA review:

Removes single hardcoded BASE_URL / CAASVER env vars
Introduces a PAGES array to drive per-page diff + review
Adds OR-filter documentation string (OR constant)
Minor code style cleanup (variable renames, comment rewrites)
No changes to any CaaS front-end code (no card, filter, search, sort, or pagination logic changed)

Live Page Observations

Initial Load

Page loaded successfully at https://business.adobe.com/customer-success-stories.html
Hero section with featured customer stories rendered correctly
Breadcrumb navigation (Home / Resource Center / Customer Success Stories) present and correct
Global nav (Adobe for Business, Products, AI, Industries, Roles, Resources, Support, Get Started, Sign In) all visible and properly rendered

Filter Panel

"Refine Your Results" left panel rendered correctly with "Clear" link
Result count displayed: 344 results — visible and correctly positioned
Sort dropdown "Date: (Newest To Oldest)" present and functional-looking
Three filter groups present: Products, Content Type, Industry — all collapsed by default (correct)
Search input box present with placeholder "Search Here" and clear (×) button

Filter Expansion Test

Clicked "Products" filter header → expanded correctly, revealing a full alphabetical list of product checkboxes (Acrobat, Acrobat Sign, Advertising, Analytics, Campaign, Commerce, Connect, Creative Cloud, Customer Journey Analytics, Experience Manager Assets/Forms/Guides/Sites, GenStudio for…)
Checkboxes rendered with correct styling, no visual overflow or truncation
No console errors observed during interaction

Card Grid

Cards rendered in a 2-column grid layout (correct for left-filter pages)
Card images, titles, descriptions, and CTA buttons ("Read article", "Read now") all present and correctly aligned
Cards visible: Walt Disney Imagineering, Genpact, Best Buy, Lumen — all properly structured

Issues Found

None. The PR makes no changes to any front-end CaaS component. The 0.00% pixel diff confirms no visual regression. The page loads, filters expand, cards render, and the result count displays correctly.

Verdict

The PR is a CI/tooling-only change (agent-review.mjs refactor for multi-page QA support). It has no impact on the rendered customer-success-stories page. All visible UI elements — filter panel, search, sort, card grid, result count — appear correct and unbroken.

Per-page screenshots + diffs in the workflow run.

github-actions · 2026-06-23T23:40:38Z

AI Code Review

Critical Issues

1. Sequential page loop with shared CDP browser — race condition / connection leak

In captureDiff, for each page the code calls chromium.connectOverCDP(CDP) and then browser.close(). The comment from the original code explains why this matters:

"Disconnect THIS CDP client (does not kill the persistent Chrome) so the qa-runner subprocess is the sole Playwright client — two concurrent clients on one browser cause navigation timeouts and hangs."

However, in the new loop:

for (const page of selected) {
  const pct = await captureDiff(page.url, page.id);   // closes CDP client
  // ... then spawnSync qa-runner ...                  // qa-runner connects
  // ... qa-runner finishes ...
  // next iteration: captureDiff connects again
}

This is sequential so the logic is preserved — but browser.close() on a CDP-connected browser in Playwright disconnects the client AND may terminate the underlying browser process (behavior varies by Playwright version). If the persistent Chrome is killed after the first page, subsequent iterations fail silently (caught by try/catch), producing pct = 'n/a' for all remaining pages with no screenshots. This was always a latent risk in the original but is now multiplied across N pages.

Fix: Use browser.disconnect() instead of browser.close() to safely detach without killing the persistent Chrome, consistent with the original intent expressed in the comment (which was deleted in this PR).

2. AGENT_TIMEOUT_MS reduced from 600 s to 180 s per page, but up to 5 pages can run

With selected = PAGES (shared file change), 5 pages × 180 s = 900 s = 15 minutes of agent time, plus capture overhead. The workflow timeout-minutes was only raised to 30 (1800 s). This is borderline but likely fine for the common case — however, if any spawnSync times out and emits SIGKILL, the subprocess doesn't clean up its CDP connection, potentially leaving the browser in a bad state for subsequent iterations.

3. `REPORT_OUT` shadowed inside loop — `unlinkSync` on previous iteration's file

const REPORT_OUT = `/tmp/pr-review-${page.id}.json`;
try { unlinkSync(REPORT_OUT); } catch {}

Each iteration uses a unique filename, so no actual collision. This is fine. (Not an issue — noting for completeness.)

4. `spawnSync` with `stdio: 'inherit'` inside a `for` loop — zombie Chrome connections

spawnSync is synchronous and blocks. If qa-runner-v2.mjs connects to CDP and then crashes (non-timeout), the Chrome session may be left with an open page. The next captureDiff call then gets browser.contexts()[0] which could be a stale/broken context from the crashed runner rather than a clean one.

Fix: After each spawnSync, consider navigating or closing any lingering pages before the next captureDiff connects, or always create a fresh context (browser.newContext()) rather than reusing contexts()[0].

Summary

Severity	Issue
High	`browser.close()` may kill the persistent Chrome; use `browser.disconnect()`
Medium	Crashed/killed qa-runner leaves stale CDP context; next `captureDiff` reuses it

… gallery instr

github-actions · 2026-06-24T00:05:23Z

AI Code Review

Critical Issues

1. Sequential CDP connections cause browser resource exhaustion / hangs

In captureDiff(), you call chromium.connectOverCDP(CDP) and then browser.close() for each page in the loop. The comment in the old code explicitly warned that two concurrent CDP clients cause timeouts, but now you're serially connecting/disconnecting across multiple pages. The browser.close() call on a CDP-connected browser (as opposed to a browser launched by Playwright) typically kills the underlying browser process, not just disconnects the client. Subsequent iterations will fail to connect since the persistent Chrome is gone.

Fix: Connect once before the loop, disconnect (not close) after, or use browser.disconnect() instead of browser.close().

2. `spawnSync` timeout is per-page at 180s but overall job timeout is 30min

With up to 5 pages × (180s agent + ~37s capture) ≈ ~18 minutes just for agents, plus capture overhead and sequential execution, you're cutting it very close to the 30-minute job timeout. If any single page's capture hangs (e.g., waitForTimeout(22000) for page E plus 45s goto timeout × 2 shots), the job will be killed mid-run with no comment posted—silent failure, no PR comment.

Fix: Add a top-level try/finally that always posts whatever sections have been collected, or reduce per-page timeouts.

3. `AGENT_TIMEOUT_MS` env var read inside the loop but set once externally

timeout: Number(env('AGENT_TIMEOUT_MS', '180000')),

The default changed from 600000ms (10 min) to 180000ms (3 min) per page. If the env var is set to the old value (600s) expecting a single agent run, all pages will now each get 600s = potentially 50+ minutes total, guaranteeing a job timeout. This is a behavioral change with no guard—callers using the old env var will silently get the wrong behavior.

4. Data loss: `changed` variable computed from `meta.files` paths only—no content

const changed = (meta.files || []).map((f) => f.path).join('\n');

The triggers regex is tested against changed (file paths), but Card\.jsx would only match if the literal string Card.jsx appears in the path. Since most paths are like src/js/Cards/Card.jsx, this works—but Helpers\/ with an escaped slash will never match a file path string because the path uses / not \/. The regex /Helpers\//i tested against a plain string with forward slashes works fine in JS (no escaping needed), but Helpers\/ in the regex with an escaped forward slash still matches a literal /—this is actually okay in JS. However, SHARED regex /Card\.jsx|Container\.jsx|Helpers\/|app\.jsx|\.less/i and page triggers will silently produce wrong selected if meta.files is empty or the API call fails (caught with empty catch {}), falling through to selected = [PAGES[0]] smoke only—potentially missing all relevant pages with no warning logged.

5. Race condition: `REPORT_OUT` path collision if pages have similar IDs

Not a race here since it's sequential, but page IDs must be filesystem-safe. The current IDs (A-left-hub, etc.) contain hyphens which are fine, but this is fragile if IDs are ever parameterized from untrusted input (e.g., PR branch names injected elsewhere).

Most Critical Action Items:

Replace browser.close() with browser.disconnect() in captureDiff() to avoid killing the persistent Chrome between iterations.
Add a top-level try/finally to post partial results if the loop is interrupted before completion.

…cade

github-actions · 2026-06-24T00:51:31Z

AI Code Review

Issues Found

1. Sequential per-page agent runs may silently starve on timeout

Each page now runs spawnSync with AGENT_TIMEOUT_MS defaulting to 180000 ms (3 min). With up to 5 pages selected, total wall time can reach ~15+ minutes (5 × 3 min agents + capture time), but the job timeout-minutes is only 30. If isShared is true, all 5 pages run. Late pages will be killed mid-run by the workflow timeout with no comment posted at all — not even a partial result — because sections won't accumulate enough entries to post the combined comment.

Fix: Either cap selected pages, or ensure the comment is posted even on early exit (e.g., move the comment-posting into a finally block or post after each page).

2. CDP browser left connected across sequential `captureDiff` calls

captureDiff calls browser.close() at the end, but only inside the try block. If an exception is thrown after connectOverCDP but before browser.close() (e.g., during screenshot), the browser connection leaks. More critically, qa-runner-v2.mjs also connects via CDP immediately after. With 5 pages, each loop iteration calls captureDiff then spawnSync(qa-runner-v2.mjs) — if captureDiff throws partway, the orphaned CDP client may block the runner's connection.

Fix: Use a try/finally to guarantee browser.close().

3. `PNG.bitblt` source bounds are not validated — potential silent data corruption

In the non-segmented path, PNG.bitblt(a, ca, 0, 0, w, Ht, 0, 0) copies w×Ht pixels from a, but a's actual dimensions may differ from w (which is Math.min(a.width, b.width)). While w is min of widths, if a.width > w the blit reads from a larger source into a smaller destination buffer — this is the same bug as the original code, but it's still present. The segmented path has the same issue with bh.

This is low-risk in practice since pixelmatch uses the destination buffer size, but PNG.bitblt can throw or corrupt if source region extends past source bounds.

4. `changed` variable used before the `diff` try/catch completes

let diff = ''; try { diff = gh(['pr', 'diff', PR, '-R', REPO]); } catch {}
const changed = (meta.files || []).map((f) => f.path).join('\n');

changed depends on meta.files, not diff, so this is fine — but SHARED.test(changed) and p.triggers.test(changed) test against the newline-joined file paths. If changed is empty (e.g., API failure returned empty files), selected falls back to [PAGES[0]], which is correct. Not a bug, but worth noting the fallback is silent.

5. No deduplication of diff images between pages when all 5 pages run

Each captureDiff writes ${OUT}/${tag}-pr.png and ${OUT}/${tag}-stable.png. Tags are unique per page (page.id), so no collision — not a bug. However, for the gallery page with wait: 22000, the sequential loop adds significant time that is not accounted for in the timeout analysis from issue #1.

Highest priority fix: Wrap the for loop in a try/finally to post sections (even if partial) on any early exit, and ensure browser.close() is in a finally inside captureDiff.

…enting, lower turn budget

github-actions · 2026-06-24T06:55:05Z

AI Code Review

Critical Issues

1. CDP Resource Leak / Broken Isolation Between Pages

In captureDiff, a new browser context is created (or the existing one reused) via:

const c = browser.contexts()[0] || (await browser.newContext());

At the end of the function, browser.close() is called. However, this function is called in a loop (for (const page of selected)). On the second iteration, chromium.connectOverCDP(CDP) reconnects to the same persistent Chrome instance — but the comment in the old code notes that browser.close() "does not kill the persistent Chrome." The concern here is correctness: if browser.close() does close the context/pages created within it, the CDP connection state between iterations may be inconsistent. More critically, the page routes registered via p.route() are scoped to the page, but the context (c) may be the shared contexts()[0]. If a context is reused across calls and not properly cleaned up, route handlers from one page iteration could leak into the next, causing the wrong build to be injected on subsequent pages.

2. Per-Page Timeout Drastically Reduced — Likely to Cause Silent Failures

The per-page agent timeout was changed from 600000ms (10 min) to 120000ms (2 min), while the number of pages can be up to 5 (all pages if shared files changed). The agent instructions for page A alone specify: apply filter, search, paginate, check URL, bookmark, click card — plus load screenshot, run axe, get console errors. 2 minutes is almost certainly insufficient for this scope.

The fallback when timed out is a partial report with verdict = 'UNKNOWN', which rolls into overall = 'MIXED' (not FAIL). This means real regressions discovered during a partial run will be silently downgraded from FAIL to MIXED.

3. `spawnSync` with `stdio: 'inherit'` in a Loop — Blocks but No Error Surfacing

spawnSync is used correctly for sequential execution, but run.error is only checked for ETIMEDOUT. If the subprocess fails for other reasons (missing module, JS error, non-zero exit), the error is silently swallowed — report stays as '(no report produced)' and verdict stays 'UNKNOWN'. This masks broken runner invocations.

Fix: Check run.status !== 0 && !timedOut and log/report accordingly.

4. Workflow Timeout May Still Be Insufficient

With 5 pages × (15s wait × 2 shots + 2s + diff computation + 120s agent) ≈ ~12 min just for the happy path, plus GitHub API calls, the 30-minute job timeout is tight and could cause the workflow to be killed before posting the comment, resulting in no PR feedback at all (silent failure).

feat(mwpw-184989): multi-page agent qa review with pr-aware page routing

eec64e0

sanrai temporarily deployed to github-pages June 23, 2026 23:38 — with GitHub Actions Inactive

fix(mwpw-184989): full-page diff for gallery, more turns, visual-only…

4b90efe

… gallery instr

sanrai temporarily deployed to github-pages June 24, 2026 00:05 — with GitHub Actions Inactive

fix(mwpw-184989): segment gallery diff into bands, prime agent on cas…

ca452d8

…cade

sanrai had a problem deploying to github-pages June 24, 2026 00:48 — with GitHub Actions Failure

perf(mwpw-184989): scope filter selectors per page, drop gallery segm…

08ddc5d

…enting, lower turn budget

sanrai temporarily deployed to github-pages June 24, 2026 06:55 — with GitHub Actions Inactive

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat(mwpw-184989): multi-page agent QA review with PR-aware routing#535

feat(mwpw-184989): multi-page agent QA review with PR-aware routing#535
sanrai wants to merge 4 commits into
mainfrom
sanrai/agent-multipage

sanrai commented Jun 23, 2026

Uh oh!

github-actions Bot commented Jun 23, 2026 •

edited

Loading

QA Review Report — PR #535 (feat(mwpw-184989): multi-page agent QA review with PR-aware routing)

Target URL: https://business.adobe.com/customer-success-stories.html

Pre-captured Pixel Diff

PR Code Changes Summary

Live Page Observations

Issues Found

Verdict

Uh oh!

github-actions Bot commented Jun 23, 2026

Uh oh!

github-actions Bot commented Jun 24, 2026

Uh oh!

github-actions Bot commented Jun 24, 2026

Uh oh!

github-actions Bot commented Jun 24, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

sanrai commented Jun 23, 2026

Uh oh!

github-actions Bot commented Jun 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🤖 Agent QA review — multi-page, interactive + visual diff (advisory, non-blocking)

QA Review Report — PR #535 (feat(mwpw-184989): multi-page agent QA review with PR-aware routing)

Target URL: https://business.adobe.com/customer-success-stories.html

Pre-captured Pixel Diff

PR Code Changes Summary

Live Page Observations

Issues Found

Verdict

Uh oh!

github-actions Bot commented Jun 23, 2026

AI Code Review

Critical Issues

1. Sequential page loop with shared CDP browser — race condition / connection leak

2. AGENT_TIMEOUT_MS reduced from 600 s to 180 s per page, but up to 5 pages can run

3. REPORT_OUT shadowed inside loop — unlinkSync on previous iteration's file

4. spawnSync with stdio: 'inherit' inside a for loop — zombie Chrome connections

Summary

Uh oh!

github-actions Bot commented Jun 24, 2026

AI Code Review

Critical Issues

1. Sequential CDP connections cause browser resource exhaustion / hangs

2. spawnSync timeout is per-page at 180s but overall job timeout is 30min

3. AGENT_TIMEOUT_MS env var read inside the loop but set once externally

4. Data loss: changed variable computed from meta.files paths only—no content

5. Race condition: REPORT_OUT path collision if pages have similar IDs

Most Critical Action Items:

Uh oh!

github-actions Bot commented Jun 24, 2026

AI Code Review

Issues Found

1. Sequential per-page agent runs may silently starve on timeout

2. CDP browser left connected across sequential captureDiff calls

3. PNG.bitblt source bounds are not validated — potential silent data corruption

4. changed variable used before the diff try/catch completes

5. No deduplication of diff images between pages when all 5 pages run

Uh oh!

github-actions Bot commented Jun 24, 2026

AI Code Review

Critical Issues

1. CDP Resource Leak / Broken Isolation Between Pages

2. Per-Page Timeout Drastically Reduced — Likely to Cause Silent Failures

3. spawnSync with stdio: 'inherit' in a Loop — Blocks but No Error Surfacing

4. Workflow Timeout May Still Be Insufficient

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

github-actions Bot commented Jun 23, 2026 •

edited

Loading

3. `REPORT_OUT` shadowed inside loop — `unlinkSync` on previous iteration's file

4. `spawnSync` with `stdio: 'inherit'` inside a `for` loop — zombie Chrome connections

2. `spawnSync` timeout is per-page at 180s but overall job timeout is 30min

3. `AGENT_TIMEOUT_MS` env var read inside the loop but set once externally

4. Data loss: `changed` variable computed from `meta.files` paths only—no content

5. Race condition: `REPORT_OUT` path collision if pages have similar IDs

2. CDP browser left connected across sequential `captureDiff` calls

3. `PNG.bitblt` source bounds are not validated — potential silent data corruption

4. `changed` variable used before the `diff` try/catch completes

3. `spawnSync` with `stdio: 'inherit'` in a Loop — Blocks but No Error Surfacing