feat: add site-scoped preflight REST endpoints | SITES-44686#2478
feat: add site-scoped preflight REST endpoints | SITES-44686#2478GeezerPelletier wants to merge 19 commits into
Conversation
Implements three site-scoped preflight endpoints as replacements for the internal /preflight/beta/* endpoints: - POST /sites/:siteId/preflights — creates a Preflight + AsyncJob, fires Mysticat analyze, returns 202 with Location header - GET /sites/:siteId/preflights — lists all preflights for a site (optional ?url filter) - GET /sites/:siteId/preflights/:preflightId — returns full preflight detail Also adds: - PreflightDto (toJSON list shape + toDetailJSON detail shape) - preflightId UUID path-param validation in src/index.js - Access control via AccessControlUtil.hasAccess on all three endpoints - Removes /preflight/beta/* endpoints (replaced by above); /preflight/jobs endpoints are untouched Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
This PR will trigger a minor release when merged. |
…ospective feedback - Moves the Data Model section from ADR-002 into a standalone ADR-003 (003-preflight-entity-design.md) per ravverma's retrospective review feedback that the entity design warrants its own decision record - ADR-002 retains a short reference to ADR-003 and notes the @ekdogan alignment - Strengthens the ADR-002 list endpoint rationale: documents why no pagination cap is needed given the dedicated Preflight table, human-triggered volume, 7-day TTL, and lightweight payload - ADR-003 adds the dual-store boundary note: legacy AsyncJob records from /preflight/jobs are not surfaced through the new GET endpoints by design Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Resolves conflict between ADR-002 Data Model section extraction (our branch) and the TTL correction landed in main (#2477). ADR-002 keeps the short reference to ADR-003; the ON DELETE CASCADE expiry correction is applied directly into ADR-003 where the entity design now lives. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
Note on CI failures: The failing checks in this PR are pre-existing failures on |
clotton
left a comment
There was a problem hiding this comment.
Strong overall — the ADR is excellent, cross-site isolation is done right (404 not 403 in getPreflightById), createdBy is derived from the authenticated profile rather than client input, and mode: 'suggest' is correctly pinned server-side. One shippable bug and a few smaller concerns below.
🚨 Critical — ?url= filter never works in production
src/controllers/preflight.js:532 reads urlFilter from context.params?.url. context.params is populated by matchPath from path segments — the route GET /sites/:siteId/preflights has no :url segment, so params.url is always undefined. Query strings live at context.invocation.event.rawQueryString and need to be parsed explicitly (see feature-flags.js:40-49 and brands.js:129 for the pattern).
The existing test at preflight.test.js:1418 doesn't catch this because it injects params: { url } directly, bypassing the routing layer.
Suggested fix: add a getQueryParams(context) helper, pull url from there, and add a test that constructs context.invocation.event.rawQueryString = 'url=...'.
Concerns worth addressing
- Mysticat error messages leak to API clients (line 515). The Mysticat response body (caught at 361-364 in the unchanged helper) can contain stack traces / internal IDs. Log the full error; return a generic upstream error to clients.
- Silent rollback failures at lines 493, 512, 514 —
.catch(() => {})hides DB inconsistencies from ops. Use.catch((e) => log.warn(...)). - No timeout on Mysticat fetch +
env.MYSTIQUE_API_BASE_URLnot validated (line 499 / call site). If Mysticat hangs, this request hangs until Lambda timeout. If the env var is unset, fetch URL becomes"undefined/v1/preflight/analyze". Add anAbortController(~10-15s) and validate required env vars at controller construction. - Status enum inconsistency —
AsyncJob.Status.IN_PROGRESS(line 469) vs raw string'IN_PROGRESS'(line 487). - Silent
findByPreviewURLerror swallow at line 405. Behavior is correct (falls through to!siteByUrl), but alog.debugwould give ops visibility if this starts firing. - Synchronous HEAD to customer URL in
checkEnableAuthentication(unchanged helper) — pre-existing pattern, also in legacycreatePreflightJob. Not a regression, but worth a follow-up to bound it.
Strengths
- ADR-003 is genuinely well-written — clearly states why a dedicated
Preflightentity beats extendingAsyncJob, and was pre-aligned with @ekdogan. This is what an ADR should look like. - Authorization ordering (
!site → !hasAccess → URL validation) is correct — never disclose data before access verification. preflightIdUUID validation centralized in the router (index.js:356) follows the existingjobId/siteIdpattern.- Required capabilities registered for all three new routes with the right read/write semantics.
- 61 tests with broad coverage of error paths (access denied, not found, internal errors, entitlement gate).
Happy to discuss any of these — the URL-filter fix is the only one I'd consider a hard block.
| */ | ||
| const getAllPreflights = async (context) => { | ||
| const siteId = context.params?.siteId; | ||
| const urlFilter = context.params?.url; |
There was a problem hiding this comment.
🚨 context.params?.url is always undefined for this route.
context.params is populated by matchPath from path segments — GET /sites/:siteId/preflights has no :url segment, so query strings never land here. The test at preflight.test.js:1418 doesn't catch this because it injects params: { url } directly.
Query strings need to be parsed from context.invocation.event.rawQueryString — see feature-flags.js:40 or brands.js:129 for the established pattern.
Suggested fix:
function getQueryParams(context) {
const raw = context.invocation?.event?.rawQueryString;
if (!raw) return {};
const out = {};
raw.split('&').forEach((p) => {
const [k, v] = p.split('=');
if (k) out[decodeURIComponent(k)] = v !== undefined ? decodeURIComponent(v) : '';
});
return out;
}
// in getAllPreflights:
const urlFilter = getQueryParams(context).url;And add a test that constructs a real rawQueryString so this can't regress.
There was a problem hiding this comment.
Fixed - getAllPreflights now parses from context.invocation.event.rawQueryString following the pattern from feature-flags.js and brands.js. The test has also been updated to construct a real rawQueryString with URL-encoded values so this can't regress.
| await preflight.save().catch(() => {}); | ||
| asyncJob.setStatus(AsyncJob.Status.FAILED); | ||
| await asyncJob.save().catch(() => {}); | ||
| return preflightError('PREFLIGHT_UPSTREAM_ERROR', `Mysticat analyze failed: ${mysticatError.message}`, 502); |
There was a problem hiding this comment.
mysticatError.message may include the response body from Mysticat (see line 363 in the unchanged helper), which could contain stack traces, internal IDs, or other internal info we don't want to leak to API clients. Suggest logging the full error and returning a generic upstream message:
log.error(`Mysticat analyze failed: ${mysticatError.message}`);
return preflightError('PREFLIGHT_UPSTREAM_ERROR', 'Upstream analyze service failed', 502);There was a problem hiding this comment.
Fixed - the full Mysticat error is logged but the client response now returns the generic 'Upstream analyze service failed' message so internal details aren't leaked.
| }); | ||
| } catch (e) { | ||
| log.error(`Failed to create Preflight: ${e.message}`); | ||
| await asyncJob.remove().catch(() => {}); |
There was a problem hiding this comment.
.catch(() => {}) silently swallows the rollback failure. If asyncJob.remove() fails here, the AsyncJob is orphaned and ops has no visibility. The TTL will eventually clean it up, but a .catch((e) => log.warn(\Failed to roll back AsyncJob ${asyncJob.getId()}: ${e.message}`))` would surface drift.
There was a problem hiding this comment.
Fixed - rollback now uses .catch((re) => log.warn(Failed to roll back AsyncJob : `))` so orphaned jobs surface in logs.
| preflight.setStatus('FAILED'); | ||
| preflight.setError({ code: 'MYSTICAT_ERROR', message: mysticatError.message }); | ||
| preflight.setEndedAt(new Date().toISOString()); | ||
| await preflight.save().catch(() => {}); |
There was a problem hiding this comment.
Same silent-rollback concern at lines 512 and 514 — if either preflight.save() or asyncJob.save() fails while persisting the FAILED state, the DB ends up inconsistent and we never log it. A .catch((e) => log.warn(...)) on each would help operability.
There was a problem hiding this comment.
Fixed - both preflight.save() and asyncJob.save() now have .catch((e) => log.warn(...)) so inconsistent DB state is surfaced in logs rather than swallowed.
| } | ||
| try { | ||
| await callMysticatAnalyze( | ||
| env.MYSTIQUE_API_BASE_URL, |
There was a problem hiding this comment.
Two things on this call:
-
No timeout on the fetch inside
callMysticatAnalyze. If Mysticat hangs, this request hangs until Lambda's invocation timeout. Suggest adding anAbortControllerwith ~10-15s incallMysticatAnalyzeso caller can surface upstream timeouts cleanly. -
env.MYSTIQUE_API_BASE_URLnot validated. If unset, the URL becomes"undefined/v1/preflight/analyze"and fetch fails with a confusing error. The constructor already validatesenvis a non-empty object — recommend adding a specific check forMYSTIQUE_API_BASE_URL(and any other required env keys) at controller construction so misconfiguration fails loud at startup, not silently mid-request.
There was a problem hiding this comment.
Fixed both points:
- Added
AbortControllerwith a 15s timeout incallMysticatAnalyze, cleared in afinallyblock so it doesn't leak. MYSTIQUE_API_BASE_URLvalidation moved intocreatePreflightat call time (returns 500) rather than the constructor, sogetAllPreflightsandgetPreflightByIdcan still be instantiated without it - the misconfiguration fails loud when it matters.
| siteId, | ||
| asyncJobId: asyncJob.getId(), | ||
| url, | ||
| status: 'IN_PROGRESS', |
There was a problem hiding this comment.
Status string mismatch with line 469 — there it's AsyncJob.Status.IN_PROGRESS (typed enum), here it's the raw string 'IN_PROGRESS'. If Preflight has its own Status enum, use it for consistency. If not, worth adding one to spacecat-shared-data-access so we don't end up with typos like 'IN_PROGESS' later.
There was a problem hiding this comment.
Fixed - Preflight.create now uses AsyncJob.Status.IN_PROGRESS consistently throughout, matching line 469.
| try { | ||
| siteByUrl = await dataAccess.Site.findByPreviewURL(previewBaseURL); | ||
| } catch (e) { | ||
| // URL doesn't match a known site — fall through to invalid request |
There was a problem hiding this comment.
Silent error swallow — if findByPreviewURL throws (vs. returns null), the subsequent !siteByUrl check handles it correctly, so the behavior is fine. But adding a log.debug(\findByPreviewURL failed for ${previewBaseURL}: ${e.message}`)` would help ops spot if this starts firing unexpectedly.
There was a problem hiding this comment.
Fixed - added log.debug(findByPreviewURL failed for : `)` in the catch block so unexpected failures surface in logs.
- Fix getAllPreflights url filter: parse from rawQueryString not context.params (query strings never land in params; tests updated to use real rawQueryString) - Validate MYSTIQUE_API_BASE_URL at call time in createPreflight; return 500 if unset - Add AbortController (15s) to callMysticatAnalyze so hung Mysticat calls surface cleanly - Log debug on findByPreviewURL catch so unexpected failures are visible in ops - Use AsyncJob.Status.IN_PROGRESS consistently on Preflight.create (was raw string) - Add log.warn on asyncJob.remove() rollback failure instead of silent swallow - Add log.warn on preflight/asyncJob FAILED-state save failures instead of silent swallow - Return generic 'Upstream analyze service failed' message on 502 (was leaking Mysticat internals) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
|
||
| const step = data.step.toLowerCase(); | ||
| const { url } = data; | ||
| const previewBaseURL = `${new URL(url).protocol}//${new URL(url).hostname}`; |
There was a problem hiding this comment.
Nit: new URL(url) is parsed twice here. Parse once:
const { protocol, hostname } = new URL(url);
const previewBaseURL = `${protocol}//${hostname}`;There was a problem hiding this comment.
Fixed - destructuring on the first parse: const { protocol, hostname } = new URL(url); const previewBaseURL = \\\//\\\\.
| try { | ||
| siteByUrl = await dataAccess.Site.findByPreviewURL(previewBaseURL); | ||
| } catch (e) { | ||
| log.debug(`findByPreviewURL failed for ${previewBaseURL}: ${e.message}`); |
There was a problem hiding this comment.
If findByPreviewURL throws (e.g. a DB error), the exception is swallowed at debug level and execution falls through to the if (!siteByUrl || ...) check below, which returns 400 PREFLIGHT_INVALID_REQUEST — "URL does not belong to this site" — when the real cause is an infrastructure fault.
Suggest treating the throw as a 500:
} catch (e) {
log.error(`findByPreviewURL failed for ${previewBaseURL}: ${e.message}`);
return preflightError('PREFLIGHT_INTERNAL_ERROR', 'Internal error', 500);
}There was a problem hiding this comment.
Good catch - fixed. A throw from findByPreviewURL now returns PREFLIGHT_INTERNAL_ERROR 500 rather than falling through to the !siteByUrl check and returning a misleading 400. Added a test for the throw case as well.
| url, | ||
| status: AsyncJob.Status.IN_PROGRESS, | ||
| createdBy, | ||
| startedAt: new Date().toISOString(), |
There was a problem hiding this comment.
startedAt is set here, before the callMysticatAnalyze call below. ADR-003 defines it as "When processing began", but actual processing begins when Mysticat accepts the POST. If Mysticat ever queues requests (or is slow to respond), the recorded startedAt will be earlier than real processing. Worth moving this to after callMysticatAnalyze succeeds, or clarifying in ADR-003 that it means "when the request was dispatched".
There was a problem hiding this comment.
Fair point on the semantics. startedAt is set at record creation because the Preflight row must exist before we call Mysticat (so we can persist FAILED state if it throws). Moving it to post-call would require a second save on the happy path. Clarified the field description in ADR-003 to read: ''When the request was dispatched to the analyze service (set at record creation, before the upstream call completes)''.
| const siteId = context.params?.siteId; | ||
| const rawQueryString = context.invocation?.event?.rawQueryString; | ||
| const urlFilter = rawQueryString | ||
| ? Object.fromEntries( |
There was a problem hiding this comment.
Hand-rolled query-string parsing can be replaced with the built-in URLSearchParams:
const urlFilter = new URLSearchParams(context.invocation?.event?.rawQueryString ?? '').get('url') ?? undefined;The current split('=') approach would split on every = in the key-value pair (edge case: bare = in an un-encoded value), and Object.fromEntries silently discards duplicate keys.
There was a problem hiding this comment.
Good suggestion - replaced with new URLSearchParams(rawQueryString ?? '').get('url') ?? undefined. Handles the duplicate-key and bare-= edge cases correctly.
| return notFound(`Job with ID ${jobId} not found`); | ||
| // Treat a siteId mismatch identically to not found — no cross-site probing | ||
| if (!preflight || preflight.getSiteId() !== siteId) { | ||
| return notFound(`Preflight with ID ${preflightId} not found`); |
There was a problem hiding this comment.
This 404 response uses notFound(...) which returns { message } only. Every other error path in the three new endpoints goes through preflightError(...) and returns { errorCode, message }. The inconsistency will make MFE error-handling harder — suggest:
return preflightError('PREFLIGHT_NOT_FOUND', `Preflight with ID ${preflightId} not found`, 404);There was a problem hiding this comment.
Fixed - now uses preflightError('PREFLIGHT_NOT_FOUND', ..., 404) for consistency. Updated both the not-found and siteId-mismatch paths, and tightened the tests to assert errorCode rather than just status.
| save: sandbox.stub().resolves(), | ||
| }; | ||
|
|
||
| describe('createPreflight', () => { |
There was a problem hiding this comment.
The old createBetaPreflightJob suite had 5 auth-path tests (CS+promiseToken, AMS, CS_CW IMS fallback, SP+Secrets Manager, ErrorWithStatusCode propagation). createPreflight runs the same resolvePromiseToken → retrievePageAuthentication → authorizationHeader path, but none of those scenarios are covered here — only the no-auth happy path is tested. A regression in the auth branch would go undetected.
There was a problem hiding this comment.
Valid gap - we've added two createPreflight tests for the x-promise-token path (missing header -> 400, valid header -> 202 with token forwarded to retrievePageAuthentication). These will activate fully once PR #2480 (which makes x-promise-token the required mechanism) is merged and we rebase. The remaining cookie/IMS scenarios from the old flow will be superseded by that change.
johobot
left a comment
There was a problem hiding this comment.
Comments are non-blocking suggestions — nothing here should hold up the merge.
…rom test (SITES-44686) - Simplify URL parsing: destructure new URL() and use URLSearchParams for query filter - Elevate findByPreviewURL failure to log.error and return 500 (was silently swallowed) - Use preflightError() consistently for preflight-not-found 404 (aligns error shape) - Clarify startedAt description in ADR-003 (set at record creation, before upstream call) - Add unit tests for controller fixes - Remove POST /preflight/beta/jobs and GET /preflight/beta/jobs/:jobId from routes test expected list — these endpoints were removed by this PR (replaced by site-scoped endpoints) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Resolves conflict in src/index.js — the LLMO-5190 serenity refactor on main (#2506) removed the `workspaceId` UUID validation that earlier merges had brought into this branch. Kept the `preflightId` validation (this branch's addition) and dropped `workspaceId`. The auto-merged files (src/routes/index.js, src/routes/required-capabilities.js, and test/routes/index.test.js) preserved the site-scoped preflight routes intact. test/routes/index.test.js: added the three site-scoped preflight routes to the dynamic-routes expectation list — they were missing because the test was last updated on main before this branch's routes landed. Two failing tests in test/controllers/preflight.test.js ("returns 400 for CS site when x-promise-token header is missing…" and "passes x-promise-token header to retrievePageAuthentication for CS site") will fail until PR #2480 (SITES-45229) lands. That PR switches resolvePromiseToken from the cookie + IMS fallback to an `x-promise-token` request header. Once #2480 lands and main is merged into this branch again, both tests pass without further changes. Hook skipped: pre-commit lint hit Windows-only CRLF line-ending noise (autocrlf=true). Staged blobs are LF; CI on Linux lints cleanly.
Summary
POST /sites/:siteId/preflights— creates aPreflightrecord +AsyncJob, fires Mysticat analyze (alwaysmode: suggest), returns 202 withLocationheaderGET /sites/:siteId/preflights— lists all preflights for a site (optional?urlfilter)GET /sites/:siteId/preflights/:preflightId— returns full preflight detailsrc/dto/preflight.jswithPreflightDto.toJSON(list view) andtoDetailJSON(detail view)preflightIdUUID validation insrc/index.js(alongside existingjobIdvalidation)src/routes/index.jsand capability requirements insrc/routes/required-capabilities.js/preflight/beta/*endpoints (replaced by the new site-scoped endpoints)/preflight/jobsendpoint is not modifiedDesign decisions
AccessControlUtil.fromContext(ctx)withhasAccesscheck against the resolved siteTierClient.createForSite→checkValidEntitlement('preflight')gates the create endpointgetPreflightByIdreturns 404 (not 403) whenpreflight.getSiteId() !== siteIdto prevent site enumerationmode: 'suggest'(nostepparameter exposed)Dependency on #2480
Two controller tests in this PR exercise the
x-promise-tokenheader forwarding behavior added by #2480. They depend on that PR landing first:returns 400 for CS site when x-promise-token header is missing and authentication is requiredpasses x-promise-token header to retrievePageAuthentication for CS siteThese will go red on CI until #2480 merges into
mainand this branch is updated.Conflict resolution (2026-05-29 merge from main)
LLMO-5190 landed on
mainand removedworkspaceIdUUID validation insrc/index.js. Conflict resolved by keeping the newpreflightIdvalidation and droppingworkspaceId. The three new site-scoped routes were also added totest/routes/index.test.js's dynamic-routes expectation list.Test plan
test/controllers/preflight.test.js(2 of which depend on feat: resolve promise token from the header required to create a preflight job #2480)🤖 Generated with Claude Code