Skip to content

docs: ADR-002 preflight API REST redesign#2377

Merged
GeezerPelletier merged 39 commits into
mainfrom
sites-44290-preflight-api-rest-redesign
May 14, 2026
Merged

docs: ADR-002 preflight API REST redesign#2377
GeezerPelletier merged 39 commits into
mainfrom
sites-44290-preflight-api-rest-redesign

Conversation

@GeezerPelletier
Copy link
Copy Markdown
Contributor

@GeezerPelletier GeezerPelletier commented May 8, 2026

Summary

  • Adds docs/decisions/002-preflight-api-rest-redesign.md capturing the decision to replace /preflight/beta/jobs with site-scoped REST endpoints under /sites/:siteId/preflights, and to deprecate both legacy endpoint pairs
  • No code changes - this is a decision record for team review before implementation begins

Related: SITES-44290 | Bulk endpoint deferred to: SITES-44432

What's being proposed

POST /sites/:siteId/preflights

Request body:

{
  "url": "https://main--site--org.hlx.page/some-path"
}

promiseToken passed via cookie for authenticated CMS pages. createdBy is derived server-side from the caller's IMS profile and never supplied by the client. step is removed - Mysticat always runs the full identify/suggest flow. The submitted url must match one of the site's known hostnames (base URL, preview URL, or live URL) - a mismatch returns PREFLIGHT_INVALID_REQUEST.

Response 202 Accepted:

Headers:

Location: https://spacecat.experiencecloud.live/api/v1/sites/{siteId}/preflights/{preflightId}

Body:

{
  "preflightId": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
  "status": "IN_PROGRESS",
  "createdAt": "2026-05-11T10:00:00.000Z",
  "createdBy": {
    "id": "ABC123@techacct.adobe.com",
    "displayName": "John Doe"
  }
}

Error responses:

Status errorCode Condition
400 Bad Request PREFLIGHT_INVALID_REQUEST url is missing, not a valid URI, or does not belong to the site identified by :siteId
403 Forbidden PREFLIGHT_ACCESS_DENIED Caller does not have access to the site
403 Forbidden PREFLIGHT_NOT_ENABLED Preflight is not enabled for the site
404 Not Found PREFLIGHT_SITE_NOT_FOUND siteId does not exist
502 Bad Gateway PREFLIGHT_UPSTREAM_ERROR Mysticat returned a 5xx response
500 Internal Server Error PREFLIGHT_INTERNAL_ERROR Unexpected error within this service

Error response body:

{
  "errorCode": "PREFLIGHT_NOT_ENABLED",
  "message": "Preflight is not enabled for this site"
}

No job record is created for 400, 403, or 404 responses.

GET /sites/:siteId/preflights

Response 200 OK - grouped by URL, capped at 50 most recent preflights per site (sorted by createdAt desc across all URLs before grouping):

[
  {
    "url": "https://main--site--org.hlx.page/some-path",
    "preflights": [
      {
        "preflightId": "3fa85f64-...",
        "status": "COMPLETED",
        "createdAt": "2026-05-11T10:00:00.000Z",
        "updatedAt": "2026-05-11T10:00:05.000Z",
        "createdBy": { "id": "ABC123@techacct.adobe.com", "displayName": "John Doe" }
      }
    ]
  }
]

GET /sites/:siteId/preflights/:preflightId

Response 200 OK - full detail. The handler verifies the loaded job's siteId matches the path's :siteId, returning 404 on mismatch:

{
  "preflightId": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
  "status": "COMPLETED",
  "url": "https://main--site--org.hlx.page/some-path",
  "createdAt": "2026-05-11T10:00:00.000Z",
  "createdBy": { "id": "ABC123@techacct.adobe.com", "displayName": "John Doe" },
  "updatedAt": "2026-05-11T10:00:05.000Z",
  "startedAt": "2026-05-11T10:00:01.000Z",
  "endedAt": "2026-05-11T10:00:05.000Z",
  "result": {},
  "resultType": "INLINE",
  "resultLocation": null,
  "error": null
}

When resultType is S3 or URL, result will be null and the client should fetch from resultLocation.

Data model

AsyncJob will be extended in spacecat-shared-data-access with:

  • siteId as an optional top-level indexed attribute
  • jobType as an optional top-level indexed attribute
  • allBySiteIdAndJobType(siteId, jobType) collection method

Both attributes are optional so that existing job creation paths (including the deprecated /preflight/jobs queue-based flow) continue to work unchanged. url grouping on the list endpoint is an in-memory groupBy on metadata.url after the indexed query - acceptable given the 50-record cap.

createdBy is stored as { id, displayName } in job metadata at creation time - id is profile.email (the IMS user ID) and displayName is composed from profile.first_name + last_name. Both fields are available on the authenticated profile; no additional IMS lookup required. Surfaces in all three endpoint responses for audit purposes.

This is a prerequisite: the spacecat-shared-data-access change must land before controller work begins.

Current endpoint fate

Both legacy endpoint pairs are deprecated and remain functional in parallel with the new endpoints. The Sunset date will be set by PM at the time this ADR moves to Accepted, with a minimum of 90 days from MFE migration start.

Endpoint Action
POST /preflight/beta/jobs Deprecated - internal use only; no external consumers
GET /preflight/beta/jobs/:jobId Deprecated - internal use only; no external consumers
POST /preflight/jobs Deprecated - queue-based; migration timeline to be coordinated with consumers
GET /preflight/jobs/:jobId Deprecated - queue-based; migration timeline to be coordinated with consumers

Test plan

  • Read the ADR and validate the design rationale
  • Confirm the proposed endpoint shape matches team expectations before implementation starts

?? Generated with Claude Code

Guy Pelletier and others added 2 commits May 8, 2026 14:25
Captures the decision to replace /preflight/jobs and /preflight/beta/jobs
with site-scoped endpoints under /sites/:siteId/preflights.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 8, 2026

This PR will trigger no release when merged.

@codecov
Copy link
Copy Markdown

codecov Bot commented May 8, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

Guy Pelletier and others added 2 commits May 11, 2026 08:54
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@GeezerPelletier GeezerPelletier marked this pull request as draft May 11, 2026 13:07
Guy Pelletier and others added 4 commits May 11, 2026 09:09
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
AsyncJob has no siteId index; GET /sites/:siteId/preflights requires
either extending AsyncJob or a new Preflight entity in shared-data-access.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…eId index

AsyncJob TTL (~7 days) makes a purpose-built Preflight entity unnecessary.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Existing job creation paths must not be affected; new endpoints
populate siteId at creation time and query by it.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Guy Pelletier and others added 2 commits May 11, 2026 09:54
Capture IMS caller identity server-side at job creation for audit purposes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…sponse

Per RFC 7231 §6.3.3 the created resource URL belongs in the Location
response header, not a custom body field.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Guy Pelletier and others added 3 commits May 11, 2026 10:26
- Add error response table to POST endpoint (400/403/404/500)
- Disabled-site case returns 403 immediately; no job record created
- Previous CANCELLED-job behavior on disabled preflight is explicitly removed

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Both /preflight/beta/jobs and /preflight/jobs are deprecated and run in
parallel with the new endpoints until consumers migrate and a deletion
milestone is agreed upon.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@GeezerPelletier GeezerPelletier marked this pull request as ready for review May 11, 2026 14:42
Comment thread docs/decisions/002-preflight-api-rest-redesign.md
Comment thread docs/decisions/002-preflight-api-rest-redesign.md
Comment thread docs/decisions/002-preflight-api-rest-redesign.md Outdated
```json
{
"url": "https://main--site--org.hlx.page/some-path",
"step": "identify",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The concept of step should be going away as we always will perform identify/suggest as part of the mysticat agent step.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point and worth nailing down now. If the Mysticat agent always runs both identify and suggest as a single flow, step is redundant and we should drop it from the new endpoint entirely rather than carry it forward and deprecate it later. Can you confirm that's the direction? If so we'll remove it from the design.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the direction that we want to go.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No other audit has the concept of step, just our preflight ones. So if we rely on other audits for their facts this step concept makes no sense.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed - step has no meaning outside the old preflight-specific flow and is dropped from the new endpoint entirely. The branching in links.js:102 and metatags.js:103 is dead code to be removed as part of the implementation. ADR already reflects this.

{
"url": "https://main--site--org.hlx.page/some-path",
"step": "identify",
"mystiqueUrl": "optional-ephemeral-host.stage.cloud.adobe.io"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see the value of sending this back to the client.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify — mystiqueUrl is a request-only field, never returned in any response. It lets developers route a preflight to an ephemeral Mysticat instance for testing without hitting prod, and is blocked in non-dev environments. If you're questioning whether it belongs in the API at all that's a fair debate — it could move to an env-level config instead.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's remove it from the API response.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"mystiqueUrl" is a request parameter only. Not part of the response payload and removed from any public documentation.

|-------|------|----------|-------------|
| `url` | string (URI) | Yes | The single page URL to analyze |
| `step` | enum: `identify` \| `suggest` | Yes | Audit step to perform |
| `mystiqueUrl` | string | No | Dev-only override for the Mysticat service URL |
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

url i don't think needs to be exposed here.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The url field is required in the request body because Mysticat needs it to scrape the specific page via DRS headless browser — siteId alone isn't enough since a site has many pages and preflight is per-page. As for the response, it is useful in GET by ID so the consumer knows what was analyzed. Could reasonably be omitted from the 202 POST response body given the Location header already points to the resource.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think this should be exposed as part of the public api.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies for the misread — we interpreted this as being about the url field rather than mystiqueUrl. Agreed, mystiqueUrl is removed from the public documentation. It stays in the implementation as an undocumented dev override for testing against ephemeral Mysticat instances, just not part of the public API contract.

Comment thread docs/decisions/002-preflight-api-rest-redesign.md
Comment thread docs/decisions/002-preflight-api-rest-redesign.md
Comment thread docs/decisions/002-preflight-api-rest-redesign.md
Comment thread docs/decisions/002-preflight-api-rest-redesign.md
Copy link
Copy Markdown
Contributor

@bhellema bhellema left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We really need to consider what the bulk use case looks like.

Guy Pelletier and others added 3 commits May 11, 2026 13:34
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Confirmed by Ben: identify/suggest are no longer separate steps.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Dev-only override, not part of the public contract.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@GeezerPelletier
Copy link
Copy Markdown
Contributor Author

Gentle reminder @bhellema @johobot @clotton — would appreciate a look when you get a chance. All previous review comments have been addressed. 🙏

Copy link
Copy Markdown

@clotton clotton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving — Johnson's nine review points were all addressed cleanly in the 2026-05-12 push, and on my read the response shapes, validation, error model (502/PREFLIGHT_UPSTREAM_ERROR vs 500/PREFLIGHT_INTERNAL_ERROR), cross-site ownership check, and resultType/resultLocation decision rule are all in good shape. Nice work.

One non-blocking flag: the bulk-submission thread (line 44) ends with a deferral to SITES-44432 for a dedicated POST /sites/:siteId/preflights/bulk endpoint, but @bhellema hasn't explicitly signed off on that deferral yet. Worth getting a 👍 from Ben on the line 44 thread before merge so we don't ship with that thread effectively unresolved.

(Header-vs-cookie auth on line 76 is a known follow-up; happy to take that in a separate ticket.)


### GET /sites/:siteId/preflights

**Response** `200 OK` — grouped by URL, with a nested array of preflights per page. A site can have preflights for multiple URLs. Results are capped at the 50 most recent preflights per site (sorted by `createdAt` descending across all URLs before grouping). Given the 7-day TTL on `AsyncJob` records the natural upper bound is low; full cursor-based pagination is deferred until there is evidence of consumers hitting the cap.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think capping at 50 is interesting, but the payload is so small that I don't see why we need to. I think the limiting factor here is the Lambda response size which is 6 MB for both request and response. So, there's lots of space for a lot of preflights.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed - removing the cap entirely. The 7-day TTL on AsyncJob records is the natural bound and in practice the number of preflights per site will be low. All results are returned sorted by createdAt descending; pagination is deferred until there's evidence it's needed. ADR updated.

"status": "COMPLETED",
"createdAt": "2026-05-11T10:00:00.000Z",
"updatedAt": "2026-05-11T10:00:05.000Z",
"createdBy": { "id": "ABC123@techacct.adobe.com", "displayName": "John Doe" }
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This information is not available today. This would have to be stored as part of the async job metadata.

const profile = context.attributes.authInfo.getProfile();
profile.first_name;
proflile.last_name;
profile.email;

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct - createdBy is not available today and needs to be stored in async job metadata at creation time. The ADR already captures this: the controller reads profile.email, profile.first_name, and profile.last_name from the authenticated IMS profile at the point the job is created and writes { email, displayName } into the job metadata. No additional lookup needed downstream.

Comment on lines +174 to +176
| `preflights[].createdBy` | object | Caller identity — `{ id, displayName }` |
| `preflights[].createdBy.id` | string | IMS user ID (`profile.email`) |
| `preflights[].createdBy.displayName` | string | Full name from IMS profile |
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aligned - both the list and GET by ID responses now use { email, displayName } consistently. ADR updated.

"url": "https://main--site--org.hlx.page/some-path",
"createdAt": "2026-05-11T10:00:00.000Z",
"createdBy": {
"id": "ABC123@techacct.adobe.com",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is the email, then call it the email property, not id.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done - renamed to email throughout (202 response, list, and GET by ID). ADR updated.

"createdAt": "2026-05-11T10:00:00.000Z",
"createdBy": {
"id": "ABC123@techacct.adobe.com",
"displayName": "John Doe"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This requires puling together the first and last name, unless the async job generates this value.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct - displayName is composed from profile.first_name + ' ' + profile.last_name (falling back to profile.name) at job creation time and stored in the job metadata alongside email. The controller handles the composition; nothing is computed on read.

"createdBy": "user@example.com",
"updatedAt": "2026-05-11T10:00:05.000Z",
"startedAt": "2026-05-11T10:00:01.000Z",
"endedAt": "2026-05-11T10:00:05.000Z",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@johobot I don't think this is the case for preflight jobs. If we ever leverage bulk creating a massive response this might be the case but JSONB has a size of 1GB.

Digging in I see:

Preflight results are stored inline in async_jobs.result as a JSONB array — the RPC (wrpc_projection_preflight_merge_audit) accumulates page audit entries directly into that column. There's no S3 involved.

The payload per job is bounded: one entry per URL in the request, each with a small set of named audits. For typical preflight requests (a handful of URLs), this is well within what JSONB inline storage handles fine. The resultLocation/resultType fields are left null for preflight jobs — the result column is the storage mechanism here.

So the existing design is reasonable for this use case. resultType: S3 would only make sense if the result could grow large (e.g. bulk imports), which preflight results don't.

asking claude about the context of this thread it responded with (from the projector service)

The PR comment is not accurate for the current implementation. The evidence:

  1. The preflight writer (writer.ts) only calls wrpc_projection_preflight_merge_audit and returns undefined — no S3 writes anywhere.
  2. The RPC (20260317134526_projection_preflight_rpc.sql) only ever writes to async_jobs.result (inline JSONB). resultLocation and resultType are never touched.
  3. The preflight config has no S3 output kind — it uses OutputKind.Preflight, unlike llmo-bp-bulk-import which does use resultLocation.
  4. The API controller reads resultLocation/resultType and returns them, but they will always be null for preflight jobs because nothing sets them.

The reviewer may be thinking of a different job type (llmo-bp-bulk-import does write to S3), or describing aspirational behaviour that doesn't exist yet. The current design is: preflight results are always inline in async_jobs.result.

The resultLocation/resultType fields being returned in the poll response is harmless (they'll be null), but the claim that "results are often written to S3" is incorrect today.

Comment on lines +197 to +198
"resultType": "INLINE",
"resultLocation": null,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need to expose these values.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed - both fields are gone from the response shape. ADR updated.

Comment on lines +216 to +217
| `resultType` | enum \| null | `INLINE` — result is in the `result` field; `S3` or `URL` — result is at `resultLocation` and `result` is `null` |
| `resultLocation` | string \| null | S3 URI or URL of the result when `resultType` is `S3` or `URL`; `null` when `resultType` is `INLINE` |
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is necessary, until it is.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keeping the ownership check in - a caller substituting a different :siteId in the path to read another site's preflight result is a real information disclosure risk (scraped page content, draft CMS content). The check is a one-liner in the implementation and the cost of retrofitting it after an incident is far higher than adding it now. @johobot's concern stands.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To expand on why this check is still necessary even with the existing AccessControlUtil.hasAccess(site) guard:

hasAccess(site) validates that the caller has IMS org access to the site identified by :siteId in the path - that's the first line of defence and it's already consistent with every other site-scoped controller.

But without the ownership check, the handler would then load the preflight job by preflightId alone with no site constraint. A caller who legitimately has access to site A could construct GET /sites/<siteA-id>/preflights/<preflightId-belonging-to-siteB>. They pass hasAccess on site A just fine, and get back site B's preflight result - scraped page content and draft CMS data from a site they have no rights to.

The ownership check closes that gap: it verifies the loaded job's stored siteId matches the path's :siteId. A mismatch returns 404 - the same response as a non-existent ID - so the caller can't even confirm the preflight exists on another site.

The two checks cover different things: hasAccess guards the site in the path; the ownership check guards the job itself. Both are needed.

| `url` | string | The page URL that was analyzed |
| `createdAt` | ISO 8601 | When the preflight was created |
| `createdBy` | object | Caller identity — `{ id, displayName }` |
| `createdBy.id` | string | IMS user ID (`profile.email`) |
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

id -> email

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done - see reply above.


The decision is to **extend `AsyncJob` in `spacecat-shared-data-access`**:

- Add `siteId` as an **optional** top-level indexed attribute on the `AsyncJob` schema
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a beefy comment farther up in the document talking about having to update the query, but deleted now seeing this conversation. This might be a good discussion to have with @ekremney.

…resultType

- Flatten GET list response from URL-grouped to flat array sorted by createdAt desc
- Add optional ?url= query parameter to filter list by specific page URL server-side
- Remove 50-record cap; 7-day AsyncJob TTL is the natural bound, pagination deferred
- Rename createdBy.id -> createdBy.email throughout (consistent with profile field name)
- Note that createdBy is stored in async job metadata at creation time
- Remove resultType and resultLocation from GET by ID response; preflight results are
  always stored inline in async_jobs.result, never in S3
- Clarify result field: always inline, null until job completes

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Guy Pelletier and others added 2 commits May 14, 2026 10:32
…ght entity

Rewrites the Data Model section of ADR-002 to introduce a first-class
Preflight entity in spacecat-shared-data-access rather than extending
the generic AsyncJob model. AsyncJob is shared by preflight, preflight-beta,
site-detection, and pr-review — using it as the preflight store forces
type discrimination via metadata blobs with no schema enforcement.
References SITES-44650 for the spacecat-shared-data-access alignment
work with @ekremney.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Guy Pelletier and others added 2 commits May 14, 2026 11:23
…yncJob

A dedicated Preflight table scopes every row to preflight by definition.
Querying AsyncJob by siteId would require a compound (site_id, job_type)
filter or composite index to isolate preflight records from the other
workflows sharing the table.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The /preflight/beta/jobs endpoint was a temporary internal development
path, never intended as a durable design. Calling it out as an anti-pattern
is noise — it was always going away.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The /preflight/beta/jobs endpoints were an internal development path
with no external consumers. This redesign is their replacement — they
are removed outright, not deprecated. Only /preflight/jobs carries a
deprecation timeline since it has external consumers.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@GeezerPelletier
Copy link
Copy Markdown
Contributor Author

Chatted with @bhellema , I'm going to merge this PR just to close off this first and very good and in depth iteration. Subsequent touch points and PRs can provide more updates as needed before the implementation is kicked off.

@GeezerPelletier GeezerPelletier merged commit a1a58ed into main May 14, 2026
8 of 9 checks passed
@GeezerPelletier GeezerPelletier deleted the sites-44290-preflight-api-rest-redesign branch May 14, 2026 18:03
@solaris007
Copy link
Copy Markdown
Member

🎉 This PR is included in version 1.500.2 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀

GeezerPelletier added a commit that referenced this pull request May 22, 2026
…isions (#2461)

## Summary

Updates ADR-002 to reflect decisions from the SITES-44675 alignment
session with @ekdogan. The ADR was previously marked as pending that
alignment — this PR closes it out.

**Changes to the Data Model section:**

- **`asyncJobId` field added to the entity schema** — makes the 1-to-1
FK relationship to `AsyncJob` explicit; field is never exposed to API
consumers
- **Creation flow documented** — controller creates `AsyncJob` first,
receives `asyncJobId`, then creates `Preflight` with that FK; ordering
ensures the execution primitive exists before the domain record that
references it
- **`url` index annotation removed** — `(indexed)` on `url` was
inconsistent with the collection method already specifying in-memory
filtering; composite `(site_id, url)` index is explicitly deferred
(7-day TTL bounds per-site volume to a manageable size)
- **Endpoint coexistence corrected** — `/preflight/beta/jobs` is removed
(replaced by new endpoints), not deprecated; `/preflight/jobs` is the
only legacy endpoint continuing in parallel
- **Alignment placeholder replaced** — "should be aligned with @ekdogan"
replaced with confirmed decisions summary

## Related

- JIRA: [SITES-44290](https://jira.corp.adobe.com/browse/SITES-44290)
- Alignment tracking:
[SITES-44675](https://jira.corp.adobe.com/browse/SITES-44675)
- Original ADR PR: #2377

Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Guy Pelletier <pelletie@adobe.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@ravverma ravverma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @GeezerPelletier,

Retrospective review (PR is already merged as part of v1.500.2). Two reviewers, consolidated below. No Critical issues - the API contract is well-shaped and most hard questions were worked through in the 31 threads before merge. Findings concentrate on a Data Model section that was added late, and a couple of operational postures that would be cheaper to nail down in the doc than to revisit in production.

Strengths

  • Site-scoped REST shape (lines 41-49) aligns with the rest of the service and eliminates server-side URL-to-siteId reverse-resolution. The Context framing ("by the time a user triggers a preflight, the siteId is already in application state") is exactly the right justification.
  • Stable machine-readable error contract (lines 106-123) with errorCode + message split and an explicit note that message "should not be parsed by clients" is the kind of guidance that prevents downstream regressions. Status-to-code mapping table is unambiguous.
  • No phantom CANCELLED records on validation/access failure (lines 125-127) - small but mature operational choice that keeps the job store clean.
  • IDOR defense codified at the contract level (line 218): specifying that ownership-mismatch returns 404 indistinguishable from "not found" is correct, and survived earlier pushback with a clear security argument.
  • /preflight/beta/jobs removed outright (lines 50-60) is the right call when there are no external consumers. Many teams would have left a 90-day shim "to be safe."
  • Honest framing of the bulk deferral - "via multiple parallel requests, no API change needed" is the right answer for v1.
  • A dedicated Preflight entity is actually consistent with the existing codebase (ScrapeJob, ImportJob already sit as peer top-level entities to AsyncJob), even though the ADR reads as if introducing a novel split.

Issues

Important (Should Fix)

  1. Data Model section conflated with the API-shape ADR (lines 257-326). The ADR is titled "Preflight API REST redesign" and frames the problem entirely as endpoint shape and path structure (lines 12-39). The Data Model section was added after the approval and commits to a substantial cross-repo architectural change: dedicated Preflight entity in spacecat-shared-data-access, parallel backing stores, schema migration. That decision has consequences (dual-write lifecycle, cutover plan, query divergence) that are not reflected in the Consequences section (lines 328-340). Bundling it here also means future searches for "why does Preflight have its own table?" land on a doc titled "REST redesign" rather than a dedicated record. Fix: extract to ADR-003 ("Preflight entity vs AsyncJob extension") and cross-reference; have it explicitly call out ScrapeJob/ImportJob as the established precedent (turns this from an ad-hoc divergence into a documented pattern, and gives @ekdogan a single discoverable artifact).

  2. No-cap list + in-memory ?url= filter is a performance footgun written into the contract (lines 139, 306-307). The cap was removed and pagination deferred on the basis that the 7-day TTL is a "natural upper bound." TTL bounds the absolute worst case, not the steady-state response size. With parallel-fire bulk usage as the intended scaling path, an enthusiastic CMS user or a misbehaving MFE can put thousands of preflight rows under one siteId well within the TTL window. allBySiteId returning the whole flat array becomes the hot path, and the ADR documents the ?url= filter as applied in-memory even though url is described as indexed (line 294). Fix: either (a) document allBySiteIdAndUrl(siteId, url) as the query method when ?url= is present so the index is used, or (b) re-introduce a soft cap and commit to cursor-based pagination as the Day-1 contract even if the first cursor returns everything. The deferred-pagination posture is fine; what creates lock-in is the unbounded-array shape becoming load-bearing.

  3. Parallel backing stores have no documented cutover plan (lines 318-321). "Legacy endpoints continue writing to AsyncJob until retired. New endpoints write exclusively to the Preflight entity. Two backing stores in parallel" is the entire treatment of a dual-write window that will run for at least 90 days. Open questions: do clients listing preflights during the window see legacy-API-created jobs, or only new ones? If only new ones, does the MFE need to query both during transition? How does Mysticat's callback contract behave when it knows AsyncJob ids while the new API surface speaks only preflightIds? The operational cost of a dual-store window is invariably underestimated when the migration plan is one sentence - either consumers see a split-brain view or someone writes ad-hoc bridging code that ends up permanent. Fix: add a "Migration" subsection covering (a) whether legacy rows surface through the new list endpoint, (b) Mysticat callback contract during the window, (c) the trigger that ends it.

  4. Deprecation policy for /preflight/jobs is unspecified (lines 251-255). Marked Deprecated with "migration timeline to be coordinated with consumers" and "Sunset set by PM at Accepted." For a deprecation to be actionable, the ADR should commit to the signaling mechanism (response header, release notes, both), who owns identifying and contacting consumers, and the condition that triggers Sunset. Without these, "deprecated" in practice means "kept forever" - I have watched five-year-old "deprecated" endpoints accumulate new dependencies because the obligations were never written down.

Minor (Nice to Have)

  1. Status field values not enumerated in list response (line 172). The example shows COMPLETED and IN_PROGRESS but the closed set (FAILED? CANCELLED? PENDING?) is only stated in the GET-by-ID detail table. The error model is fully tabulated; the status enum deserves the same treatment for the list shape. Consumers writing UI state machines need the closed set.

  2. error field shape unspecified in GET by ID response (line 199, 216). Example shows "error": null; the shape when present is left as { code, message } in the field description but no example or contract for what code values appear. Worth one short subsection paralleling the request error table.

  3. "Preflight may reference an AsyncJob internally" (lines 284-286) leaves the relationship ambiguous. If the Preflight entity owns its own status, timestamps, and result fields (lines 295-302), the AsyncJob link is either redundant (drop it before it accretes consumers) or load-bearing (deserves a sentence on what AsyncJob still provides - retry semantics, SQS dispatch, worker contract). "May reference" invites both interpretations and the ambiguity will be resolved by whoever writes the implementation, not by design.

  4. Status stays "Proposed" (line 4) on a merged and released ADR. By v1.500.2 the status should be "Accepted." Trivial follow-up.

Recommendations

  • When the data-model decision is extracted, explicitly cite ScrapeJob and ImportJob as the established precedent for "domain entity peer to AsyncJob." This converts the choice from an ad-hoc divergence into a documented pattern and answers in advance the next workflow team asking "should we get our own entity too?"
  • Add a short "Backward compatibility" section listing what consumers of /preflight/jobs need to do to migrate (siteId in path, single-URL body, response shape diffs, Location header vs body-embedded pollUrl). Currently lives only in the "Key changes" bullets and is easy to miss.
  • Replace the "alignment with @ekdogan required" line (line 325) with the ticket reference (already there) and a one-line summary of the agreed contract once reached - design docs should not carry unresolved coordination as a permanent footnote.
  • CI failures (build, it-postgres, deploy-stage, branch-deploy) on a docs-only PR are presumably unrelated to the diff. Not blocking retrospectively, but the next reviewer of a similar docs PR will wonder.

Assessment

Retrospective: would have approved with the four Important items addressed before merge. The API contract itself is well-shaped and the dedicated-entity choice is actually correct - it just deserves its own ADR and a documented migration plan rather than riding into a merged "REST redesign" doc after approval. None of the findings block the value already shipped; all are cheaper to settle in the doc as follow-ups than to revisit in production.

Next Steps

  1. Extract Data Model section to ADR-003 (or expand 002's title + Consequences and call out the ScrapeJob/ImportJob precedent).
  2. Decide the ?url= query path: indexed query vs in-memory filter; document accordingly.
  3. Add a Migration subsection covering the dual-store window.
  4. Specify deprecation signaling and Sunset-trigger conditions for /preflight/jobs.
  5. Minor cleanups: status enum in list, error shape, AsyncJob-reference ambiguity, ADR status to Accepted.

@GeezerPelletier
Copy link
Copy Markdown
Contributor Author

@ravverma — thanks for the retrospective review, appreciate the thorough read. Here's how we've addressed the points raised:

1. Scope creep — entity design in an endpoint ADR
Agreed. The Preflight entity design has been extracted into a standalone ADR-003. ADR-002 now holds a short reference to it. Both are included in the implementation PR (#2478).

2. Performance footgun — no pagination cap + in-memory URL filtering
We discussed this and believe the risk is low given the specifics here: results come from a dedicated Preflight table indexed on siteId (not the shared AsyncJob table), preflights are human-triggered from the MFE rather than batch-generated, the 7-day TTL naturally bounds per-site volume, and list items are lightweight with no result payload. ADR-002's list endpoint section now documents this rationale explicitly rather than leaving it implicit. We'll add pagination if evidence of need emerges.

3. Missing dual-store boundary
ADR-003 now explicitly documents that legacy AsyncJob records from /preflight/jobs are not surfaced through the new GET endpoints by design — the two backing stores are independent and legacy records expire naturally via TTL.

4. Deprecation policy for /preflight/jobs
Leaving this as-is for now — the sunset date is PM-driven and we don't want to spec it without that input.

All ADR updates are in the implementation PR: #2478 — happy to take any further feedback there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants