Skip to content

Fix compose-publish x-sam-routes overrides#1403

Draft
simple-agent-manager[bot] wants to merge 4 commits into
mainfrom
sam/x-sam-routes-explain-01kvx9
Draft

Fix compose-publish x-sam-routes overrides#1403
simple-agent-manager[bot] wants to merge 4 commits into
mainfrom
sam/x-sam-routes-explain-01kvx9

Conversation

@simple-agent-manager

Copy link
Copy Markdown
Contributor

Summary

  • Fix compose-publish route derivation so top-level x-sam-routes is an optional override layer on top of the existing ports: default.
  • Preserve the current ports: happy path while allowing explicit public routes without ports:, allowing mode: private to suppress a matching ports: route, deduplicating explicit+ports public routes, and rejecting invalid route entries before apply.
  • Update deployment guide/API/MCP guidance to say ports: is the default route mechanism and x-sam-routes is only needed for explicit overrides.

Validation

  • pnpm lint
  • pnpm typecheck
  • pnpm test
  • Additional validation run: pnpm build; focused API tests; shared compose-parser test/typecheck/lint; documentation sync grep.

Staging Verification (REQUIRED for all code changes — merge-blocking)

All checkboxes below are mandatory for any PR that changes runtime code (.ts, .tsx, .go, etc.). Write N/A: docs-only ONLY if the PR contains zero runtime code changes. See .claude/rules/13-staging-verification.md.

  • Staging deployment greenDeploy Staging workflow triggered manually and passed for this branch
  • Live app verified via Playwright — logged into app.sammy.party (staging) using test credentials and tested the application
  • Existing workflows confirmed working — navigated dashboard, project detail, and settings; confirmed pages load, data displays, navigation works, and no significant console/page errors were introduced
  • New feature/fix verified on staging — BLOCKED: see evidence below
  • Infrastructure verification completed — N/A: no infra changes
  • Mobile and desktop verification notes added for UI changes — N/A: no UI changes

Staging Verification Evidence

  • Staging workflow: https://github.com/raphaeltm/simple-agent-manager/actions/runs/28123725169 passed, including smoke-tests (12 passed).
  • Live API/app regression checks passed:
    • https://api.sammy.party/health returned 200 with {"status":"healthy"}.
    • Authenticated /api/projects, /api/projects/:id/environments, and /api/projects/:id/environments/:environmentId/public-routes returned 200.
    • https://app.sammy.party/ returned 200.
    • Authenticated Playwright smoke visited dashboard, project detail, and settings; no significant browser errors observed.
  • Existing staging compose-publish ports: fallback behavior was verified through the deployed public-routes endpoint.
  • Exact live x-sam-routes compose-publish verification is blocked:
    • Existing staging compose-publish releases do not contain x-sam-routes.
    • The normal compose-publish callback requires a workspace callback JWT/build_and_publish path unavailable from session auth.
    • A temporary environment was created for a direct D1-backed release check, but the D1 insert failed with Cloudflare API permission [code: 7500]; the temporary environment was cleaned up.
  • Merge is intentionally deferred until the exact callback/build path can be verified with workspace credentials or D1 write access.

UI Compliance Checklist (Required for UI changes)

  • Mobile-first layout verified — N/A: no UI changes
  • Accessibility checks completed — N/A: no UI changes
  • Shared UI components used or exception documented — N/A: no UI changes
  • Playwright visual audit run locally — N/A: no UI changes

End-to-End Verification (Required for multi-component changes)

  • Data flow traced from user input to final outcome with code path citations (see .claude/rules/10-e2e-verification.md)
  • Capability test exercises the complete happy path across system boundaries
  • All spec/doc assumptions about existing behavior verified against code (not just "read the code")
  • If any gap exists between automated test coverage and full E2E, manual verification steps documented below

Data Flow Trace

  1. Captured compose YAML is parsed in buildComposePublishApplyPayload() and service container ports are read with collectServiceContainerPorts() in apps/api/src/services/compose-publish-apply.ts.
  2. collectComposePublishPublicRoutes() in apps/api/src/services/deployment-routing.ts parses explicit x-sam-routes, validates service/port/mode, rejects provider services, and applies precedence so explicit service+port entries win over ports: hints.
  3. buildComposePublishApplyPayload() deletes raw service ports, assigns route targets with assignRouteTargets(), then injects only routed services as 127.0.0.1:<hostPort>:<containerPort> bindings.
  4. buildComposePublishRouteTargets() uses the same collectComposePublishPublicRoutes() + assignRouteTargets() path when reconstructing routes from stored compose-publish releases.
  5. The focused tests cover public explicit routes without ports:, private overrides suppressing ports:, duplicate suppression, invalid route errors, and stored-release route reconstruction.

Untested Gaps

  • Exact live staging verification for a newly submitted compose-publish release containing x-sam-routes is blocked by missing workspace callback JWT/build_and_publish access and Cloudflare D1 write permission. Do not merge until that path is verified.

Post-Mortem (Required for bug fix PRs)

What broke

Compose-publish deployment guidance described x-sam-routes support, but the compose-publish apply/reconstruction path only derived public routes from service ports:.

Root cause

The compose-publish raw-compose path had separate route derivation from the normalized manifest parser and never consumed top-level x-sam-routes.

Class of bug

Documentation/implementation contract drift in deployment route derivation.

Why it wasn't caught

Existing tests covered the ports: route path and normalized parser behavior, but not raw compose-publish x-sam-routes precedence or stored-release reconstruction.

Process fix included in this PR

Focused regression tests now cover compose-publish x-sam-routes behavior in apply-time route assignment and stored-release reconstruction.

Post-mortem file

tasks/archive/2026-06-24-compose-publish-x-sam-routes.md

Specialist Review Evidence (Required for agent-authored PRs)

If local subagents were used during Phase 5, list every reviewer below. Do NOT merge until every row shows PASS or ADDRESSED. If any reviewer could not complete (timeout, workspace killed, error), you MUST add the needs-human-review label and stop — do not self-merge. See .claude/rules/25-review-merge-gate.md.

  • All local reviewers completed and findings addressed before merge
  • If any reviewer did NOT complete: needs-human-review label added and merge deferred to human — N/A for reviewer completion; label is added for the staging feature-verification blocker.
Reviewer Status Outcome
task-completion-validator PASS Planned work maps to diff and tests; no UI/backend or multi-resource gaps.
cloudflare-specialist PASS No D1/KV/R2/wrangler changes; route derivation remains deterministic and malformed stored releases use existing safe empty-route behavior.
doc-sync-validator PASS Docs/MCP/API references now describe ports: as default and x-sam-routes as explicit override.
constitution-validator PASS No hardcoded internal URLs/timeouts/limits added; new numeric values are protocol/test example ports or existing configurable TCP bound constants.
test-engineer PASS Focused Vitest coverage exercises default ports, explicit public, private suppression, duplicate suppression, invalid route errors, and stored-release reconstruction.
security-auditor PASS Public/private exposure behavior is validated; provider services cannot be exposed through x-sam-routes; no credential/auth/session changes.

Exceptions (If any)

  • Scope: Exact live staging verification of a newly submitted x-sam-routes compose-publish release is incomplete.
  • Rationale: Required workspace callback JWT/build_and_publish access is unavailable from session auth, and direct D1 setup was blocked by Cloudflare API permission [code: 7500].
  • Expiration: Before merge; verify the exact staging callback/build path with workspace credentials or D1 write access.

Agent Preflight (Required)

  • Preflight completed before code changes

Classification

  • external-api-change
  • cross-component-change
  • business-logic-change
  • public-surface-change
  • docs-sync-change
  • security-sensitive-change
  • ui-change
  • infra-change

External References

N/A: implementation was based on repository code, tests, and SAM deployment docs/tool guidance.

Codebase Impact Analysis

  • apps/api/src/services/deployment-routing.ts: shared compose-publish route collection and stored-release target reconstruction.
  • apps/api/src/services/compose-publish-apply.ts: apply-time route extraction, loopback binding rewrite, and private override behavior.
  • apps/api/tests/unit/services/*: regression coverage for apply-time and reconstruction behavior.
  • apps/api/src/routes/mcp/deployment-guide-tools.ts, apps/www/src/content/docs/docs/guides/app-deployments.md, apps/www/src/content/docs/docs/reference/api.md: docs/tool wording.
  • packages/shared/src/compose-parser/parse.ts: diagnostic wording aligned with current route hints.

Documentation & Specs

Updated deployment guide, API reference, and MCP deployment guide tool text to document ports: as the default and x-sam-routes as the override.

Constitution & Risk Check

Checked Principle XI/no hardcoded values and deployment exposure risk. The change avoids new hardcoded deployment constants, keeps hostnames/host ports assigned through existing configurable route options, and validates private/public exposure semantics with tests.

@simple-agent-manager

Copy link
Copy Markdown
Contributor Author

Merge is deferred for staging verification.

What passed:

What is still missing:

  • Exact live staging verification of a newly submitted compose-publish release containing x-sam-routes.

Why it is blocked:

  • Existing staging compose-publish releases do not include x-sam-routes.
  • The normal compose-publish callback path requires a workspace callback JWT/build_and_publish flow that is not available through session login.
  • A direct D1-backed release setup attempt was blocked by Cloudflare API permission [code: 7500]; the temporary environment created for that attempt was cleaned up.

Do not merge until the exact staging callback/build path can be verified with workspace credentials or D1 write access.

@simple-agent-manager simple-agent-manager Bot added the needs-human-review Agent could not complete all review gates — human must approve before merge label Jun 24, 2026
@codspeed-hq

codspeed-hq Bot commented Jun 24, 2026

Copy link
Copy Markdown
Contributor

Merging this PR will improve performance by 11.37%

⚠️ Different runtime environments detected

Some benchmarks with significant performance changes were compared across different runtime environments,
which may affect the accuracy of the results.

Open the report in CodSpeed to investigate

⚡ 1 improved benchmark
✅ 5 untouched benchmarks

Performance Changes

Benchmark BASE HEAD Efficiency
getModelGroupsForAgent - all agents 20.4 µs 18.3 µs +11.37%

Tip

Curious why this is faster? Comment @codspeedbot explain why this is faster on this PR, or directly use the CodSpeed MCP with your agent.


Comparing sam/x-sam-routes-explain-01kvx9 (fabb294) with main (8c6f3ba)

Open in CodSpeed

…ain-01kvx9

# Conflicts:
#	apps/api/src/routes/mcp/deployment-guide-tools.ts
@sonarqubecloud

Copy link
Copy Markdown

@simple-agent-manager

Copy link
Copy Markdown
Contributor Author

Current rescue status after rebasing/merging main into this PR:\n\n- Branch conflict is resolved and pushed at fabb294111bd85fd690bf492b94cd0d47834801a; GitHub reports the PR as MERGEABLE.\n- CI is green except Specialist Review Evidence, which is failing only because the PR still has needs-human-review.\n- Local validation passed: pnpm lint, pnpm typecheck, pnpm test, pnpm build, pnpm quality:migration-safety, focused API tests. Local go test ./... in packages/vm-agent could not run because Go is not installed in this container.\n- Staging deploy passed on this branch: https://github.com/raphaeltm/simple-agent-manager/actions/runs/28164472227, including smoke-tests (12 passed).\n- Local authenticated staging smoke also passed after installing Playwright runtime deps: SMOKE_TEST_TOKEN=... pnpm exec playwright test --config=tests/smoke/playwright.config.ts -> 12 passed.\n- Safe staging D1 observation: SELECT count(*) ... source='compose-publish' AND manifest LIKE '%x-sam-routes%' returned 0, so there is no existing compose-publish x-sam-routes release to inspect.\n- SAM MCP deployment access still reports no allowed environments for this agent (list_deployment_environments returns empty), so I still cannot run the exact build_and_publish/workspace callback path for a fresh x-sam-routes compose-publish release from this session.\n\nI am intentionally leaving needs-human-review and draft status in place because the PR body's exact-feature staging gate remains unsatisfied. The code/merge conflict/staging-regression side is cleaned up; the remaining decision is whether a human with deployment environment/callback access verifies the exact compose-publish path or explicitly accepts the unit/integration/staging-regression evidence.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

needs-human-review Agent could not complete all review gates — human must approve before merge

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant