Fix per-node Origin CA certificate provisioning #1413
Draft
simple-agent-manager[bot] wants to merge 3 commits into

Summary
Fixes VM-001 (HIGH, CWE-312/522) by removing the platform-shared Cloudflare Origin CA private key from static cloud-init user-data for new VM nodes.
New nodes now:
- Generate `/etc/sam/tls/origin-ca-key.pem` locally during cloud-init
- Call `POST /api/nodes/:id/origin-ca-certificate` with the node callback JWT

The API Worker signs the CSR through Cloudflare Origin CA using `CF_API_TOKEN`; it never returns or embeds a platform-wide private key.

Tradeoffs
This chooses per-node key material with a node-scoped signing endpoint rather than ACME or a broader secret retrieval service. It is the smallest compatible change for the current VM routing model.
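
To make the node-scoped signing idea concrete, here is a minimal sketch of the two checks such an endpoint needs: the callback token must belong to the node named in the path, and the signing request sent to Cloudflare carries only the node's CSR plus the wildcard hostnames this PR keeps. It is not the code in this PR. The claim shape (`nodeId`), the helper names, the `origin-rsa` request type, and the 365-day validity are assumptions for illustration; the URL and body fields follow Cloudflare's public Origin CA "create certificate" API as I understand it.

```typescript
// Hypothetical claim shape for the node callback JWT (not taken from the PR).
interface NodeCallbackClaims {
  nodeId: string;
}

// Request descriptor that can be handed to fetch() by the caller.
interface SignCall {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// A callback token may only request a certificate for its own node.
function isNodeScoped(claims: NodeCallbackClaims, pathNodeId: string): boolean {
  return claims.nodeId.length > 0 && claims.nodeId === pathNodeId;
}

// Hostnames the issued certificate covers for a given base domain.
function originCaHostnames(baseDomain: string): string[] {
  return ["*." + baseDomain, "*.vm." + baseDomain, baseDomain];
}

// Build the Origin CA signing call. Only the CSR leaves the node;
// the private key is never part of the request or the response.
function buildSignCall(
  csrPem: string,
  baseDomain: string,
  cfApiToken: string,
  validityDays: number = 365 // assumed value, not from the PR
): SignCall {
  if (csrPem.indexOf("-----BEGIN CERTIFICATE REQUEST-----") === -1) {
    throw new Error("expected a PEM-encoded CSR");
  }
  return {
    url: "https://api.cloudflare.com/client/v4/certificates",
    method: "POST",
    headers: {
      Authorization: "Bearer " + cfApiToken,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      csr: csrPem,
      hostnames: originCaHostnames(baseDomain),
      request_type: "origin-rsa", // assumed key type
      requested_validity: validityDays,
    }),
  };
}
```

A handler would reject the request when `isNodeScoped` returns false, then pass the descriptor from `buildSignCall` to `fetch` and return only the signed certificate to the node.
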
The issued certificates remain wildcard-scoped for `*.BASE_DOMAIN`, `*.vm.BASE_DOMAIN`, and `BASE_DOMAIN` so existing `ws-*` and node VM hostnames continue to work. The private key is no longer shared fleet-wide, but full hostname minimization would require a larger routing and certificate model change.
`apps/api/src/infra/resources/origin-ca.ts` is intentionally left in place to avoid destructive protected-resource/Pulumi state churn in this PR. Runtime provisioning and deployment secret configuration no longer copy or require `ORIGIN_CA_CERT`/`ORIGIN_CA_KEY` for new nodes.

Validation
- `pnpm --filter @simple-agent-manager/cloud-init test`
- `pnpm --filter @simple-agent-manager/api test -- tests/unit/services/origin-ca-certificates.test.ts tests/unit/node-callback-scope-enforcement.test.ts`
- `pnpm --filter @simple-agent-manager/cloud-init typecheck`
- `pnpm --filter @simple-agent-manager/api lint` (existing warnings only)
- `pnpm --filter @simple-agent-manager/api typecheck`
- `pnpm lint && pnpm typecheck && pnpm test && pnpm build` (passed; existing warnings only)
- `pnpm --filter @simple-agent-manager/api test -- tests/unit/node-callback-scope-enforcement.test.ts tests/unit/services/origin-ca-certificates.test.ts`
- `pnpm --filter @simple-agent-manager/api typecheck`

Human Gate
Draft PR only. Do not merge yet.
Per task constraints, I did not deploy to staging or provision real VMs. Staging deploy, real VM provisioning, and TLS handshake verification are explicitly left for human review.