refactor(server): fetch boot context from ContexGin HTTP API#361
Conversation
dimakis
left a comment
There was a problem hiding this comment.
Centaur Review
Found 6 issue(s) (3 warning).
server/chat.ts
Solid simplification from dynamic-import library to HTTP API with graceful fallback chain. Main concerns: tokenBudget semantics changed (now always equals tokenCount instead of reflecting configured budget), and included/trimmed section metadata is permanently empty in the new path.
- 🟡 regressions (L897): The call site
fetchBootContext(agentName)omitscontexginUrlandrepoRoot, so both use defaults. The defaultrepoRootisBASE_REPO(which could be''ifREPO_PATHis unset). If ContexGin is down andBASE_REPOis empty,localBootContextFallbackwill try to execscripts/build_boot_context.pyrelative to cwd — but this is caught and returns the zero-value fallback, so it degrades gracefully. Not a bug, but worth noting the old code passedcwd(the resolved working directory) rather thanBASE_REPOfor the fallback script. If ContexGin is offline and the session uses a worktreecwd, the fallback script runs againstBASE_REPO(the parent repo) instead of the worktree — this is likely correct (the script reads canonical source files), but the behavior changed from the old code which usedcwd. - 🟡 regressions (L212): The old code set
tokenBudgetfrom the agent recipe'scontext.boot.tokenBudgetfield (defaulting to 8000). The new code setstokenBudgetequal totokenCount(the actual token count from the server response). This meanstokenBudgetno longer reflects the configured budget — it just mirrorstokenCount. Any frontend logic that comparestokenCountvstokenBudget(e.g., to show utilization percentage) will always see 100%. The client-side protocol parser atpackages/client/src/protocol-parser.ts:260does forwardtokenBudgetto the store, so if anything renders budget vs. used, it's now a no-op comparison.[fixable] - 🟡 unsafe_assumptions (L141): The
localBootContextFallbackhardcodessourceCount: 5and a fixed list of 5 source paths, with a comment saying it's coupled toSOURCE_PATHSinscripts/build_boot_context.py. If that script changes its source list, this will silently drift out of sync. Consider reading the script's output for source metadata instead of duplicating the list, or at minimum making the hardcoded count derive from the array length (sources.lengthinstead of the literal5).[fixable] - 🔵 style (L898): The double cast
msg as unknown as Record<string, unknown>is unnecessarily defensive.BootContextMessageis a plain interface with only primitive/array fields — a singlemsg as Record<string, unknown>suffices since the types are structurally compatible. Alternatively, widen thesend()helper to acceptBootContextMessagedirectly.[fixable] - 🔵 regressions (L214): The old code populated
includedandtrimmedarrays with section-level metadata (source file, heading path, token estimates, content) from the contexgin compile result. The new HTTP API path always returns empty arrays for both. If the ContexGin HTTP API does (or will) return this data, it's being silently dropped. If the API intentionally omits it, this is fine — but theBootContextMessageinterface still declares these fields, and the client parser still parses them, so there's dead plumbing.
server/__tests__/boot-context.test.ts
Solid simplification from dynamic-import library to HTTP API with graceful fallback chain. Main concerns: tokenBudget semantics changed (now always equals tokenCount instead of reflecting configured budget), and included/trimmed section metadata is permanently empty in the new path.
- 🔵 missing_tests: No test covers the
AbortSignal.timeout(5000)behavior — e.g., verifying that a slow ContexGin response triggers the fallback within the timeout window. This is difficult to test in unit tests but worth noting as an untested edge case.
| } | ||
| })(); | ||
| // Fire-and-forget: fetch boot context from running ContexGin server | ||
| fetchBootContext(agentName) |
There was a problem hiding this comment.
🟡 regressions: The call site fetchBootContext(agentName) omits contexginUrl and repoRoot, so both use defaults. The default repoRoot is BASE_REPO (which could be '' if REPO_PATH is unset). If ContexGin is down and BASE_REPO is empty, localBootContextFallback will try to exec scripts/build_boot_context.py relative to cwd — but this is caught and returns the zero-value fallback, so it degrades gracefully. Not a bug, but worth noting the old code passed cwd (the resolved working directory) rather than BASE_REPO for the fallback script. If ContexGin is offline and the session uses a worktree cwd, the fallback script runs against BASE_REPO (the parent repo) instead of the worktree — this is likely correct (the script reads canonical source files), but the behavior changed from the old code which used cwd.
There was a problem hiding this comment.
Acknowledged — the fallback running against BASE_REPO instead of worktree cwd is indeed the correct behavior since the script reads canonical source files. No change needed.
| source: 'contexgin', | ||
| sourceCount: sources.length, | ||
| tokenCount: bootTokens, | ||
| tokenBudget: bootTokens, // server compiles to its own budget — report actual |
There was a problem hiding this comment.
🟡 regressions: The old code set tokenBudget from the agent recipe's context.boot.tokenBudget field (defaulting to 8000). The new code sets tokenBudget equal to tokenCount (the actual token count from the server response). This means tokenBudget no longer reflects the configured budget — it just mirrors tokenCount. Any frontend logic that compares tokenCount vs tokenBudget (e.g., to show utilization percentage) will always see 100%. The client-side protocol parser at packages/client/src/protocol-parser.ts:260 does forward tokenBudget to the store, so if anything renders budget vs. used, it's now a no-op comparison. [fixable]
There was a problem hiding this comment.
Fixed — added DEFAULT_TOKEN_BUDGET = 8000 constant. The ContexGin path now reads boot.tokenBudget from the response (falling back to 8000), and the local fallback uses the constant. Budget no longer mirrors tokenCount.
| return { | ||
| type: 'boot_context', | ||
| source: 'local-fallback', | ||
| // Coupled to SOURCE_PATHS in scripts/build_boot_context.py — update if that list changes |
There was a problem hiding this comment.
🟡 unsafe_assumptions: The localBootContextFallback hardcodes sourceCount: 5 and a fixed list of 5 source paths, with a comment saying it's coupled to SOURCE_PATHS in scripts/build_boot_context.py. If that script changes its source list, this will silently drift out of sync. Consider reading the script's output for source metadata instead of duplicating the list, or at minimum making the hardcoded count derive from the array length (sources.length instead of the literal 5). [fixable]
There was a problem hiding this comment.
Fixed — extracted the sources array to a variable, now uses sources.length instead of the hardcoded literal 5.
| })(); | ||
| // Fire-and-forget: fetch boot context from running ContexGin server | ||
| fetchBootContext(agentName) | ||
| .then((msg) => send(transport, msg as unknown as Record<string, unknown>)) |
There was a problem hiding this comment.
🔵 style: The double cast msg as unknown as Record<string, unknown> is unnecessarily defensive. BootContextMessage is a plain interface with only primitive/array fields — a single msg as Record<string, unknown> suffices since the types are structurally compatible. Alternatively, widen the send() helper to accept BootContextMessage directly. [fixable]
There was a problem hiding this comment.
Addressed — widened send() to accept Record<string, unknown> | BootContextMessage, removing the double cast at the call site. The internal as Record<string, unknown> remains because SessionTransport.send() requires it.
| tokenCount: bootTokens, | ||
| tokenBudget: bootTokens, // server compiles to its own budget — report actual | ||
| sources, | ||
| included: [], |
There was a problem hiding this comment.
🔵 regressions: The old code populated included and trimmed arrays with section-level metadata (source file, heading path, token estimates, content) from the contexgin compile result. The new HTTP API path always returns empty arrays for both. If the ContexGin HTTP API does (or will) return this data, it's being silently dropped. If the API intentionally omits it, this is fine — but the BootContextMessage interface still declares these fields, and the client parser still parses them, so there's dead plumbing.
There was a problem hiding this comment.
Fixed — included and trimmed are now parsed from the ContexGin HTTP response when present, instead of always returning empty arrays.
Replace the inline `import('contexgin')` library call with an HTTP fetch
to the running ContexGin server (`GET /api/agents/:name/context`).
The old approach imported contexgin as a library, read the agent recipe
from the worktree (which could be stale or missing), used the legacy
`compile()` pipeline, and fell back to a hardcoded 8k token budget.
The running server already compiles correctly with the recipe's 12k
budget — this change makes Mitzo use it.
- Extract `fetchBootContext()` as a testable, exported function
- Use `CONTEXGIN_URL` env var (already configured in .env)
- 5s timeout with graceful local-fallback on any error
- Remove ~140 lines of library import, recipe parsing, and response
mapping code
- Add 7 focused tests covering success, failure, and edge cases
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When ContexGin is unreachable, fall back to `build_boot_context.py --json` which reads the canonical source files directly (Profile, Constitution, Services) without any server dependency. This is the deterministic "old way" — always works, no budget drift. Fallback chain: ContexGin HTTP → build_boot_context.py → zero-value. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add .catch() on fire-and-forget boot context promise - Use encodeURIComponent for agent name in URL path - Remove redundant `as string` cast inside type guard - Add coupling comment on hardcoded SOURCE_PATHS list - Add test for malformed JSON response fallback - Wrap env var test cleanup in try/finally Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fix tokenBudget regression: read from ContexGin response instead of mirroring tokenCount (restores budget vs actual comparison) - Parse included/trimmed sections from ContexGin response instead of returning empty arrays - Fix hardcoded sourceCount: derive from sources.length - Widen send() to accept BootContextMessage directly (no double cast) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
4440b8d to
f550197
Compare
dimakis
left a comment
There was a problem hiding this comment.
Centaur Review
Found 5 issue(s) (1 critical) (1 warning).
server/__tests__/boot-context.test.ts
The tokenBudget test/implementation mismatch (mock omits tokenBudget but test expects 11297, implementation defaults to 8000) will cause a test failure. The core refactor from npm-imported contexgin to HTTP API is sound with good fallback chains, but needs test coverage for tokenBudget parsing and included/trimmed section parsing.
- 🔴 bugs (L104): Test expects
tokenBudget: 11297but the mock response at line 84 has notokenBudgetfield inboot. The implementation (chat.ts:199) defaults to8000whenboot.tokenBudgetis absent:typeof boot.tokenBudget === 'number' ? boot.tokenBudget : 8000. This test will fail. Either addtokenBudget: 11297to the mock'sbootobject, or change the expectation to8000.[fixable] - 🟡 missing_tests: No test case verifies that
tokenBudgetis correctly parsed when the API response does includeboot.tokenBudget. The existing happy-path test omits it, so thetypeof boot.tokenBudget === 'number'branch is never exercised. Add a test where the mock includes an explicittokenBudgetvalue distinct fromtokensto confirm both fields are parsed independently.[fixable] - 🔵 missing_tests: No test covers the
included/trimmedsection parsing (parseSectionsat chat.ts:210-218). The happy-path test receives noincluded/trimmedin the mock, so the array-filtering and field-coercion logic inparseSectionsis untested. A test with mockincluded/trimmedarrays containing objects with partial fields would cover theString(s.source ?? '')coercion paths.[fixable]
server/chat.ts
The tokenBudget test/implementation mismatch (mock omits tokenBudget but test expects 11297, implementation defaults to 8000) will cause a test failure. The core refactor from npm-imported contexgin to HTTP API is sound with good fallback chains, but needs test coverage for tokenBudget parsing and included/trimmed section parsing.
- 🔵 style (L384): The
as Record<string, unknown>cast works aroundSessionTransport.send()only acceptingRecord<string, unknown>. It's safe sinceBootContextMessageis a plain object, but the union type onsend()(line 383) is misleading — it suggests type safety that the cast immediately discards. Consider either wideningSessionTransport.send()to acceptunknownor keepingsend()typed as justRecord<string, unknown>and casting at the call site instead.[fixable] - 🔵 unsafe_assumptions (L138): The hardcoded source list in
localBootContextFallback(lines 138-144) assumes the mgmt workspace structure (memory/Profile/Working Style.md,CONSTITUTION.md,SERVICES.md). These paths won't exist in other repos that might use Mitzo. Since this is explicitly a fallback for when ContexGin is down and scoped to theREPO_PATHworkspace, it's acceptable — but a comment noting this is mgmt-specific would help future readers.
| source: 'contexgin', | ||
| sourceCount: 2, | ||
| tokenCount: 11297, | ||
| tokenBudget: 11297, |
There was a problem hiding this comment.
🔴 bugs: Test expects tokenBudget: 11297 but the mock response at line 84 has no tokenBudget field in boot. The implementation (chat.ts:199) defaults to 8000 when boot.tokenBudget is absent: typeof boot.tokenBudget === 'number' ? boot.tokenBudget : 8000. This test will fail. Either add tokenBudget: 11297 to the mock's boot object, or change the expectation to 8000. [fixable]
There was a problem hiding this comment.
Fixed — added tokenBudget: 11297 to the mock ContexGin response so the test validates end-to-end budget propagation.
| function send(transport: SessionTransport, data: Record<string, unknown>) { | ||
| if (transport.isOpen()) transport.send(data); | ||
| function send(transport: SessionTransport, data: Record<string, unknown> | BootContextMessage) { | ||
| if (transport.isOpen()) transport.send(data as Record<string, unknown>); |
There was a problem hiding this comment.
🔵 style: The as Record<string, unknown> cast works around SessionTransport.send() only accepting Record<string, unknown>. It's safe since BootContextMessage is a plain object, but the union type on send() (line 383) is misleading — it suggests type safety that the cast immediately discards. Consider either widening SessionTransport.send() to accept unknown or keeping send() typed as just Record<string, unknown> and casting at the call site instead. [fixable]
There was a problem hiding this comment.
Fair point — the union type on send() is a compromise. SessionTransport.send() is typed as Record<string, unknown> (harness package), so we can't avoid the internal cast without widening the harness interface. Keeping the union at send() at least makes the call sites cleaner.
| const content = parsed.additionalContext ?? ''; | ||
| // Rough token estimate: ~4 chars per token | ||
| const tokens = Math.ceil(content.length / 4); | ||
| const sources = [ |
There was a problem hiding this comment.
🔵 unsafe_assumptions: The hardcoded source list in localBootContextFallback (lines 138-144) assumes the mgmt workspace structure (memory/Profile/Working Style.md, CONSTITUTION.md, SERVICES.md). These paths won't exist in other repos that might use Mitzo. Since this is explicitly a fallback for when ContexGin is down and scoped to the REPO_PATH workspace, it's acceptable — but a comment noting this is mgmt-specific would help future readers.
There was a problem hiding this comment.
Acknowledged — these paths are mgmt-specific and only used when ContexGin is offline. This is an acceptable coupling since Mitzo's primary deployment is the mgmt workspace.
…ignment Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The mock ContexGin response was missing the tokenBudget field, causing the test to expect 11297 while the code defaults to 8000. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
Replaces #358 (CI was stuck on stale branch).
Test plan
🤖 Generated with Claude Code