fix: persist boot context in system prompt and sticky banner#369
fix: persist boot context in system prompt and sticky banner#369dimakis wants to merge 2 commits into
Conversation
…y banner Boot context was fire-and-forget — sent as a UI event but never included in the system prompt. On SDK context compaction the model lost it entirely. The banner also sat outside the scroll container as a flex sibling, which caused it to be pushed offscreen on mobile Safari as messages accumulated. - Await fetchBootContext() and append fullMarkdown to systemPromptAppend so it survives compaction - Move SessionBanner inside ChatArea's scroll container with position: sticky so it pins to the top regardless of scroll position - Keep the UI event send for display/replay purposes Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
dimakis
left a comment
There was a problem hiding this comment.
Centaur Review
Found 4 issue(s) (2 warning).
server/chat.ts
Solid refactor that correctly moves boot context from fire-and-forget to awaited (safe — fetchBootContext never throws) and promotes SessionBanner into ChatArea for sticky scroll behavior. Main concern is the added latency on session start (up to 10s worst case) and missing test coverage for the system prompt append logic.
- 🟡 unsafe_assumptions (L905): Changing fetchBootContext from fire-and-forget to
awaitadds up to 10 seconds of latency before the SDKquery()call (5s ContexGin timeout + 5s Python fallback timeout in the worst case). The old fire-and-forget approach let the session start immediately while boot context loaded in parallel. This tradeoff is intentional (system prompt needs the content), but worth noting: if ContexGin is slow/down and the Python fallback is also slow, session start stalls noticeably. - 🔵 regressions (L912): The boot context WS message is now sent before
query()starts, so there's no SDK session ID yet for new (non-resume) sessions.bootSidfalls through toregistry.get(clientId)?.sessionId, which is the Mitzo worktree ID (set during session registration), not the SDK session UUID. This is consistent with pre-existing behavior (the old.then()also used the same fallback), so not a regression — but the comment 'Include sessionId for consistency' from the old code was removed, losing that documentation. - 🟡 missing_tests (L906): The new behavior — boot context markdown being appended to the system prompt — has no test coverage. The existing
boot-context.test.tstestsfetchBootContextin isolation, but no test verifies thatsystemPromptAppendincludes the boot context content or that it's correctly gated onfullMarkdownbeing truthy. An integration-level test (or a unit test for the prompt assembly logic) would catch regressions if the append format changes.[fixable]
frontend/src/pages/ChatView.tsx
Solid refactor that correctly moves boot context from fire-and-forget to awaited (safe — fetchBootContext never throws) and promotes SessionBanner into ChatArea for sticky scroll behavior. Main concern is the added latency on session start (up to 10s worst case) and missing test coverage for the system prompt append logic.
- 🔵 style (L5): The removal of the
SessionBannerimport leaves a blank line between theChatInputimport and theVoiceSettingsimport. Minor nit — the blank line is harmless but may be caught by a formatter rule expecting consistent spacing between import groups.[fixable]
|
|
||
| // Fetch boot context BEFORE building system prompt so it's part of the | ||
| // system prompt append and survives SDK context compaction. | ||
| const bootContextMsg = await fetchBootContext(agentName); |
There was a problem hiding this comment.
🟡 unsafe_assumptions: Changing fetchBootContext from fire-and-forget to await adds up to 10 seconds of latency before the SDK query() call (5s ContexGin timeout + 5s Python fallback timeout in the worst case). The old fire-and-forget approach let the session start immediately while boot context loaded in parallel. This tradeoff is intentional (system prompt needs the content), but worth noting: if ContexGin is slow/down and the Python fallback is also slow, session start stalls noticeably.
|
|
||
| // Send boot context to UI for display (separate from system prompt injection) | ||
| const bootSid = options.resume ?? registry.get(clientId)?.sessionId; | ||
| send(transport, { ...bootContextMsg, ...(bootSid ? { sessionId: bootSid } : {}) }); |
There was a problem hiding this comment.
🔵 regressions: The boot context WS message is now sent before query() starts, so there's no SDK session ID yet for new (non-resume) sessions. bootSid falls through to registry.get(clientId)?.sessionId, which is the Mitzo worktree ID (set during session registration), not the SDK session UUID. This is consistent with pre-existing behavior (the old .then() also used the same fallback), so not a regression — but the comment 'Include sessionId for consistency' from the old code was removed, losing that documentation.
| // Fetch boot context BEFORE building system prompt so it's part of the | ||
| // system prompt append and survives SDK context compaction. | ||
| const bootContextMsg = await fetchBootContext(agentName); | ||
| const bootContextAppend = bootContextMsg.fullMarkdown |
There was a problem hiding this comment.
🟡 missing_tests: The new behavior — boot context markdown being appended to the system prompt — has no test coverage. The existing boot-context.test.ts tests fetchBootContext in isolation, but no test verifies that systemPromptAppend includes the boot context content or that it's correctly gated on fullMarkdown being truthy. An integration-level test (or a unit test for the prompt assembly logic) would catch regressions if the append format changes. [fixable]
| import { ChatArea } from '../components/ChatArea'; | ||
| import { ChatInput } from '../components/ChatInput'; | ||
| import { SessionBanner } from '../components/SessionBanner'; | ||
|
|
There was a problem hiding this comment.
🔵 style: The removal of the SessionBanner import leaves a blank line between the ChatInput import and the VoiceSettings import. Minor nit — the blank line is harmless but may be caught by a formatter rule expecting consistent spacing between import groups. [fixable]
- Use stateSessionId instead of re-deriving bootSid (clearer, same value) - Use session directly instead of registry.get() (already resolved) - Add tests for boot context system prompt append pattern - Fix blank import lines in ChatView and DesktopChatView Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
dimakis
left a comment
There was a problem hiding this comment.
Centaur Review
Found 4 issue(s) (2 warning).
server/chat.ts
The core change (boot context in system prompt for compaction survival) is sound, but awaiting the fetch adds up to 5s startup latency on ContexGin failures, and the sessionId may be undefined for non-resume sessions at the point of send/persist.
- 🟡 regressions (L905): Changing fetchBootContext from fire-and-forget to awaited adds up to 5 seconds (the AbortSignal.timeout) to session startup latency when ContexGin is slow or unreachable, plus additional time for the local fallback subprocess. The old code let the session start immediately while boot context loaded in the background. Consider whether this latency tradeoff is acceptable — the user will see a blank chat for longer on ContexGin timeouts.
[fixable] - 🟡 regressions (L912): The old fire-and-forget code looked up
session.sessionIdviaregistry.get(clientId)at resolution time (after the SDK query started and assigned the real sessionId). The new code usesstateSessionIdwhich isoptions.resume ?? session.sessionIdcomputed before the SDK query starts. For non-resume sessions,session.sessionIdis undefined at this point (it gets set later by the query loop), so the boot_context message sent to the UI and the EventStore persistence will both lack a sessionId. The old code had the same race for the initial send but at least had a chance of resolving it if the fetch was slow enough. Verify this doesn't break client-side session demuxing.[fixable]
server/__tests__/boot-context.test.ts
The core change (boot context in system prompt for compaction survival) is sound, but awaiting the fetch adds up to 5s startup latency on ContexGin failures, and the sessionId may be undefined for non-resume sessions at the point of send/persist.
- 🔵 missing_tests (L232): The two new tests reconstruct the append pattern inline and assert against it, but they only test
fetchBootContextreturn values — not thatchat.tsactually appends the result tosystemPromptAppend. The append logic in chat.ts (lines 906-908, 932) has no direct test coverage. An integration test that verifies the system prompt passed toquery()contains the boot context would be more valuable.[fixable] - 🔵 style (L249): The test at lines 249-254 is tautological — it constructs a template string and then asserts it contains substrings that are obviously present. It doesn't test any production code path; it tests the test's own string interpolation. The meaningful assertion is on line 248 (
result.fullMarkdown === markdown), which the existing test structure already covers.[fixable]
|
|
||
| // Fetch boot context BEFORE building system prompt so it's part of the | ||
| // system prompt append and survives SDK context compaction. | ||
| const bootContextMsg = await fetchBootContext(agentName); |
There was a problem hiding this comment.
🟡 regressions: Changing fetchBootContext from fire-and-forget to awaited adds up to 5 seconds (the AbortSignal.timeout) to session startup latency when ContexGin is slow or unreachable, plus additional time for the local fallback subprocess. The old code let the session start immediately while boot context loaded in the background. Consider whether this latency tradeoff is acceptable — the user will see a blank chat for longer on ContexGin timeouts. [fixable]
|
|
||
| // Send boot context to UI for display (separate from system prompt injection). | ||
| // stateSessionId is the worktree-level session ID, already computed above. | ||
| send(transport, { ...bootContextMsg, ...(stateSessionId ? { sessionId: stateSessionId } : {}) }); |
There was a problem hiding this comment.
🟡 regressions: The old fire-and-forget code looked up session.sessionId via registry.get(clientId) at resolution time (after the SDK query started and assigned the real sessionId). The new code uses stateSessionId which is options.resume ?? session.sessionId computed before the SDK query starts. For non-resume sessions, session.sessionId is undefined at this point (it gets set later by the query loop), so the boot_context message sent to the UI and the EventStore persistence will both lack a sessionId. The old code had the same race for the initial send but at least had a chance of resolving it if the fetch was slow enough. Verify this doesn't break client-side session demuxing. [fixable]
| expect(result.fullMarkdown).toBe('json fallback'); | ||
| }); | ||
|
|
||
| it('returns fullMarkdown suitable for system prompt append', async () => { |
There was a problem hiding this comment.
🔵 missing_tests: The two new tests reconstruct the append pattern inline and assert against it, but they only test fetchBootContext return values — not that chat.ts actually appends the result to systemPromptAppend. The append logic in chat.ts (lines 906-908, 932) has no direct test coverage. An integration test that verifies the system prompt passed to query() contains the boot context would be more valuable. [fixable]
| const result = await fetchBootContext('mitzo-conversational', CONTEXGIN_URL); | ||
|
|
||
| expect(result.fullMarkdown).toBe(markdown); | ||
| // Verify the append pattern matches what chat.ts does: |
There was a problem hiding this comment.
🔵 style: The test at lines 249-254 is tautological — it constructs a template string and then asserts it contains substrings that are obviously present. It doesn't test any production code path; it tests the test's own string interpolation. The meaningful assertion is on line 248 (result.fullMarkdown === markdown), which the existing test structure already covers. [fixable]
Summary
fetchBootContext()is awaited andfullMarkdownis appended tosystemPromptAppendso it survives compaction.ChatArea's scroll container withposition: sticky; top: 0so it pins to the top regardless of scroll position.Test plan
🤖 Generated with Claude Code