Skip to content

fix: persist boot context in system prompt and sticky banner#369

Open
dimakis wants to merge 2 commits into
mainfrom
fix/boot-context-persistence
Open

fix: persist boot context in system prompt and sticky banner#369
dimakis wants to merge 2 commits into
mainfrom
fix/boot-context-persistence

Conversation

@dimakis
Copy link
Copy Markdown
Owner

@dimakis dimakis commented Jun 5, 2026

Summary

  • Server: Boot context was fire-and-forget — sent as a UI event but never included in the system prompt. On SDK context compaction the model lost it entirely. Now fetchBootContext() is awaited and fullMarkdown is appended to systemPromptAppend so it survives compaction.
  • Frontend: SessionBanner sat outside the scroll container as a flex sibling, causing it to be pushed offscreen on mobile Safari as messages accumulated. Moved inside ChatArea's scroll container with position: sticky; top: 0 so it pins to the top regardless of scroll position.

Test plan

  • Start a new session on mobile — verify boot context banner appears
  • Send 3-4 messages — verify banner stays pinned at top of scroll area
  • Scroll through long conversation — banner stays visible
  • Verify boot context still displays on desktop
  • Verify model retains boot context behavior after extended conversation (compaction)

🤖 Generated with Claude Code

…y banner

Boot context was fire-and-forget — sent as a UI event but never included
in the system prompt. On SDK context compaction the model lost it entirely.
The banner also sat outside the scroll container as a flex sibling, which
caused it to be pushed offscreen on mobile Safari as messages accumulated.

- Await fetchBootContext() and append fullMarkdown to systemPromptAppend
  so it survives compaction
- Move SessionBanner inside ChatArea's scroll container with
  position: sticky so it pins to the top regardless of scroll position
- Keep the UI event send for display/replay purposes

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown
Owner Author

@dimakis dimakis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Centaur Review

Found 4 issue(s) (2 warning).

server/chat.ts

Solid refactor that correctly moves boot context from fire-and-forget to awaited (safe — fetchBootContext never throws) and promotes SessionBanner into ChatArea for sticky scroll behavior. Main concern is the added latency on session start (up to 10s worst case) and missing test coverage for the system prompt append logic.

  • 🟡 unsafe_assumptions (L905): Changing fetchBootContext from fire-and-forget to await adds up to 10 seconds of latency before the SDK query() call (5s ContexGin timeout + 5s Python fallback timeout in the worst case). The old fire-and-forget approach let the session start immediately while boot context loaded in parallel. This tradeoff is intentional (system prompt needs the content), but worth noting: if ContexGin is slow/down and the Python fallback is also slow, session start stalls noticeably.
  • 🔵 regressions (L912): The boot context WS message is now sent before query() starts, so there's no SDK session ID yet for new (non-resume) sessions. bootSid falls through to registry.get(clientId)?.sessionId, which is the Mitzo worktree ID (set during session registration), not the SDK session UUID. This is consistent with pre-existing behavior (the old .then() also used the same fallback), so not a regression — but the comment 'Include sessionId for consistency' from the old code was removed, losing that documentation.
  • 🟡 missing_tests (L906): The new behavior — boot context markdown being appended to the system prompt — has no test coverage. The existing boot-context.test.ts tests fetchBootContext in isolation, but no test verifies that systemPromptAppend includes the boot context content or that it's correctly gated on fullMarkdown being truthy. An integration-level test (or a unit test for the prompt assembly logic) would catch regressions if the append format changes. [fixable]

frontend/src/pages/ChatView.tsx

Solid refactor that correctly moves boot context from fire-and-forget to awaited (safe — fetchBootContext never throws) and promotes SessionBanner into ChatArea for sticky scroll behavior. Main concern is the added latency on session start (up to 10s worst case) and missing test coverage for the system prompt append logic.

  • 🔵 style (L5): The removal of the SessionBanner import leaves a blank line between the ChatInput import and the VoiceSettings import. Minor nit — the blank line is harmless but may be caught by a formatter rule expecting consistent spacing between import groups. [fixable]

Comment thread server/chat.ts

// Fetch boot context BEFORE building system prompt so it's part of the
// system prompt append and survives SDK context compaction.
const bootContextMsg = await fetchBootContext(agentName);
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 unsafe_assumptions: Changing fetchBootContext from fire-and-forget to await adds up to 10 seconds of latency before the SDK query() call (5s ContexGin timeout + 5s Python fallback timeout in the worst case). The old fire-and-forget approach let the session start immediately while boot context loaded in parallel. This tradeoff is intentional (system prompt needs the content), but worth noting: if ContexGin is slow/down and the Python fallback is also slow, session start stalls noticeably.

Comment thread server/chat.ts Outdated

// Send boot context to UI for display (separate from system prompt injection)
const bootSid = options.resume ?? registry.get(clientId)?.sessionId;
send(transport, { ...bootContextMsg, ...(bootSid ? { sessionId: bootSid } : {}) });
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔵 regressions: The boot context WS message is now sent before query() starts, so there's no SDK session ID yet for new (non-resume) sessions. bootSid falls through to registry.get(clientId)?.sessionId, which is the Mitzo worktree ID (set during session registration), not the SDK session UUID. This is consistent with pre-existing behavior (the old .then() also used the same fallback), so not a regression — but the comment 'Include sessionId for consistency' from the old code was removed, losing that documentation.

Comment thread server/chat.ts
// Fetch boot context BEFORE building system prompt so it's part of the
// system prompt append and survives SDK context compaction.
const bootContextMsg = await fetchBootContext(agentName);
const bootContextAppend = bootContextMsg.fullMarkdown
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 missing_tests: The new behavior — boot context markdown being appended to the system prompt — has no test coverage. The existing boot-context.test.ts tests fetchBootContext in isolation, but no test verifies that systemPromptAppend includes the boot context content or that it's correctly gated on fullMarkdown being truthy. An integration-level test (or a unit test for the prompt assembly logic) would catch regressions if the append format changes. [fixable]

Comment thread frontend/src/pages/ChatView.tsx Outdated
import { ChatArea } from '../components/ChatArea';
import { ChatInput } from '../components/ChatInput';
import { SessionBanner } from '../components/SessionBanner';

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔵 style: The removal of the SessionBanner import leaves a blank line between the ChatInput import and the VoiceSettings import. Minor nit — the blank line is harmless but may be caught by a formatter rule expecting consistent spacing between import groups. [fixable]

- Use stateSessionId instead of re-deriving bootSid (clearer, same value)
- Use session directly instead of registry.get() (already resolved)
- Add tests for boot context system prompt append pattern
- Fix blank import lines in ChatView and DesktopChatView

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown
Owner Author

@dimakis dimakis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Centaur Review

Found 4 issue(s) (2 warning).

server/chat.ts

The core change (boot context in system prompt for compaction survival) is sound, but awaiting the fetch adds up to 5s startup latency on ContexGin failures, and the sessionId may be undefined for non-resume sessions at the point of send/persist.

  • 🟡 regressions (L905): Changing fetchBootContext from fire-and-forget to awaited adds up to 5 seconds (the AbortSignal.timeout) to session startup latency when ContexGin is slow or unreachable, plus additional time for the local fallback subprocess. The old code let the session start immediately while boot context loaded in the background. Consider whether this latency tradeoff is acceptable — the user will see a blank chat for longer on ContexGin timeouts. [fixable]
  • 🟡 regressions (L912): The old fire-and-forget code looked up session.sessionId via registry.get(clientId) at resolution time (after the SDK query started and assigned the real sessionId). The new code uses stateSessionId which is options.resume ?? session.sessionId computed before the SDK query starts. For non-resume sessions, session.sessionId is undefined at this point (it gets set later by the query loop), so the boot_context message sent to the UI and the EventStore persistence will both lack a sessionId. The old code had the same race for the initial send but at least had a chance of resolving it if the fetch was slow enough. Verify this doesn't break client-side session demuxing. [fixable]

server/__tests__/boot-context.test.ts

The core change (boot context in system prompt for compaction survival) is sound, but awaiting the fetch adds up to 5s startup latency on ContexGin failures, and the sessionId may be undefined for non-resume sessions at the point of send/persist.

  • 🔵 missing_tests (L232): The two new tests reconstruct the append pattern inline and assert against it, but they only test fetchBootContext return values — not that chat.ts actually appends the result to systemPromptAppend. The append logic in chat.ts (lines 906-908, 932) has no direct test coverage. An integration test that verifies the system prompt passed to query() contains the boot context would be more valuable. [fixable]
  • 🔵 style (L249): The test at lines 249-254 is tautological — it constructs a template string and then asserts it contains substrings that are obviously present. It doesn't test any production code path; it tests the test's own string interpolation. The meaningful assertion is on line 248 (result.fullMarkdown === markdown), which the existing test structure already covers. [fixable]

Comment thread server/chat.ts

// Fetch boot context BEFORE building system prompt so it's part of the
// system prompt append and survives SDK context compaction.
const bootContextMsg = await fetchBootContext(agentName);
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 regressions: Changing fetchBootContext from fire-and-forget to awaited adds up to 5 seconds (the AbortSignal.timeout) to session startup latency when ContexGin is slow or unreachable, plus additional time for the local fallback subprocess. The old code let the session start immediately while boot context loaded in the background. Consider whether this latency tradeoff is acceptable — the user will see a blank chat for longer on ContexGin timeouts. [fixable]

Comment thread server/chat.ts

// Send boot context to UI for display (separate from system prompt injection).
// stateSessionId is the worktree-level session ID, already computed above.
send(transport, { ...bootContextMsg, ...(stateSessionId ? { sessionId: stateSessionId } : {}) });
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 regressions: The old fire-and-forget code looked up session.sessionId via registry.get(clientId) at resolution time (after the SDK query started and assigned the real sessionId). The new code uses stateSessionId which is options.resume ?? session.sessionId computed before the SDK query starts. For non-resume sessions, session.sessionId is undefined at this point (it gets set later by the query loop), so the boot_context message sent to the UI and the EventStore persistence will both lack a sessionId. The old code had the same race for the initial send but at least had a chance of resolving it if the fetch was slow enough. Verify this doesn't break client-side session demuxing. [fixable]

expect(result.fullMarkdown).toBe('json fallback');
});

it('returns fullMarkdown suitable for system prompt append', async () => {
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔵 missing_tests: The two new tests reconstruct the append pattern inline and assert against it, but they only test fetchBootContext return values — not that chat.ts actually appends the result to systemPromptAppend. The append logic in chat.ts (lines 906-908, 932) has no direct test coverage. An integration test that verifies the system prompt passed to query() contains the boot context would be more valuable. [fixable]

const result = await fetchBootContext('mitzo-conversational', CONTEXGIN_URL);

expect(result.fullMarkdown).toBe(markdown);
// Verify the append pattern matches what chat.ts does:
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔵 style: The test at lines 249-254 is tautological — it constructs a template string and then asserts it contains substrings that are obviously present. It doesn't test any production code path; it tests the test's own string interpolation. The meaningful assertion is on line 248 (result.fullMarkdown === markdown), which the existing test structure already covers. [fixable]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant