feat(sdk): session_reset() for mid-conversation segment boundaries #13

Merged
OmkarRayAI merged 1 commit into main from feat/session-reset on Jun 5, 2026
Conversation

@OmkarRayAI

Owner

Summary

Ships wikitrace.session_reset() to close one Twitter-feedback gap: when a user clears chat history or a planner restarts mid-trace, callers had no way to mark the boundary. Spans before and after the reset all showed up as one long thread.

API

with wikitrace.session(id="conv-1", user="alice"):
    chain.invoke({"input": q1})              # segment 0 (no session_segment attr)
    wikitrace.session_reset()                # returns 1
    chain.invoke({"input": "start over"})    # session_segment=1
  • Same session_id throughout — cost rollups and user attribution stay grouped
  • Distinct session_segment integers so the dashboard can render separate threads
  • Outside an active session it's a no-op returning 0 — safe to call from library code
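For readers who want a mental model of these semantics, here is a minimal sketch built on `contextvars`. It is not the wikitrace implementation; `_session`, `current_segment`, and the shape of the state dict are assumptions made for illustration. Only the behaviour (counter starts at 0, each reset returns the new segment, no-op returning 0 outside a session) comes from the description above.

```python
import contextvars
from contextlib import contextmanager

# Hypothetical stand-in for the SDK's per-context session state.
_session = contextvars.ContextVar("session", default=None)


@contextmanager
def session(id, user=None):
    """Open a session; its segment counter starts at 0."""
    token = _session.set({"session_id": id, "user_id": user, "segment": 0})
    try:
        yield
    finally:
        _session.reset(token)  # restore whatever was active before


def session_reset():
    """Start a new segment and return its index, or 0 outside a session."""
    state = _session.get()
    if state is None:
        return 0  # no-op: safe for library code to call defensively
    state["segment"] += 1
    return state["segment"]


def current_segment():
    """Segment index that new spans would be stamped with (illustrative helper)."""
    state = _session.get()
    return 0 if state is None else state["segment"]
```

Keeping the counter inside the session state, rather than minting a new session_id, is what lets cost rollups and user attribution stay grouped across resets.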

Test plan

  • test_session_reset_segments_under_same_session_id — three turns, two resets, assert session_id stable and session_segment increments
  • test_session_reset_outside_session_is_noop — returns 0, does not raise
  • Full suite: pytest -q tests/ → 98 passed, 14 skipped (no regressions)

What this is NOT

  • Not the dashboard tree view (Twitter Q1's other half) — that's a separate, larger PR (~1 day of dashboard work).
  • Not subagent cost rollup (Twitter Q2) — also separate.

This is the smallest possible SDK move that lets you reply "shipped" to the segment-boundary part of the feedback today.

🤖 Generated with Claude Code

Closes a Twitter-feedback gap: when a user clears chat history or a
planner restarts from a checkpoint mid-trace, callers had no way to
mark the boundary. Spans before and after the reset would all show
up as one long thread in the dashboard.

API
- wikitrace.session_reset() returns the new segment integer (1, 2, ...)
- Spans before the first reset carry no session_segment attr (segment 0)
- Spans after each reset carry session_segment=<n>
- Same session_id throughout, so cost rollups and user attribution
  continue to group across the entire conversation
- Outside an active session it is a no-op returning 0, so library
  code that calls it defensively does not crash unwrapped callers
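One consequence of the rules above is that readers of span data must treat a missing session_segment as segment 0. A rough sketch of both sides follows; `span_attrs` and `segment_of` are hypothetical helper names, not SDK functions, and the attribute dict shape is assumed.

```python
def span_attrs(session_id, segment, user_id=None):
    """Build session-related span attributes (hypothetical helper).

    session_segment is only written after the first reset, so spans from
    the initial segment look exactly like spans recorded before this change.
    """
    attrs = {"session_id": session_id}
    if user_id is not None:
        attrs["user_id"] = user_id
    if segment > 0:
        attrs["session_segment"] = segment
    return attrs


def segment_of(attrs):
    """Read side: a missing session_segment means segment 0."""
    return int(attrs.get("session_segment", 0))
```

Omitting the attribute for segment 0 keeps existing traces valid without a backfill, at the cost of every consumer needing the default-to-0 read.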

Tests
- test_session_reset_segments_under_same_session_id: three turns,
  two resets, asserts session_id stable and session_segment increments
- test_session_reset_outside_session_is_noop: returns 0, does not raise

README
- New Mid-conversation resets subsection under the Sessions block,
  with a code example

Verified: pytest -q tests/ -> 98 passed (up from 96), 14 skipped.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@OmkarRayAI OmkarRayAI merged commit 7a40e22 into main Jun 5, 2026
4 checks passed
@OmkarRayAI OmkarRayAI deleted the feat/session-reset branch June 5, 2026 12:51
OmkarRayAI pushed a commit that referenced this pull request Jun 5, 2026
Closes Twitter Q1B: a single page that walks one trace start-to-finish
in human-readable form. Previously the dashboard rendered one request
at a time; understanding what an agent actually did across N llm_calls,
M tool_calls, and possibly nested subagents required clicking through
each individually.

What lands

New route: app/app/(obs)/traces/[id]/page.tsx
- Header: trace_id + spans count + total latency + total cost +
  total tokens + session_id + user_id
- Body: collapsible tree using native HTML details elements
  (server-rendered, zero client JS for collapse). Auto-expands
  the root and any agent_call nodes; leaves llm_calls and
  tool_calls collapsed by default for scannability.
- Per-node: kind badge (Agent / LLM / Tool / Action / Retrieve /
  Judge), agent or tool or model label, duration, tokens, cost,
  start time. agent_call nodes also show subtree rollup (cost +
  nested-subagent count + descendant llm count).
- Inline content: pulls prompt and response out of attrs.request
  and attrs.response (Helicone-proxy shape) plus generic fallbacks
  (input, output, prompt, answer, arg.0, question). Truncated to
  1500 chars with a scrollable container; full data is in the raw
  span dump at the bottom.
- Events drawer per span (collapsed by default, first 50 events).
- Per-segment grouping: when any span carries session_segment > 0
  (from PR #13 mid-conversation reset), root spans are grouped
  into Initial conversation / After reset #1 / After reset #2.
- Raw JSON drawer at the bottom for power users.
- Error spans get a red dot indicator.

app/lib/traces.ts
- Exported treeCostFromSpans so the page can decorate agent_call
  nodes with their subtree totals without re-loading spans.

Reuses existing infra
- loadTraceSpansAsync: cloud-mode-aware loader from PR #1.
  Cloud-mode tenants only see their own trace spans.
- treeCostFromSpans: subtree rollup logic from PR #14.
- (obs) layout, glass styling, eyebrow/PageTitle widgets.

Verified
- npx tsc --noEmit is clean for all files touched by this change
- pytest -q tests/ -> 106 passed, 14 skipped (no regressions)
- Renders correctly against fixtures with nested subagent calls,
  per-span events, error spans, multi-segment sessions, missing
  attrs (defensive fallbacks throughout)
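The inline-content fallbacks and the missing-attrs case mentioned above could look roughly like this. This is a Python sketch of the idea, not the page's TypeScript. The key names and the 1500-character limit come from the description; which fallback keys count as prompt-side versus response-side, their order, and the function names are assumptions.

```python
import json

MAX_CHARS = 1500
# Assumed split and ordering of the fallback keys listed in the PR.
PROMPT_KEYS = ("request", "input", "prompt", "question", "arg.0")
RESPONSE_KEYS = ("response", "output", "answer")


def _first_present(attrs, keys):
    """Return the first non-empty value among keys as text, else None."""
    for key in keys:
        value = (attrs or {}).get(key)
        if value not in (None, ""):
            # Non-string payloads (dicts, lists) are shown as JSON text.
            return value if isinstance(value, str) else json.dumps(value, default=str)
    return None


def _truncate(text, limit=MAX_CHARS):
    """Cut long text to limit characters; None passes through."""
    if text is None or len(text) <= limit:
        return text
    return text[:limit]


def extract_content(attrs):
    """Best-effort prompt/response for inline display; never raises on missing attrs."""
    return {
        "prompt": _truncate(_first_present(attrs, PROMPT_KEYS)),
        "response": _truncate(_first_present(attrs, RESPONSE_KEYS)),
    }
```

Returning None rather than an empty string lets the renderer skip the inline block entirely when a span has nothing to show.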

Linked from /agents
- The /agents page rows already link href=/traces/{trace_id}
  (from PR #14). With this PR shipped, that link now lands on the
  full conversation walkthrough instead of the wiki-themed
  engineering view.

This closes the third and largest of the Twitter feedback gaps
(after PR #13 session_reset and PR #14 agent cost rollup).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
OmkarRayAI added a commit that referenced this pull request Jun 5, 2026
…#15)

Co-authored-by: Omkar Ray <your real email>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>