
tilth reset --to-seed: rewind a session to the post-prep state instead of full teardown #24

Description

@samkeen

Motivation

This is a building-Tilth tool, not a user-facing feature — it speeds the Demo-run protocol per phase (proposals/v1-implementation-plan.md) where we re-run the demo over and over while iterating on the harness itself. The docs should present it as contributor/dev tooling, not part of the getting-started flow.

Today the only reset is a full teardown (tilth reset → workspace.reset_session_state: git worktree remove --force + git branch -D + rm -rf sessions/<id>/), so each test cycle forces a fresh tilth prep-feature: an interactive interview (TTYFrontend.ask_user), the slowest and least-automatable part of the loop.

When we're not testing the seed workflow — only the worker/evaluator/ledger/case mechanics of a phase — the seed is reusable. We want to re-run tilth run against the same committed seed without re-interviewing.

Proposal

tilth reset --to-seed [<session_id>] — a flag on the existing reset subcommand. It rewinds time to the instant prep-feature finished: the seed is intact, and every trace of post-seed activity is gone from both the code and the logs, as if the run never happened.

This is deliberately destructive and lossy — that's the feature. We are not preserving the prior run for comparison (if you want that, run separate sessions). The point is a pristine slate so the next run's logs and worktree carry zero noise from the last attempt. A [y/N] confirm is the safety gate. Hard-delete, no archive, no tip tag — intentional.

Anchor: the seed_committed event records the seed commit (payload.sha + payload.branch) — a clean HEAD on session/<id> before any task work. That sha is the rewind target.

Init sanity check (all must pass before the destructive confirm)

Run these as a pre-flight; on any failure, no-op with a clean error pointing at full tilth reset + tilth prep-feature:

  1. Session dir exists under sessions/<id>/.
  2. A seed_committed event exists → recover seed_sha + branch. A session that never finished prep has no rewind target → clean error.
  3. A session_prepared event exists and carries tokens_used → recover the post-interview token total (see checkpoint step below). (seed_committed implies session_prepared, but read it explicitly rather than assume.)
  4. The worktree exists on disk: checkpoint.workspace is set and the directory is present. The whole op runs git reset/git clean inside the worktree; if it's gone there's nowhere to rewind → clean error (parallel to the existing "session has no worktree recorded" message in do_resume_cmd).
  5. prd.json passes the shape check (below).
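
As a rough illustration of checks 1 to 4, the pre-flight could look something like the sketch below. It is not the proposed implementation: the function and exception names are hypothetical, and it assumes that each events.jsonl line is a JSON object naming its kind under a "type" key, that tokens_used sits in the session_prepared payload, and that checkpoint.json stores the worktree path under "workspace". Check 5 (the prd.json shape check) is left out here.

```python
import json
from pathlib import Path
from typing import Optional


class PreflightError(Exception):
    """The session cannot be rewound; nothing has been modified."""


HINT = "run a full `tilth reset` followed by `tilth prep-feature` instead"


def first_event(events: list, kind: str) -> Optional[dict]:
    # ASSUMPTION: every event is a JSON object that names its kind under "type".
    return next((e for e in events if e.get("type") == kind), None)


def preflight(session_dir: Path) -> dict:
    """Run checks 1-4 and return the values the rewind would need."""
    # Check 1: the session directory exists.
    if not session_dir.is_dir():
        raise PreflightError(f"no session directory at {session_dir}; {HINT}")

    events_path = session_dir / "events.jsonl"
    events = []
    if events_path.is_file():
        text = events_path.read_text(encoding="utf-8")
        events = [json.loads(line) for line in text.splitlines() if line.strip()]

    # Check 2: a seed_committed event gives the rewind target.
    seed = first_event(events, "seed_committed")
    if seed is None:
        raise PreflightError(f"session never finished prep (no seed_committed); {HINT}")

    # Check 3: session_prepared is read explicitly, not inferred from the seed.
    # ASSUMPTION: tokens_used sits in the event payload, like sha and branch.
    prepared = first_event(events, "session_prepared")
    if prepared is None or "tokens_used" not in prepared.get("payload", {}):
        raise PreflightError(f"no session_prepared event with tokens_used; {HINT}")

    # Check 4: the recorded worktree is still present on disk.
    checkpoint_path = session_dir / "checkpoint.json"
    checkpoint = {}
    if checkpoint_path.is_file():
        checkpoint = json.loads(checkpoint_path.read_text(encoding="utf-8"))
    workspace = checkpoint.get("workspace")
    if not workspace or not Path(workspace).is_dir():
        raise PreflightError(f"session has no worktree on disk; {HINT}")

    return {
        "seed_sha": seed["payload"]["sha"],
        "branch": seed["payload"]["branch"],
        "tokens_used": prepared["payload"]["tokens_used"],
        "worktree": Path(workspace),
    }
```

Raising before anything is touched keeps the failure path a true no-op, and returning the recovered values in one place means the destructive steps never have to re-read the event log.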

Shape check (specifics)

There is no seed/prd version stamp today, so cross-phase incompatibility cannot be detected by version. The shape check is therefore a best-effort structural sanity check that the committed prd.json still matches the contract the runner expects, reusing the seed-writer's rules (tilth/seed/sink.py: REQUIRED_PRD_KEYS, TASK_ID_RE). Concretely, prd.json must:

  • parse as a non-empty JSON list;
  • have every entry be a dict containing all REQUIRED_PRD_KEYS (id, title, description, acceptance_criteria);
  • have every id match T-NNN (TASK_ID_RE, 3+ digits) and be unique;
  • have every acceptance_criteria be a non-empty list.

This mirrors sink._validate minus its test_files coupling (the tests are already committed in the seed, so they aren't re-derived here). It catches gross shape drift — a seed written under an incompatible seeder — not subtle semantic drift. Future tightening: if we later add a SEED_VERSION stamp, this check should also assert version compatibility and reject seeds written under an older phase's seeder. Until then, --to-seed is for iterating within a phase; a structurally-valid but semantically-stale seed is the user's responsibility.
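
A minimal sketch of the four rules above, for illustration only. REQUIRED_PRD_KEYS and TASK_ID_RE are redefined locally as stand-ins built from the description in this issue (the real constants live in tilth/seed/sink.py and may differ), and check_prd_shape is a hypothetical name.

```python
import json
import re

# Stand-ins for the constants in tilth/seed/sink.py, rebuilt from the
# description in this issue; the real definitions may differ.
REQUIRED_PRD_KEYS = ("id", "title", "description", "acceptance_criteria")
TASK_ID_RE = re.compile(r"T-\d{3,}")


def check_prd_shape(raw: str) -> list:
    """Return a list of problems; an empty list means the check passed."""
    try:
        prd = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"prd.json is not valid JSON: {exc}"]
    if not isinstance(prd, list) or not prd:
        return ["prd.json must be a non-empty JSON list"]

    problems = []
    seen_ids = set()
    for index, task in enumerate(prd):
        if not isinstance(task, dict):
            problems.append(f"entry {index} is not an object")
            continue
        missing = [key for key in REQUIRED_PRD_KEYS if key not in task]
        if missing:
            problems.append(f"entry {index} is missing keys: {missing}")
            continue

        task_id = task["id"]
        if not isinstance(task_id, str) or not TASK_ID_RE.fullmatch(task_id):
            problems.append(f"entry {index} has a malformed id: {task_id!r}")
        elif task_id in seen_ids:
            problems.append(f"duplicate task id: {task_id}")
        else:
            seen_ids.add(task_id)

        criteria = task["acceptance_criteria"]
        if not isinstance(criteria, list) or not criteria:
            problems.append(f"entry {index} has empty acceptance_criteria")
    return problems
```

Collecting every problem instead of stopping at the first gives the clean error something concrete to print before pointing at full tilth reset + tilth prep-feature.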

Rewind actions

  • Worktree + branch: git reset --hard <seed_sha> + git clean -fd in the worktree. Drops all post-seed commits (per-task commits and any FAILED (...) placeholder commit from the loop.py failure path) and untracked task files. The orphaned commits become unreachable and are left to git's normal garbage collection; no tip tag is kept (no trace, by design). Use -fd, not -fdx: gitignored cruft (__pycache__/, .pytest_cache/) is intentionally left in place. It's harmless to a fresh run, and -x would risk nuking a worktree-local .venv/.env. (Documented caveat: a fresh run is byte-equivalent to post-prep modulo this gitignored cruft.)
  • prd.json: reset every task status back to pending.
  • checkpoint.json: status → prepared; tokens_used → the post-interview total from the session_prepared event. This is a "make it never happened" tool, so the run's entire token spend is erased: tokens_used returns to exactly what it was when the seed completed. (There is no last-completed-task field to clear; next-pending is derived from prd.json statuses, so resetting those is the mechanism.)
  • events.jsonl: truncate to keep every line up to and including the seed_committed event, drop everything after. This preserves the full prep trail — session_start[phase=prep-feature], the interview's intermediate events (model_call, tool_call, memory_load, prompt_assembled, …), session_prepared, and seed_committed — and erases the run. Lossy and intentional; no archive.
  • Delete outright: ledger/, progress.txt, proposed-learnings.md, summary.json, chat.html. All regenerated on the next run. seed-meta.json is preserved — it's part of the seed.
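
The rewind actions above could be sketched roughly as follows. All function names are hypothetical, and the sketch assumes events carry their kind under a "type" key and that each prd.json task stores its state under a "status" key; only the git commands and the checkpoint fields come from the text above. The file-side helpers are pure so they can be exercised without touching disk.

```python
import json
import subprocess
from pathlib import Path


def rewind_worktree(worktree: Path, seed_sha: str) -> None:
    """Drop post-seed commits and untracked files inside the worktree."""
    subprocess.run(["git", "reset", "--hard", seed_sha], cwd=worktree, check=True)
    # -fd, not -fdx: gitignored files (.venv, __pycache__/) are left alone.
    subprocess.run(["git", "clean", "-fd"], cwd=worktree, check=True)


def truncate_events(lines: list) -> list:
    """Keep every line up to and including the first seed_committed event."""
    for index, line in enumerate(lines):
        # ASSUMPTION: each line is a JSON object with its kind under "type".
        if line.strip() and json.loads(line).get("type") == "seed_committed":
            return lines[: index + 1]
    raise ValueError("no seed_committed event; nothing to rewind to")


def reset_prd_statuses(prd: list) -> list:
    """Return a copy of the task list with every task back to pending."""
    # ASSUMPTION: the per-task state lives under a "status" key.
    return [{**task, "status": "pending"} for task in prd]


def rewind_checkpoint(checkpoint: dict, prepared_tokens: int) -> dict:
    """Return a copy of the checkpoint as it stood when prep finished."""
    return {**checkpoint, "status": "prepared", "tokens_used": prepared_tokens}
```

A caller would read events.jsonl with readlines(), write back the truncated list, and then delete the regenerated artifacts listed above; ordering the git rewind first means a failed git command leaves the logs untouched.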

Postcondition

sessions/<id>/ is equivalent to its state the moment prep-feature returned (modulo timestamps and gitignored worktree cruft). The session's checkpoint status is prepared, so tilth run <workspace> picks it up via the prepared-session path (_find_prepared_sessions, source-matched) and starts as a clean first attempt — no re-interview.

Caveats to encode

  • Destructive, irreversible. The confirm prompt must say so plainly — the prior run's code and logs are unrecoverable. Contrast with full tilth reset (also destructive, but obviously so); --to-seed looks gentler, so the warning matters more.
  • Phase-boundary compatibility. Per the plan's "no backwards-compat across phase boundaries," a seed written under one phase's seeder may not be re-runnable under a later phase if the prd.json/seed shape changed. The shape check above catches structural drift; semantic drift is out of detection range until a SEED_VERSION stamp exists.
  • Lossy by design. No archive of the rewound run. This is the intended trade-off for a dev tool whose whole purpose is a pristine next-run slate — accepted explicitly (it diverges from Tilth's general "every run is inspectable" stance, which is why it's a building-Tilth tool, not a user feature).

Out of scope

Related

  • proposals/v1-implementation-plan.md: Demo-run protocol per phase (this directly speeds that loop)
  • workspace.reset_session_state, loop._do_reset, ws.commit_task/commit_seed, the seed_committed / session_prepared / commit events
  • Implementation footprint: --to-seed flag on reset_p (cli.py); a do_reset_to_seed sibling to _do_reset (loop.py) with distinct confirm copy; a ws.rewind_to_seed(worktree, seed_sha) in workspace.py; small helpers to truncate events.jsonl at the boundary and reset prd statuses. Legacy single-dash flag surface is not extended (new feature, subcommand-only).
