Merged
6 changes: 6 additions & 0 deletions CLAUDE.md
@@ -41,6 +41,12 @@ Bach is a CLI-first personal agentic development session launcher that turns dai
- `uv run bach daemon install` - Install `com.bach.daemon` as a macOS LaunchAgent (RunAtLoad + KeepAlive); prints `bach daemon run` fallback if launchctl fails
- `uv run bach daemon uninstall` - Unload and remove the LaunchAgent plist; idempotent

### PRD#17 — Scheduled Supervision Sweeps
- `uv run bach supervise [run] [--task <id>] [--force]` - One-shot sweep: enumerate live task-linked sessions → judge each once → apply verdicts → print summary table; `--task` restricts to one session; `--force` bypasses the active-hours window
- `uv run bach supervise install` - Install `com.bach.supervise` as a macOS LaunchAgent (calendar-scheduled, NOT keep-alive); fires N times/day inside the active-hours window (default N=5, hours [8,11,14,16,19]); prints `bach supervise` fallback if launchctl fails; re-run after changing sweep knobs
- `uv run bach supervise uninstall` - Unload and remove the sweep agent plist; idempotent; does not touch `com.bach.daemon`
- Sweep knobs (via `bach config set`): `supervise-sweeps-per-day` (default 5), `supervise-window-start-hour` (default 8), `supervise-window-end-hour` (default 22)

## File References
- `docs/internal/DISCOVERY.md` - real goal and scope
- `docs/internal/PRD.md` - requirements
245 changes: 245 additions & 0 deletions docs/adr/018-scheduled-supervision-sweeps.md
@@ -0,0 +1,245 @@
# ADR-018 — Scheduled Supervision Sweeps (PRD #17)

**Status:** Accepted
**Date:** 2026-06-13
**Extends:** ADR-015 (session observability, judge_session + verdict apply),
ADR-017 (persistent watcher daemon, always-on alternative).

## Context

ADR-017 shipped a persistent daemon (`com.bach.daemon`) that keeps one watcher
alive per live session. The watcher calls the judge on every `judge_interval_seconds`
tick (~300 s by default), producing roughly 180 codex calls per session per day
for a typical 15-hour active window — continuous cost paid regardless of whether
the user is actually monitoring.

A lower-cost alternative emerged from two observations:

1. Most users check in a few times per day rather than monitoring continuously.
Real-time observation is useful but not always worth its price.
2. A single sweep pass (enumerate sessions → judge each once → apply) costs one
LLM call per session. At 5 sweeps/day that is ~36× cheaper than the daemon's
continuous tick pace.

PRD #17 asks: **can we make supervision opt-in at low flat cost, with a manual
escape hatch, while leaving the daemon available for users who genuinely need
near-real-time verdicts?**

Two design questions needed explicit decisions:

1. **New observer code path or shared path?** The sweep needs to read transcripts,
call the judge, and apply verdicts exactly as the watcher does. Forking a
second code path would create two places to keep in sync.

2. **How to guard against waking up at 03:00?** A naive launchd `StartInterval`
agent fires on a fixed period regardless of time of day. The sweep should be
a daytime-hours operation.

## Decision

### 1. One-shot sweep model

`bach supervise` (bare) or `bach supervise run` runs **one sweep pass** and
exits. The sweep:

1. Enumerates live, task-linked sessions via the same path as the daemon
(`scan_registry_artifact_paths` → `discover_sessions` → `list_sessions` →
`list_supervisable_artifacts`).
2. For each artifact, reads the transcript tail and calls the judge **once**
(one codex call per session — the cost-containment invariant).
3. Applies the verdict via the shared `verdict_apply.apply_verdict` path.
4. Prints a Rich per-session summary table.
5. Exits.

There is no persistent process. The sweep is a dumb batch job: fire, iterate,
exit. All state after the sweep lives in the task artifacts.
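
The five steps above can be sketched as a single function. This is a minimal illustration, not the code in `supervise_service.py`: `Artifact`, `SweepRow`, `sweep_once`, and the injected `judge` and `apply` callables are hypothetical stand-ins for `judge_service.judge_session` and `verdict_apply.apply_verdict`, whose real signatures are not shown in this ADR.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Artifact:
    """Hypothetical stand-in for a supervisable task artifact."""
    task_id: str
    transcript_tail: str


@dataclass
class SweepRow:
    """One row of the per-session summary (hypothetical shape)."""
    task_id: str
    verdict: str


def sweep_once(
    artifacts: Iterable[Artifact],
    judge: Callable[[str], str],
    apply: Callable[[Artifact, str], None],
) -> list[SweepRow]:
    """Run one pass: judge each artifact exactly once, apply, collect rows."""
    rows = []
    for artifact in artifacts:
        verdict = judge(artifact.transcript_tail)  # one judge call per session
        apply(artifact, verdict)
        rows.append(SweepRow(artifact.task_id, verdict))
    return rows  # the caller prints the table and exits; nothing stays resident
```

Passing the judge and apply functions in as arguments is only a convenience for the sketch; it makes the "one call per session" invariant easy to check in isolation.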

### 2. Shared judge + apply path — no new observer code

The sweep reuses:

- `judge_service.judge_session` — the same judge callable the watcher uses.
- `verdict_apply.apply_verdict` — the shared apply function (extracted in this
PR from `session_watcher.py` into its own module so both callers can import
it). This means the observer authority matrix (`source="observer"`), the
`observer_moves`/confidence-threshold guard, and the fallback log-append
on write failure all apply identically to sweeps and the continuous watcher.
- `list_supervisable_artifacts` — the same supervisable-artifacts filter.

Rationale: one code path means one place to audit, one place to update when
the judge API changes, and guaranteed behavioral parity between the two
supervision modes.

### 3. Active-hours window guard

`run_sweep` checks `now.hour` against
`[supervise_window_start_hour, supervise_window_end_hour)` (defaults 08–22,
local time) before doing any work. If the current hour is outside the window
and `force=False`, the sweep returns a `SweepReport(skipped=True)` — no judge
calls, no disk writes, no output beyond a muted log line.

`--force` bypasses the window entirely for manual at-any-hour use (e.g.
debugging, late-night emergency check).

The window guard exists because launchd calendar agents can fire at unexpected
times: intervals missed while the machine was asleep are coalesced into a single
run on wake, so a wake at 02:00 can trigger a sweep. The window makes
late-night sweeps a logged no-op rather than a surprise LLM charge.
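
A minimal sketch of the guard, assuming the half-open window described above. `in_active_window` and `run_sweep_guarded` are hypothetical names, and this `SweepReport` is a one-field stand-in for the real class. The sketch also assumes `start_hour <= end_hour`; a window that wraps past midnight is not handled.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class SweepReport:
    """Simplified stand-in; the real SweepReport likely carries more fields."""
    skipped: bool = False


def in_active_window(hour: int, start_hour: int = 8, end_hour: int = 22) -> bool:
    """True when hour falls in the half-open range [start_hour, end_hour)."""
    return start_hour <= hour < end_hour


def run_sweep_guarded(now: datetime, force: bool = False) -> SweepReport:
    """Return a skipped report outside the window unless force is set."""
    if not force and not in_active_window(now.hour):
        return SweepReport(skipped=True)  # no judge calls, no disk writes
    return SweepReport(skipped=False)  # the real sweep would run here
```

With the defaults, 08:00 is inside the window and 22:00 is outside it, which matches the inclusive start and exclusive end listed for the config knobs.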

### 4. Per-session error isolation

Each artifact is processed inside a `try/except` in `_sweep_one`. An
exception in one session is recorded as a `SweepResult` with `error` set and
the sweep continues to the next session. This mirrors the daemon's per-watcher
resilience: one bad artifact must not abort the entire batch.
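
The isolation pattern can be illustrated as follows. This is a sketch under assumptions, not the actual `_sweep_one`: the `SweepResult` fields other than `error`, the `sweep_one`/`sweep_all` names, and the `judge_and_apply` callable are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class SweepResult:
    """Simplified stand-in: either a verdict or an error message."""
    task_id: str
    verdict: Optional[str] = None
    error: Optional[str] = None


def sweep_one(task_id: str, judge_and_apply: Callable[[str], str]) -> SweepResult:
    """Process one session; convert any exception into an error result."""
    try:
        return SweepResult(task_id, verdict=judge_and_apply(task_id))
    except Exception as exc:  # broad on purpose: one bad artifact must not abort the batch
        return SweepResult(task_id, error=f"{type(exc).__name__}: {exc}")


def sweep_all(task_ids: list[str], judge_and_apply: Callable[[str], str]) -> list[SweepResult]:
    """Every session gets a result, even when earlier ones failed."""
    return [sweep_one(task_id, judge_and_apply) for task_id in task_ids]
```

Because the exception is caught inside the per-session function, the batch loop itself stays free of error handling and always returns one result per session.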

### 5. macOS launchd calendar agent with a distinct label

`bach supervise install` writes a **calendar agent** plist at
`~/Library/LaunchAgents/com.bach.supervise.plist`. Key differences from the
daemon plist (`com.bach.daemon`):

| Property | Daemon (`com.bach.daemon`) | Sweep agent (`com.bach.supervise`) |
|---|---|---|
| `RunAtLoad` | `true` | absent |
| `KeepAlive` | `true` | absent |
| `StartCalendarInterval` | absent | N `{Hour, Minute}` entries |

`StartCalendarInterval` with an array of entries fires at each listed hour.
No `KeepAlive` or `RunAtLoad` — the sweep is not a persistent service; it is
a cron-like trigger for a one-shot command.
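
As an illustration of the plist shape, the standard-library `plistlib` module can render such a calendar agent. This is not the project's `render_sweep_agent`; the function name `render_calendar_agent` and the `ProgramArguments` value are assumptions made for the sketch.

```python
import plistlib


def render_calendar_agent(label: str, program_args: list[str], hours: list[int]) -> bytes:
    """Build a LaunchAgent plist that fires at minute 0 of each listed hour."""
    plist = {
        "Label": label,
        "ProgramArguments": program_args,
        # An array of dicts: launchd fires the job once per matching entry.
        "StartCalendarInterval": [{"Hour": h, "Minute": 0} for h in hours],
        # RunAtLoad and KeepAlive are deliberately omitted.
    }
    return plistlib.dumps(plist)


xml = render_calendar_agent(
    "com.bach.supervise",
    ["bach", "supervise", "run"],  # assumed invocation; the real plist may use an absolute path
    [8, 11, 14, 16, 19],
)
```

Leaving `RunAtLoad` and `KeepAlive` out of the dictionary, rather than setting them to `false`, mirrors the "absent" entries in the table above.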

The **distinct label** (`com.bach.supervise` vs `com.bach.daemon`) means:

- The two agents are fully independent on disk and in launchd. Installing or
uninstalling one does not affect the other.
- A user can run `bach daemon install` and `bach supervise install`
simultaneously (both register successfully). Running both is allowed but
redundant: the sweep fires at calendar times while the daemon is already
judging continuously. The daemon's continuous verdicts are not harmed by the
periodic sweep calling the judge again; the apply path is idempotent.

On `launchctl` failure (non-macOS, SIP restrictions, sandboxed environment),
the plist is still written and the fallback command `bach supervise` is printed.
This mirrors the daemon install pattern from ADR-017 §5.

### 6. Sweep-hour spread formula

The `_compute_sweep_hours` function computes N evenly spaced integer hours,
where `span` is the window length in hours (`window_end - window_start`, 14 by
default):

```
entry[i].Hour = round(window_start + i * span / N)
```

`round()` (not `floor()`) is used so each entry lands on the integer hour
nearest its ideal position rather than always being pulled earlier. For N=5
over the default 08–22 window:

```
i=0: round(8 + 0 * 14/5) = round(8.0) = 8
i=1: round(8 + 1 * 14/5) = round(10.8) = 11
i=2: round(8 + 2 * 14/5) = round(13.6) = 14
i=3: round(8 + 3 * 14/5) = round(16.4) = 16
i=4: round(8 + 4 * 14/5) = round(19.2) = 19
→ [8, 11, 14, 16, 19]
```

`floor()` would give `[8, 10, 13, 16, 19]`: the first and last hours are the
same, but every entry after the first is pulled earlier, by up to 0.8 h
(10.8 → 10). `round()` keeps each entry within 0.4 h of its ideal position.
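
The formula translates directly into Python. This is a sketch, not the project's `_compute_sweep_hours`; the function name and parameter names are assumptions. Note that Python's built-in `round()` sends exact `.5` ties to the nearest even integer, which does not affect the default case because none of its ideal positions is a tie.

```python
def compute_sweep_hours(window_start: int, window_end: int, n: int) -> list[int]:
    """Return n integer hours spread evenly across [window_start, window_end)."""
    span = window_end - window_start
    return [round(window_start + i * span / n) for i in range(n)]


hours = compute_sweep_hours(8, 22, 5)
gaps = [b - a for a, b in zip(hours, hours[1:])]
print(hours)  # [8, 11, 14, 16, 19]
print(gaps)   # [3, 3, 2, 3]
```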

### 7. Cost comparison

| Mode | Judge calls per session per 15-hour day |
|---|---|
| Daemon (continuous, 300 s tick) | ~180 |
| Sweeps at 5/day | 5 |
| **Ratio** | **~36× cheaper** |
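
The figures in the table follow from simple arithmetic, reproduced here as a small check. The function name is hypothetical; the inputs are the 15-hour day and 300 s tick stated above.

```python
def daemon_calls_per_day(active_hours: float, tick_seconds: float) -> float:
    """Judge calls per session when the watcher judges once per tick."""
    return active_hours * 3600 / tick_seconds


daemon = daemon_calls_per_day(15, 300)
sweeps = 5
print(daemon)           # 180.0
print(daemon / sweeps)  # 36.0
```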

## Known limitations / accepted trade-offs

*These are deliberate design choices, not bugs.*

**(a) Staleness between sweeps.**
A session's status is observed at most N times per day. A blocked session
might stay undetected for hours between sweep fires. Users who need
near-real-time observation should use the daemon instead.

**(b) Window is local-hour integer granularity.**
`supervise_window_start_hour` and `supervise_window_end_hour` are integers
(0–24). Sub-hour precision (e.g. "start at 08:30") is not supported. The
cost of adding minute-precision for a scheduling knob that changes at most
once a year is not worth the complexity.

**(c) Sweep-hour spread is rounded integer hours.**
The `round()` formula produces a near-uniform distribution but not a perfectly
uniform one: rounding makes the gaps uneven (e.g. 14→16 is a 2-hour gap while
11→14 is 3 hours for N=5). This is accepted over floating-point scheduling,
which launchd `StartCalendarInterval` does not support anyway.

**(d) Running both daemon and sweep agent is redundant.**
Bach does not prevent this configuration. The second call to `apply_verdict`
will find the verdict already applied and either do nothing (same status) or
raise an `InvalidStatusError` that is caught and logged. No data is corrupted,
but one judge call per session per sweep fire is wasted. Users who install both
should be aware they gain no benefit from the sweep agent.

## Consequences

- **`bach supervise [run]` is a new CLI surface.** Implemented in
`cli/supervise_cmd.py`. Options: `--task <id>`, `--force`.

- **`bach supervise install` / `uninstall` are new CLI commands.** They manage
the `com.bach.supervise` LaunchAgent independently of `bach daemon`.

- **Three new config knobs** (all via `bach config set`):

| Key | Default | Effect |
|-----|---------|--------|
| `supervise-window-start-hour` | `8` | Active-hours window start (inclusive) |
| `supervise-window-end-hour` | `22` | Active-hours window end (exclusive) |
| `supervise-sweeps-per-day` | `5` | Calendar entries per day |

- **`verdict_apply.py` is a new module** extracted from `session_watcher.py` so
both the watcher and the sweep can import the canonical apply path. Existing
behavior of the continuous watcher is unchanged.

- **The daemon (`com.bach.daemon`) is unchanged.** No existing behavior was
modified. Scheduled sweeps are an additive, opt-in alternative.

## Alternatives rejected

- **Reuse the daemon with a low reconcile interval.** Lowering
`daemon_reconcile_seconds` makes supervision more frequent but keeps the
per-tick cost. You cannot make the daemon "cheap per fire" — it is always
on. A purpose-built one-shot sweep is the correct tool for calendar-triggered
batch supervision.

- **Cron (`crontab -e`) instead of launchd.** cron is available on macOS
but is deprecated for user jobs and does not load automatically at login
without extra `launchctl` configuration. launchd is the correct macOS tool
and is already used for the daemon. Consistency over novelty.

- **Per-sweep-hour minute-level control.** Supporting custom minute offsets
(e.g. `:30` instead of `:00`) would complicate the spread formula and the
config knob surface with negligible practical benefit. Top-of-hour scheduling
is standard and easy to reason about.

- **Fork a new apply path for the sweep.** The sweep could have its own
apply logic tuned for batch use. Rejected — forking creates two sources of
truth for observer authority semantics. A shared path is the correct
architectural boundary.

## References

- ADR-015: `adr/015-session-observability-codex-parity.md`
- ADR-016: `adr/016-loop-control.md`
- ADR-017: `adr/017-persistent-watcher-daemon.md`
- Implementation: `src/bach/services/supervise_service.py` (run_sweep, SweepReport)
- Implementation: `src/bach/services/verdict_apply.py` (shared apply path)
- Implementation: `src/bach/cli/supervise_cmd.py` (sweep + install/uninstall)
- Implementation: `src/bach/services/launchd_service.py`
(`render_sweep_agent`, `install_sweep_agent`, `uninstall_sweep_agent`,
`_compute_sweep_hours`)
- Config: `src/bach/config/settings.py`
(`supervise_sweeps_per_day`, `supervise_window_start_hour`,
`supervise_window_end_hour`)
- How-to: `docs/how-to/run-supervision-sweeps.md`