docs(maintenance): design spec for workstation maintenance system #4

Open
X-McKay wants to merge 1 commit into main from feature/maintenance-system

@X-McKay (Owner) commented May 9, 2026

Summary

  • Adds the design spec for a layered, AI-provider-agnostic workstation maintenance system
  • Substrate (bash scanners + skills + journal) lives in this repo so it survives any agent change; per-provider hooks/commands are thin shims
  • Report-only by default — never auto-deletes; on-demand commands offer act / defer / ignore per finding
  • Phased delivery (substrate → hooks → skills/commands → remaining scanners → docs) so each phase ships value standalone

Scope of this PR

Spec only. No code, scripts, hooks, systemd units, or docs/MAINTENANCE.md yet — those land in follow-up PRs once the spec is approved. The implementation plan will be drafted (via the writing-plans skill) only after the spec is reviewed here.

What's in the spec

  • Three-tier architecture: provider-neutral bash substrate, shared .agents/skills/, per-provider shims under .claude/commands/
  • Six scanners: setup-doctor, disk, docker-k8s, system-health, dev-cruft, config-drift
  • Systemd user timer (weekly, with RandomizedDelaySec and Persistent=true)
  • Two skills (maintenance-audit, maintenance-act) and three commands (/maintenance, /clean-disk, /clean-dev)
  • Two thin Claude hooks (SessionStart nudge, SessionEnd cwd logger)
  • State at ~/.local/state/maintenance/ (XDG-compliant, provider-neutral)
  • Risks and open questions called out (kubectl context timeouts, setup doctor hang risk, SessionStart latency budget, deferred-list growth)
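The state location in the list above follows the XDG Base Directory convention, where `XDG_STATE_HOME` defaults to `$HOME/.local/state`. A minimal sketch of how a scanner might resolve it; the helper name `state_dir` is hypothetical and does not come from the spec:

```shell
# Hypothetical helper: resolve the maintenance state directory.
# Honors XDG_STATE_HOME when set, otherwise falls back to ~/.local/state.
state_dir() {
  printf '%s\n' "${XDG_STATE_HOME:-$HOME/.local/state}/maintenance"
}
```

A scanner could then call `mkdir -p "$(state_dir)"` before writing its output, which keeps the path provider-neutral and respects a user override.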

Review focus

  • Does the scanner list cover the right gaps the existing setup CLI doesn't?
  • Are the data formats (markdown digest + JSON for skills + plain-text deferred.md) the right balance of human-readable vs. agent-parseable?
  • Phased Delivery section — does the sequencing make sense, or should phases be re-ordered?

Test plan

  • Read the spec end-to-end
  • Confirm the directory layout (versioned in setup repo vs. machine-local under ~/.local/state) looks right
  • Sanity-check the scanner descriptions against actual workstation state
  • Approve or leave inline comments before the implementation plan is written

🤖 Generated with Claude Code

Add the design spec for a layered, AI-provider-agnostic workstation
maintenance system: scheduled systemd-timer scans, on-demand slash
commands, SessionStart nudges, and a journal under
~/.local/state/maintenance. Substrate (bash scanners + skills) lives
in this repo so it survives any agent change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@gemini-code-assist (bot) left a comment

Code Review

This pull request introduces a design specification for a Workstation Maintenance System, a provider-agnostic framework for scanning and managing workstation health via bash scripts, systemd timers, and AI agent skills. The feedback focuses on technical refinements to the proposed implementation: improving the reliability of default branch detection in the dev-cruft scanner, correcting the JSON structure for Claude hook configurations, and suggesting dynamic path resolution in systemd unit files to avoid hardcoded repository locations.

| `disk.sh` | `~/Downloads` size + week-over-week delta; files >100MB under `$HOME` (excluding `~/.cache`, `~/.local/share`, any `node_modules`, any `.git/objects`); `node_modules` / `target/` / `.venv` directories untouched 90+ days; `/tmp` files older than 30 days. |
| `docker-k8s.sh` | `docker images -f dangling=true`, `docker volume ls -f dangling=true`, total reclaimable space from `docker system df`. For each context in `~/.kube/config`, with `kubectl --request-timeout=5s`: evicted pods, terminating pods stuck >1h. Contexts that time out are reported as a single `info` item rather than failing the scan. |
| `system-health.sh` | `systemctl --user --failed` and `systemctl --failed`; `journalctl -p err --since "7 days ago"` count per unit; `apt list --upgradable` filtered to security updates; new listening ports vs. last week's snapshot (snapshot kept under `~/.local/state/maintenance/ports-prev.txt`). |
| `dev-cruft.sh` | `~/.claude/sessions/` JSONLs >30 days; `~/.claude/plans/` and `~/.claude/todos/` >60 days; isolation-mode worktrees untouched 30+ days that are not the cwd of any session in the last 30 days of `sessions.log`; for each tracked git repo under `~/git/`, local branches fully merged into the repo's default branch (detected via `git rev-parse --abbrev-ref origin/HEAD`); cwds appearing in `sessions.log` (last 30 days) with lingering untracked files per `git status --porcelain`. |

medium

Using origin/HEAD to detect the default branch is unreliable as it is not always set in local clones (it requires git remote set-head origin -a). A more robust approach would be to use git symbolic-ref --short refs/remotes/origin/HEAD or a fallback mechanism to identify the primary branch (e.g., checking for main then master).
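The fallback idea could be sketched as follows. Note that `git symbolic-ref` on `refs/remotes/origin/HEAD` also depends on that ref being set, so the `main`/`master` probe is what covers clones where it is missing. The function name `default_branch` is hypothetical, and the probe order is an assumption taken from the comment above:

```shell
# Hypothetical helper: print the default branch name for the repo at $1.
# 1. Use origin/HEAD when the local clone has it.
# 2. Otherwise probe for origin/main, then origin/master.
# Returns non-zero when none of these exist.
default_branch() {
  local repo="$1" ref name
  if ref=$(git -C "$repo" symbolic-ref --quiet --short refs/remotes/origin/HEAD 2>/dev/null); then
    # Output looks like "origin/main"; strip the remote prefix.
    printf '%s\n' "${ref#origin/}"
    return 0
  fi
  for name in main master; do
    if git -C "$repo" show-ref --verify --quiet "refs/remotes/origin/$name"; then
      printf '%s\n' "$name"
      return 0
    fi
  done
  return 1
}
```

A caller such as `dev-cruft.sh` might skip a repo with an `info` item when the function returns non-zero, rather than guessing a branch name.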

Comment on lines +168 to +169
"SessionStart": [{ "hooks": [{ "type": "command", "command": "bash ~/.claude/hooks/session-start-nudge.sh", "timeout": 2 }] }],
"SessionEnd": [{ "hooks": [{ "type": "command", "command": "bash ~/.claude/hooks/session-end-cwd-log.sh", "timeout": 2 }] }]

medium

The JSON structure for the Claude hooks contains an unnecessary nested hooks key. Claude Code expects an array of hook objects directly under the event name (e.g., SessionStart).

Suggested change
- "SessionStart": [{ "hooks": [{ "type": "command", "command": "bash ~/.claude/hooks/session-start-nudge.sh", "timeout": 2 }] }],
- "SessionEnd": [{ "hooks": [{ "type": "command", "command": "bash ~/.claude/hooks/session-end-cwd-log.sh", "timeout": 2 }] }]
+ "SessionStart": [{ "type": "command", "command": "bash ~/.claude/hooks/session-start-nudge.sh", "timeout": 2 }],
+ "SessionEnd": [{ "type": "command", "command": "bash ~/.claude/hooks/session-end-cwd-log.sh", "timeout": 2 }]


[Service]
Type=oneshot
ExecStart=%h/git/setup/bootstrap/scripts/maintenance/scan.sh

medium

The ExecStart path is hardcoded to %h/git/setup/.... This assumes the repository is always located at ~/git/setup. It would be more robust if the install.sh script dynamically updated this path in the unit file during installation to match the actual location of the repository.
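One way `install.sh` might do this is to render the `[Service]` section from the repository's actual location at install time. This is only a sketch: the function name `render_service_section` is hypothetical, and it assumes the repository path contains no spaces or characters that systemd would require escaping in `ExecStart`:

```shell
# Hypothetical helper: print the [Service] section with ExecStart pointing
# at the scan script inside the repository rooted at $1.
render_service_section() {
  local repo_root="$1"
  cat <<EOF
[Service]
Type=oneshot
ExecStart=${repo_root}/bootstrap/scripts/maintenance/scan.sh
EOF
}
```

The installer could pass the output of `git rev-parse --show-toplevel` as the argument and write the result into the user unit directory (`~/.config/systemd/user/`), followed by `systemctl --user daemon-reload`.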

@chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4e4da2f8cf

| Scanner | What it checks |
|---|---|
| `setup-doctor.sh` | Shells out to `setup doctor` and `setup drift --dry-run`. Captures their summary lines as `info` or `warn` items. |

P2: Use a supported drift command

When Phase 1 implements setup-doctor.sh from this spec, setup drift --dry-run will fail because the current DriftArgs only defines --json, --dotfiles, --profiles, --profile, and the diff/sync/adopt subcommands. Since the scanner contract treats nonzero exits as scanner failures, every scheduled scan would report the setup scanner as broken instead of surfacing config drift; use the read-only default setup drift or setup drift --json here.
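Separately from choosing a supported flag, a scanner could convert a failing or hanging subcommand into a `warn` item instead of a scanner failure. The sketch below is an assumption, not part of the spec: the helper name `run_check` and the tab-separated output format are hypothetical, and it relies on coreutils `timeout`, which exits with status 124 when the time limit is hit:

```shell
# Hypothetical helper: run a command under a time limit and emit one item.
# Usage: run_check LABEL LIMIT COMMAND [ARGS...]
# Emits "info<TAB>LABEL<TAB>ok" on success, or a "warn" item on failure/timeout.
run_check() {
  local label="$1" limit="$2" status=0
  shift 2
  timeout "$limit" "$@" >/dev/null 2>&1 || status=$?
  case "$status" in
    0)   printf 'info\t%s\tok\n' "$label" ;;
    124) printf 'warn\t%s\ttimed out after %s\n' "$label" "$limit" ;;
    *)   printf 'warn\t%s\texit %s\n' "$label" "$status" ;;
  esac
}
```

For example, `run_check setup-drift 30s setup drift --json` would report a rejected flag or a hang as a single `warn` line. Whether that is preferable to failing the scanner outright is a design choice for the spec to settle.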
