Stop asking AI agents to remember process. Install it.
do-it turns AI coding discipline into an installable workflow for Codex and
Claude Code. It routes work by risk, delegates sub-agents with explicit
contracts, and requires fresh evidence before an agent can claim done.
This is the workflow I use every day for real project work. If it fits your style, use it. If something feels wrong, open an issue, send a PR, or fork it and reshape it for your own agent setup.
Every prompt is classified as Light, Standard, or Heavy before the agent
starts acting.
Light: small local edits, docs tweaks, one-off checks.Standard: normal non-trivial engineering work.Heavy: releases, architecture changes, cross-module policy, public workflow changes, or multi-agent delivery.
The point is not more ceremony. The point is matching the process to the risk: small work should stay small, and risky work should not skip planning, review, or proof.
Sub-agents are useful only when the parent gives them a real boundary. do-it
treats delegation as a contract, not a scheduling problem.
Every delegated slice pins down:
| Field | What it pins down |
|---|---|
scope |
The single bounded outcome the sub-agent owns. |
write ownership |
Which paths the sub-agent is allowed to edit. |
forbidden paths |
Which paths the sub-agent must not touch, even if it would help. |
must-verify facts |
Concrete claims the sub-agent must confirm before acting. |
stop condition |
The exact event that ends the sub-agent's run. |
return schema |
The structured shape of its final report. |
No external orchestrator is required. The parent agent stays responsible, and the contract is plain text that any host with skills and sub-agents can use.
do-it treats "done" as an evidence claim. If files changed, the agent needs
fresh verification output before it can claim the task is complete.
That keeps the closeout tied to the repository's actual state, not the agent's confidence.
Recommended when you want the full automatic workflow in Codex: skills,
agents, global hooks, and doctor managed from one explicit command.
Install the CLI globally from this GitHub repository, then run setup:
npm install -g https://github.com/tdwhere123/do-it/archive/refs/heads/main.tar.gz
do-it setupThis uses npm as the terminal installer, but the package is downloaded from
GitHub's source tarball rather than the npm registry.
do-it setup runs do-it install followed by do-it doctor.
do-it installcopies managed skills, agents, hook scripts, and Codex roothooks.jsoninto the target host.do-it doctorchecks that installed files and install state matchmanifest.json.- Codex installs to
CODEX_HOME, which defaults to~/.codex. - Codex global hooks use
UserPromptSubmit,PreToolUse,PostToolUse, andStopto route, pressure-test, refresh code maps, and require verification.
Use a temporary Codex home when testing an install:
CODEX_HOME=/tmp/do-it-codex-test do-it setupThe installer will not silently replace user-owned skill or agent files. If it
finds a target that is not already marked as do-it-managed, it stops. Set
DO_IT_FORCE=1 only when you intentionally want the package to replace those
targets.
do-it also publishes a Codex plugin marketplace shape for first-class
discovery of its skills and agents:
codex plugin marketplace add tdwhere123/do-itFor a local checkout smoke test, use the checkout path as the marketplace
source with a temporary CODEX_HOME:
CODEX_HOME=/tmp/do-it-plugin-test codex plugin marketplace add /path/to/do-itThe Codex plugin bundle lives at plugins/do-it/ and is generated from
manifest.json. It includes 20 skills and 23 agents.
For v1, pair plugin installation with do-it setup when you need enforced
automatic hooks. Local codex features list currently reports
codex_hooks=true, plugins=true, and plugin_hooks=false, so plugin-local
hooks are not the enforcement substrate.
do-it also ships as a Claude Code plugin. Install via the plugin marketplace:
/plugin marketplace add tdwhere123/do-it
/plugin install do-it
Or use the CLI target when not using marketplace:
do-it install --target=claude
do-it doctor --target=claudeThe Claude target installs to ~/.claude/ by default; override with
CLAUDE_PLUGIN_ROOT_OVERRIDE. The --with-optional flag installs any manifest
skills marked optional (none are marked optional in 0.11.0).
- do-it-native skills for routing, grill, brainstorm (multi-lens divergence), handbook (project doc skeleton), context, planning, slicing, interface / architecture / domain drills, sub-agent orchestration, TDD, debugging, review, fix loops, verification, worktree isolation, branch closeout, visual planning, and skill authoring.
- Portable Codex agent definitions for code mapping, plan challenge,
correctness review, architecture review, red-team review, spec compliance,
domain language, install/release review, documentation, testing,
language-specific drills, and brainstorm lenses: the required
product-strategist/architecture-strategistcores for product boundary, core goal, foundation, and extension shape, plus optional product, UX, end-user, ops, security, domain, and plan supplements. - Codex global hook assets and root
hooks.jsoninstalled bydo-it setup. Hooks include a PostToolUsecode-map-refreshthat marks.do-it/handbook/code-map.mdstale on barrel / migration / route / workspace-manifest edits. - Claude Code plugin assets, hooks, commands, and generated sub-agent definitions.
- Copy-based installer and
doctorcommands that validate managed host files againstmanifest.json. - Root
index.json, a generated machine-readable inventory of the do-it skills and agents for external discovery, marketplace tooling, and coverage checks. - A release surface that works from a local checkout, a packed tarball, a GitHub repository, a GitHub-backed terminal install, or a Codex plugin marketplace.
flowchart TD
P[UserPromptSubmit] --> R[do-it-router<br/>classify Light / Standard / Heavy]
R --> M{multi-lens<br/>angle needed?}
M -- yes --> BR[do-it-brainstorm<br/>product + architecture cores<br/>plus task-fit supplements]
M -- no --> G
BR --> G{premise stable?}
G -- no --> GR[do-it-grill<br/>truth-check; converges<br/>Must Resolve items]
G -- yes --> T{tier}
GR --> T
T -- Light --> L[execute]
T -- Standard --> S[inline modification map -> execute<br/>review only when risky]
T -- Heavy --> H[plan -> slicing -> drills -><br/>subagent orchestration -> review-loop -> fix-loop]
L --> PG{Edit / Write?}
S --> PG
H --> PG
PG -- Heavy or explicit --> PGY[PreToolUse: durable plan gate]
PG -- otherwise --> V[Stop]
PGY --> V
V --> VG[verification-gate:<br/>fresh evidence required]
VG --> D[done]
In practice:
do-it-routerclassifies the task and records routing state; routine Standard/Heavy turns do not emit a visible router banner.do-it-brainstormis used for product, architecture, workflow, and release-adjacent work when the route needs divergent lenses. It clarifies the requirement shape, product boundary, core goal, architecture foundation, extension modules, and option tradeoffs, then adds only the task-fit supplements needed for UX, end-user, ops, security, domain language, or plan-risk questions.do-it-grillfires when the premise needs pressure-testing. When a brainstorm artifact exists, grill enters convergence mode and resolvesMust Resolve In Grillinstead of restarting divergence.Light,Standard, andHeavyuse different flows, not the same flow at different intensities.- Heavy or explicitly durable work can be blocked at the write boundary until a plan exists.
- The stop gate checks for fresh evidence before completion claims.
Full routing policy: docs/routing-matrix.md.
- No slash command vocabulary for the automatic path. Codex global setup and the Claude Code plugin install hooks at the host lifecycle points where they matter.
- No external orchestration runtime. Sub-agent control lives in
do-it-subagent-orchestration, which is just a skill. - One-turn bypass. Include
yolo,直接做,skip grill, or/do-it-skipin the prompt to disable hooks for that turn only.
For a packed local release artifact:
npm pack
npm install -g ./tdwhere-do-it-0.10.0.tgz
do-it setupFrom a checkout, use the package entrypoint:
npm exec --package . -- do-it setup
npm exec --package . -- do-it install
npm exec --package . -- do-it doctorEquivalent package scripts are also available:
npm run setup
npm run install:do-it
npm run doctor
npm run do-it -- doctorThe shell wrappers remain for direct installer testing and delegate to the same managed install behavior:
./install/install.sh
./install/doctor.shThis package does not use npm lifecycle scripts to modify ~/.codex.
Installation into Codex happens only when the operator runs do-it setup or
do-it install.
Before sending hook changes for review, run npm run lint (shellcheck via
scripts/lint-hooks.sh). npm test runs agent schema / generated-inventory
validation, hook lint, and the hook regression suite in scripts/test-hooks.sh.
CI runs the Node matrix, generated-agent build check, Codex and Claude install
smoke tests, and package dry run on push and PR.
agents/ Portable Codex agent TOML definitions
.agents/plugins/ Codex marketplace metadata
bin/ The global do-it CLI entrypoint
commands/ Claude Code command surface
dist/claude/ Generated Claude Code agent definitions
docs/ Routing, maintenance, origin map, and release notes
hooks/ Host hook scripts
index.json Generated skill/agent discovery inventory
install/ Installer, doctor, and shell wrapper entrypoints
plugins/do-it/ Generated Codex plugin bundle
skills/custom/ Local skill examples that are not installed by default
skills/do-it/ Installed do-it-native skill directories
manifest.json Install inventory and target paths
package.json npm package metadata and CLI scripts
The private .do-it/ directory is for local plans, notes, and scratch
artifacts. It is ignored by Git and is not installed.
0.10.0 is a workflow reliability release. It keeps the skill surface stable
while tightening the actual work loop: durable plans can carry an Evidence
Ledger, readiness claims name their truth plane, review verifies the full proof
path, subagent lanes have parent-owned status, and closeout keeps source,
package, temp install, live install, and host behavior distinct.
Agent templates are now model-agnostic. Source agents/*.toml files no longer
pin concrete models or reasoning-effort fields, and Claude generated agents
inherit the running host model by default.
0.9.0 is a reliability and discipline release. Install-state migration logic
moves into install/migrate.mjs with full test coverage (npm run test-install); an unknown migration action now fails loud instead of leaving a
half-migrated state; do-it install and doctor report the install-state
version of every target so Codex/Claude version drift is visible. The
verification-gate Stop hook scopes its edit/evidence/review-loop detection to
the current turn — an earlier turn's trace can no longer pass a later
unverified turn — and its block reasons are rewritten as short
single-instruction sentences. do_it_emit_block / do_it_emit_context keep
emitting valid JSON without jq. Stale session directories are pruned after 7
days. do-it-router gains an Integrity section — a failure is a clue to
trace, not a symptom to hide — referenced by do-it-debugging,
do-it-fix-loop, do-it-verification-gate, and the subagent dispatch
contract. The CI test job now also runs on macOS. If a machine has a
cached 0.8.0 plugin, reinstall or refresh so the host loads the new hook and
skill files.
0.8.0 finishes activating the five dim_* orthogonal dimensions introduced
in 0.7.0 — each dimension now has at least one named consumer (grill-prompt,
verification-gate, or a mandatory-trigger sentence inside the matching skill).
A new advisory PostToolUse hook anti-patterns-lint flags large bash
case lists, newly-exported JS/TS symbols with no other-file consumer, and
≥5-line code blocks duplicated from a same-directory neighbour — never blocks,
single system-reminder per file. do-it-fix-loop defaults to "collect all
findings → root-cause → batch or pointwise" instead of see-one-fix-one;
do-it-review-loop correspondingly emits findings as a complete batch.
.do-it/runtime/pointer records the active task slug so a fresh turn can
recover state. do-it-comments-discipline SKILL is trimmed from 373 → 119
lines. 23 sub-agent TOMLs drop their duplicated Common protocol: block in
favour of a single reference to do-it-subagent-orchestration. If a machine
has a cached 0.7.3 plugin, reinstall or refresh so the host loads the new
hook and skill files.
0.7.3 makes the router state-only for routine Standard/Heavy turns. The
router records tier and dimensions for downstream hooks, while visible
pressure-test guidance remains in grill-prompt when a real grill trigger
fires. It also tightens workflow accountability in review, closeout, comments,
and brainstorm/grill guidance. If a machine has a cached 0.7.2 plugin,
reinstall or refresh the plugin so the host loads the new hook and skill files.
0.7.2 fixes Claude Code plugin hook compatibility with macOS's bundled Bash
3.2. If a machine has a cached 0.7.1 plugin, reinstall or refresh the plugin
so Claude loads the compatible hook files.
do-it install handles the full upgrade. No project-level migration is
required.
Hook noise reduction. Light tier is now fully silent. Standard tier requires both an intent-verb and a code-object match before injecting workflow guidance. Subagents do not receive nested hook injection. SESSION_ID validation rejects LF and control characters at the session boundary.
Session persistence. Session state moves from /tmp to
.do-it/runtime/. Skip tokens have a 5-minute TTL. A self-contained
.gitignore is written at install time. When flock is unavailable, PID-
tagged temp files and atomic mv prevent state corruption.
Research-first architecture decisions. architecture-strategist now
requires a live search and at least two concrete candidates before
recommending. A new architecture-taste-reviewer agent audits brainstorm
output for research compliance. Search results are treated as an untrusted
boundary to prevent prompt injection.
Comments discipline. Five comment types are allowed — type annotation,
@anchor, see also, invariant, tool directive — and six are forbidden —
narrative, history, task-reference, tombstone, orphan-TODO, what-comment. A
comments-lint PostToolUse hook provides advisory reminders. Agents should
apply do-it-comments-discipline before authoring comments, and the
review-loop comments lens is the acceptance gate.
Anti-pattern lint. A second advisory PostToolUse hook
anti-patterns-lint flags three coarse code-quality anti-patterns on the
lines an edit just added: large bash case lists (≥10 consecutive
*"..."* branches), newly-exported JS/TS symbols with no other-file
consumer, and ≥5-line code blocks duplicated from a same-directory
neighbour. Like comments-lint it never blocks and emits a single
system-reminder per edit; suppress per-edit with the literal string
anti-patterns-lint-allow in the new lines.
Decision and proof-path coverage. Grill now asks one load-bearing decision at a time with context, options, tradeoffs, and a recommended default. Review starts from the user goal and proof path before local diff snippets, so missing decision coverage, unwired implementation, unused delivered surfaces, and synthetic proof become review findings.
Router dimension orthogonality. Five dim_* booleans (touches_code,
crosses_packages, breaks_interface, needs_tdd, needs_review_loop) are
written to session state per task. Tier classification is unchanged; the
booleans drive which workflow steps fire.
Workflow accountability. Hook reminders are not proof that the workflow ran. For non-trivial work, closeout should state which brainstorm, grill, subagent, review, and verification steps were used or skipped, with the reason.
Graduated review. Three review depths: review-quick, review-deep,
review-adversarial. The verification gate on Light tier with edits now emits
an inline-review marker to prevent a self-satisfying replay claim.
Lazy skill loading. A generated dist/claude/skills/_index.md (~720
tokens) replaces the large skill catalogue previously injected by the router.
Skills load on demand.
Subagent token budgets. Codex agent TOML stays schema-clean: no
output_budget, claude_model, or other host-private keys. Response budgets
live in do-it-subagent-orchestration and must be passed in the parent prompt.
Model-agnostic agents. Agent templates do not pin concrete model names or reasoning-effort fields. Codex TOML stays portable, and Claude generated agents inherit the running host model unless a tested host compatibility fallback is needed.
Evidence ledger and truth planes. Heavy, release/install, multi-agent, and explicit durable-plan work records claim rows in the plan's verification section. Source repo, worktree, package artifact, temp install, live Codex, live Claude, and host behavior are separate truth planes.
Subagent lane status. Parent agents track delegated lanes as assigned,
running, done_with_evidence, integrated, or blocking. Worker DONE
does not become a final claim until the parent inspects, reviews, and verifies
the integrated surface.
Test coverage. Regression cases in tests/hooks/ cover common,
router, verification-gate (incl. DIM consumers), comments-lint, and
anti-patterns-lint. scripts/test-hooks.sh also exercises dim-aware
grill suppression.
Debugging hooks: DO_IT_DEBUG=1 makes each hook emit one stderr line per
decision (escape / skip / question / tier / trigger / evidence). Inspect
session state with do-it doctor --session=<id>.
do-it builds on the plan / subworker / TDD / review pattern that two
high-quality projects already proved out:
obra/superpowers: skill + subworker collaboration model.mattpocock/skills: skill packaging and discovery.addyosmani/agent-skills: production-skill anatomy, anti-rationalization, and evidence-first method discipline.
do-it is my own take on the same problem space, shaped by what I learned from
those projects and from daily use on real work. It rewrites methods into
do-it-native Router / Tier / Skill language; it does not vendor upstream skill
text or install upstream skill names.
Thanks also to the Linux.do community. The conversations there are a steady source of practical agent-workflow feedback and ideas.
Use docs/maintenance.md when changing skills, agents, installer behavior, or package metadata. In short:
- Edit the maintained repository copy.
- Update
manifest.jsonwhen install inventory changes. - Keep
docs/routing-matrix.mdaligned with routing or closeout policy changes. - Verify with a temporary
CODEX_HOME. - Publish only after the packed package contains the expected files.
Useful release checks:
git diff --check
npm test
npm run validate:agents
npm run build:claude-agents
npm run build:codex-plugin
CODEX_HOME=/tmp/do-it-codex-test npm exec --package . -- do-it setup
CODEX_HOME=/tmp/do-it-codex-test npm exec --package . -- do-it doctor
CODEX_HOME=/tmp/do-it-plugin-test codex plugin marketplace add /path/to/do-it
CLAUDE_PLUGIN_ROOT_OVERRIDE=/tmp/do-it-claude-test npm exec --package . -- do-it setup --target=claude
npm pack --dry-run --jsonUse do-it as-is, send focused improvements, or fork it into your own
workflow. The only hard requirement for changes here is that they come from
real use.
See CONTRIBUTING.md for the two hard rules (dogfood-first, Issue-first), the exception list (typo / translation / reproducible bug fix), and the PR template.