Skip to content

Auriora/agent-workbench

Repository files navigation

Agent Workbench

Agent Workbench is a local-first IDE/runtime for coding agents. It exposes repo-scoped code intelligence, documentation routing, bounded edit support, diagnostics, validation planning, workspace safety, and capability/freshness metadata through MCP, so agents can rely on mature software-engineering evidence instead of broad file reads, ad hoc shell scans, and unsupported inference.

Agent Workbench does not replace coding agents. It gives coding agents an IDE-grade evidence layer.

What It Solves

Coding agents are strongest when they can spend context on design decisions and edits instead of rediscovering repository structure. Agent Workbench provides bounded, repo-scoped evidence for common questions:

Agent problem Mature tool class Workbench role
Where is this defined? Parser/index/symbol graph Symbol search and context routing
What uses this? Reference engine References with confidence and provenance
What might break? Impact graph/test mapping Bounded impact and validation planning
Is this file valid? Parser/linter/type checker Diagnostics and planned checks
What should I test? Test discovery/dependency graph Verification plan
Can I safely edit this? Workspace safety/edit preview Preview/apply with drift checks
Is this generated/vendor/secret? Scope/catalog policy Refusal, caveats, and redaction
Where are the docs? Markdown index/outline/FTS Docs routing and section reads

Agents should not spend context and time rediscovering what mature coding support tools can already answer deterministically or semi-deterministically.

What It Exposes

The public runtime surface is MCP-first:

  • repo:///status, repo:///scope, and repo:///overview for first-read repo state, scope, freshness, and capability coverage.
  • Documentation resources and tools for bounded docs overview, map, search, outline, and section reads.
  • context_for_task for bounded task routing before broad file reads.
  • symbol_search, find_references, and impact for targeted code evidence.
  • diagnostics_for_files and verification_plan for read-only diagnostics and planned validation.
  • preview_workspace_edit and apply_workspace_edit for bounded writes with preview tokens, path containment, and drift checks.
  • Integration health/profile resources for configured, discovered, callable, unavailable, blocked, hidden, and unknown agent surfaces.

Evidence You Can Rely On

Workbench responses carry metadata so agents can calibrate claims:

  • Capability levels are semantic, partial_semantic, resource_backed, or unsupported.
  • Freshness is fresh, stale, cold, refreshing, or unknown.
  • Evidence kinds include parser, docs, FTS, config, direct reads, heuristics, text fallback, and executed commands.
  • Verification status distinguishes done, planned, needed, blocked, and not_applicable.

Routing evidence helps an agent decide where to look. Parser-backed evidence supports stronger claims about declarations and syntax. Semantic evidence supports stronger claims only when fixture-proven for that language and operation. Direct source reads remain necessary when confidence is partial, degraded, stale, or heuristic. Planned validation is not completed validation; executed tests/checks or equivalent evidence are required before claiming proof.

Not A Lifecycle Engine

Agent Workbench does not decide whether work is approved, complete, promoted, released, or closed. It provides repository evidence, coding support, validation planning, diagnostics, and workspace-safety contracts. Lifecycle tools, issue trackers, maintainers, or project governance remain responsible for intent, acceptance, and closure.

Workbench may consume active task or spec context when a lifecycle system provides it. It may rank files/docs using active spec links and expose evidence useful to lifecycle tasks. It must not require ai-spec-lifecycle or any specific lifecycle tool, decide whether a spec is complete, promote durable docs automatically, or close specs.

See Lifecycle bridge contract for the generic boundary.

Proven Use

Agent Workbench has been dogfooded on multiple repositories where coding agents used it to support feature development. Dogfood evidence should be recorded in project docs, proof matrices, or review notes rather than treated as an implicit guarantee.

Current evidence starts in:

Maintainers should add new dogfood entries to durable reference docs or proof matrices with dates, repositories, validated surfaces, limitations, and follow-up work.

Coding-Agent Workflows

Ad Hoc Direct Patch

repo status -> context_for_task -> source read -> preview edit
-> diagnostics -> validation plan -> report evidence

Check freshness before editing. Treat resource_backed, heuristic, or text_fallback evidence as routing, not proof. Report validation as planned unless checks actually ran.

Spec Or Lifecycle Task

lifecycle readiness packet -> lifecycle bridge context
-> bounded implementation -> diagnostics -> validation plan
-> lifecycle evidence update by the owning lifecycle system

Workbench consumes task context and returns repo evidence. The lifecycle system or maintainer remains responsible for acceptance, promotion, and closure.

Review-Only Task

changed files -> impact evidence -> diagnostics
-> validation adequacy -> residual risk report

Do not mutate files. Use impact and diagnostics as evidence, then call out stale indexes, partial semantic coverage, missing checks, and residual risk.

Development

Use pnpm for local development:

pnpm install
pnpm rebuild:native
pnpm typecheck
pnpm test
pnpm dev -- <repo-root>

Native tree-sitter bindings may require pnpm rebuild:native under newer Node versions. Do not add parser fallbacks to mask install/build issues.

Documentation Map

Start with Documentation map for the canonical owner of each design, contract, proof, integration, and safety topic.

Agent-visible behavior changes are tracked in Agent-readable changelog.

About

No description, website, or topics provided.

Resources

Code of conduct

Contributing

Security policy

Stars

Watchers

Forks

Packages

 
 
 

Contributors