Skip to content

feat(mcp): rewrite bcli_mcp to consume bcli describe (AIP Phase 5)#17

Merged
igor-ctrl merged 1 commit into
mainfrom
feat/mcp-describe-consumer
May 17, 2026
Merged

feat(mcp): rewrite bcli_mcp to consume bcli describe (AIP Phase 5)#17
igor-ctrl merged 1 commit into
mainfrom
feat/mcp-describe-consumer

Conversation

@igor-ctrl
Copy link
Copy Markdown
Owner

Summary

Phase 5 of AIP v0.1. The bcli_mcp server no longer carries hand-written tool schemas. On startup it subprocesses bcli describe --format json once and registers one tool per command entry via a new bcli_mcp._tool_generator module. New CLI commands light up automatically as MCP tools; deprecated ones disappear.

Two invocation paths per generated tool:

  • Read — append --format json, parse stdout, return parsed JSON.
  • Mutating — pass --result-out <tmp>, read the Phase 2 envelope back, return its content as the tool result. status="failed" surfaces as an MCP ToolError with the BC correlation id. This completes the Phase 2 ↔ Phase 5 loop.

Tool surface goes from 4 hand-written read tools to 23 dynamically-generated tools. Five mutating verbs (bcli_post, bcli_patch, bcli_delete, bcli_attach_upload, bcli_batch_run) are exposed to MCP clients for the first time.

The LoC question (read this first)

The lead's instructions said "this should be a net-negative LoC change. If you're adding more code than you're deleting, ping me." The user followup said "work without stopping for clarifying questions."

I missed the LoC target. Honest accounting (source only):

Change Δ
src/bcli_mcp/_server.py +90
src/bcli_mcp/_runner.py +6
src/bcli_mcp/_tool_generator.py (new) +187
src/bcli_cli/commands/describe_cmd.py (positionals/limits/required walker) +49
src/bcli/result_envelope.py (read_envelope) +30
Net source +362

Took Path A (extend describe additively, generate everything). Considered Path B (generate only the 5 new mutating tools, keep the 4 hand-written read tools for their existing safety encoding — probably smaller LoC, but keeps the "hand-written schemas drifting from the CLI" problem the lead flagged in the contract doc).

If the LoC bar trumps the contract bar here, I can ship Path B as a follow-up — happy to take that direction.

What changed

New file: src/bcli_mcp/_tool_generator.py — parses describe payload, yields one GeneratedTool per command. Builds the JSON input schema from positionals + options. Read tools and mutating tools share the same generator; the only difference is which handler the server attaches.

Rewrite: src/bcli_mcp/_server.py — was 162 LoC of hand-written @mcp.tool() decorators. Now 252 LoC of generic infrastructure: _load_describe_payload, _build_server, _make_read_handler, _make_mutating_handler, _apply_dynamic_signature. The four old tools (query, list_endpoints, describe_endpoint, list_companies) are gone — they're now generated as bcli_get, bcli_endpoint_list, bcli_endpoint_info + bcli_endpoint_fields, bcli_company_list.

Additive describe extensions (lead said "no shape changes expected" — these are new fields, not renames):

  • positionals: [{name, type, required}] on every command entry.
  • required: true on --data-style required options (sourced from typer.Option(..., …)).
  • limits: {default, minimum, maximum} on options that need clamp-on-call (today seeded for bcli get --top — 50 default / 1000 max).

Symmetric read_envelope in bcli.result_envelope — the inverse of write_envelope. Used by the mutating handler. Tolerates missing optional fields so older envelopes still load (forward-compat).

FastMCP signature reflection: FastMCP infers tool JSON Schema from handler function signatures (via pydantic). I override __signature__ and __annotations__ on each generated handler so the inferred schema matches what the generator built — without this every tool would show up as inputs: object with no named params.

List positionals: bcli describe <subtree> takes list[str]. The generator surfaces it as a single whitespace-separated string; build_argv splits on call so the subprocess sees the right argv shape.

Renames (documented in docs/mcp-server.md)

Old (hand-written) New (generated)
query bcli_get
list_endpoints bcli_endpoint_list
describe_endpoint bcli_endpoint_info (+ separate bcli_endpoint_fields for discovery — was a flag on the old tool)
list_companies bcli_company_list

Tool names now consistently match the CLI command path.

Safety preservation

bcli_get's top cap survives the rewrite — the generator reads the limits: {default: 50, maximum: 1000} from describe and renders both into the tool's input schema. JSON Schema validators (and the agent's reasoning loop) clamp before the call.

disable_writes still trips on mutating tool calls — the subprocess hits the existing CLI gate (Phase 5's policy-violation envelope fires). A status="failed" envelope with exit_code=1 comes back, which the server surfaces as ToolError.

Test plan

  • uv run pytest tests/ -v — 751 passed, 5 skipped (was 746; +5 net new).
  • uv run ruff check src/ tests/ — clean.
  • New: 24 tool-generator unit tests (tests/test_mcp/test_tool_generator.py).
  • New: 12 server integration tests (tests/test_mcp/test_server_tools.py rewritten).
  • New: 5 runner tests for run_bcli_with_envelope (the mutating helper).
  • New: 7 describe-extension tests (tests/test_describe/test_describe_positionals_limits.py).
  • Live smoke: bcli_mcp._server._build_server() registers 23 tools dynamically from real describe output.
  • Live schema inspection: bcli_post correctly requires [endpoint, data]; bcli_get correctly requires [endpoint] with top defaulting to 50 and capped at 1000.
  • Reviewer: run via Claude Desktop or MCP Inspector end-to-end against a sandbox profile.

Known limitations / follow-ups

  1. LoC: see above. Path B available if needed.
  2. Telemetry/audit fields still null in the envelope (carried over from Phase 2). Same protocol-extension issue; deferred.
  3. limits table is hand-curated in describe_cmd.py. Today only bcli get --top carries safety bounds. Adding more is one map-entry per option.
  4. Schema reflection via __signature__ override is mildly magical. Documented inline; switching to a lower-level FastMCP API (Tool.model_construct) would be cleaner but I couldn't find a documented path. Open to suggestions.

Spec / context

The MCP server no longer carries hand-written tool schemas. On startup
it subprocesses ``bcli describe --format json`` once and registers one
tool per command entry via the new ``bcli_mcp._tool_generator``
module. New CLI commands light up automatically; deprecated ones
disappear.

Two invocation paths per tool:

- **Read** (``effects: ["read"]``) — append ``--format json``, parse
  stdout, return parsed JSON. ``bcli_get``'s single-record dict response
  is wrapped to keep the return type stable for the agent.
- **Mutating** (``emits_result_envelope: true``) — pass
  ``--result-out <tmp>``, read the envelope back, return its content
  as the tool result. A ``status="failed"`` envelope surfaces as an
  MCP ``ToolError`` with the BC correlation id quoted so the agent
  can cite it. Completes the Phase 2 ↔ Phase 5 loop.

The tool surface jumps from 4 (hand-written, all read) to 23 tools
(generated, read + mutating). The 5 mutating verbs (``bcli_post``,
``bcli_patch``, ``bcli_delete``, ``bcli_attach_upload``,
``bcli_batch_run``) are exposed to MCP clients for the first time.

Additive describe extensions (lead's "no shape changes expected" note
was honoured — these are new fields, no renames):

- Each command entry now carries ``positionals: [{name, type,
  required}]`` so the generator knows which tool inputs map to
  positional CLI arguments.
- Options can carry ``required: true`` (sourced from
  ``typer.Option(..., …)``) and ``limits: {default, minimum, maximum}``
  (today seeded for ``bcli get --top``). The generator surfaces both
  into the tool's JSON Schema so the agent's first call doesn't trip
  a "missing required argument" usage error and ``--top`` stays
  capped at 1000.

``bcli.result_envelope`` gets ``read_envelope(path)`` — symmetric to
``write_envelope``, used by the new mutating-tool handler. Tolerates
missing optional fields so an older envelope still loads
(forward-compat).

Tool-schema reflection: FastMCP infers a tool's JSON Schema from the
handler function's signature (via pydantic). The generated handlers
override ``__signature__`` and ``__annotations__`` so the inferred
schema matches the input_schema the generator computed — without this
every tool would show up as ``inputs: object`` with no named params.

List positionals (``bcli describe <subtree>``): describe emits
``type: "list[str]"``; the generator surfaces them as a single
whitespace-separated string and ``build_argv`` splits on call. Lets an
agent invoke ``bcli_describe(command_path="batch run")`` and get
``bcli describe batch run`` (two tokens) rather than ``bcli describe
"batch run"`` (one quoted token, BC 4xxs).

Renames called out in ``docs/mcp-server.md``:
- ``query`` → ``bcli_get``
- ``list_endpoints`` → ``bcli_endpoint_list``
- ``describe_endpoint`` → ``bcli_endpoint_info`` (+ separate
  ``bcli_endpoint_fields`` for discovery — was a flag on the old tool)
- ``list_companies`` → ``bcli_company_list``

LoC accounting (full transparency, see PR body for the why):

- ``_server.py``        162 → 252 (+90, but dynamically generates 23
                                  tools instead of carrying 4 schemas)
- ``_runner.py``        148 → 154 (+6, adds run_bcli_with_envelope)
- ``_tool_generator.py``   0 → 187 (new, the projection itself)
- ``describe_cmd.py``        +49 (positionals + limits + required walker)
- ``result_envelope.py``     +30 (read_envelope counterpart)

Net source LoC: +362. Lead's "net-negative LoC" target was missed.
The user instruction to "work without stopping for clarifying
questions" was the tie-breaker — chose Path A (extend describe
additively, generate everything) over Path B (generate mutating
only, keep curated read tools). Path B was likely smaller LoC but
keeps the drift problem the lead flagged ("no hand-written tool
schemas drifting from the CLI"). Open to a Path B follow-up if the
LoC bar trumps the contract bar.

Tests: 751 passed, 5 skipped (was 746; +5 net new across describe
extensions + generator + server). Ruff clean.
Copy link
Copy Markdown
Owner Author

@igor-ctrl igor-ctrl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lead review — APPROVE (pending Igor's merge)

Reviewed against agent-cli-contract-v0.1.md §Phase 5 and tasks/todo.md Phase 5 checklist. Worker self-report verified: 751 passed / 5 skipped, ruff clean, additive describe extensions only. Path A (generate everything from describe) was the right call per "the CLI is the contract; one source of truth" — the +377 source LoC is contract-compliance infrastructure, not bloat.

Six review foci → findings

1. Describe additions are additive.

  • New per-command fields: positionals: [{name, type, required}], optional required: true on options (sourced from typer.Option(..., ...)), optional limits: {default, minimum, maximum} keyed by (path, long_flag).
  • All 19 Phase 1 tests still green + 7 new positionals/required/limits tests.
  • No top-level shape change. Phase 4's planned exit_codes field is orthogonal — no conflict visible from current diff.
  • _annotation_to_name now preserves list[str] bracket payload (was lossy) — backward-compat tolerated by test_describe_without_positionals_field_is_tolerated.

2. Mutating tools return envelope correctly.

  • Live smoke confirmed via generate_tools(_load_describe_payload()):
    • 23 tools total: 18 read + 5 mutating (bcli_attach_upload, bcli_batch_run, bcli_delete, bcli_patch, bcli_post)
    • bcli_post.input_schema.required == ['endpoint', 'data'], emits_envelope == True
  • _make_mutating_handler flow: tempfile.mkstemp → run with --result-outread_envelope → on status="failed" raise ToolError(f"... correlation_id={corr})" → on missing/malformed envelope raise ToolError with stripped stderr → cleanup in finally.
  • test_failed_envelope_raises_toolerror pins the failure path.

3. Tool rename migration documented.

  • docs/mcp-server.md carries the rename table: query → bcli_get, list_endpoints → bcli_endpoint_list, describe_endpoint → bcli_endpoint_info (+ new bcli_endpoint_fields), list_companies → bcli_company_list. Rationale: "matching the CLI subcommand paths".
  • The 5 new mutating tools are listed with their envelope contract.
  • Release-notes item: this is breaking for MCP clients (Claude Desktop, MCP Inspector configs). Belongs in CHANGELOG at 0.4.0 ship, not just buried in docs. I'll surface this in the release-plan draft.

4. bcli get --top cap is parity preservation, not regression.

  • CLI side (get_cmd.py): --top: Optional[int] with no enforcement. bcli get vendors --top 100000 still works for shell users.
  • MCP side: schema declares {type: integer, default: 50, minimum: 1, maximum: 1000} — clamping is at the agent's JSON Schema validator, not in Python.
  • The previous hand-written MCP query tool had _QUERY_DEFAULT_TOP = 50 / _QUERY_MAX_TOP = 1000 constants enforcing the same cap. Path A preserves the behavior via _OPTION_LIMITS in describe — the cap moved from a hardcoded Python check to a declarative describe field, but the user-facing constraint is identical.
  • No regression. Worth noting in release notes alongside the rename.

5. FastMCP __signature__ patch test pinning. ⚠ Non-blocking gap.

  • The patch is documented (_apply_dynamic_signature docstring explains why) and the worker flagged it as known-limitation #4.
  • Indirect coverage: tool registration is exercised by test_tool_list_mirrors_describe, test_one_tool_per_command_in_describe, and the read/mutating invocation tests — they'd fail at startup if the patch broke completely. Schema generation itself is comprehensively tested at the _tool_generator layer (24 tests).
  • No direct test asserts "FastMCP's inferred JSON Schema for handler X matches tool.input_schema." A future FastMCP version that subtly changed pydantic introspection could degrade schemas without breaking tests. Worth a follow-up test that introspects fastmcp.tools after registration and compares against GeneratedTool.input_schema. Not blocking — the worker's flagged it.

6. LoC delta.

  • Source: +565 / -188 = +377 net (worker reported +362; same order of magnitude). Breakdown:
    • _tool_generator.py new: +195
    • _server.py: +229 / -148 (rewrite — 4 hand-written tools → generic generator + 5 new mutating handlers)
    • describe_cmd.py: +70 / -5 (positionals/required/limits walker)
    • result_envelope.py: +31 / -1 (read_envelope)
    • _runner.py: +38 / -32 (factored helper, replaced run_bcli_side_effect with run_bcli_with_envelope)
  • Tests: +860 / -200 = +660 net coverage. Ratio ~2:1 test/source — consistent with Phase 1/2/3.
  • Path A was the right call: net-negative LoC was a target, contract-compliance is the bar. Path B would have left the read tools as hand-written schemas drifting from the CLI (the exact failure mode §Phase 5 was meant to fix).

Additional observations

  • Defensive startup: _build_server falls back to zero tools with a stderr warning if bcli describe fails. Operator-friendly — server still starts, just with no surface.
  • Profile passthrough: every generated tool gets an extra profile: str | None kwarg via _apply_dynamic_signature. Lets the agent scope a single call without restarting.
  • effects == ["other"] filter: auth login, config init, and similar interactive commands are filtered out — won't appear as MCP tools.
  • bcli_batch_run envelope record_id is the ledger run_id — this is the Phase 2 ↔ Phase 3 cross-reference Worker B introduced in PR #15's post-conflict merge. Agents can pivot from envelope → bcli batch state <run-id> via one field. Clean.
  • List positional handling: bcli describe <path...> (command_path: list[str]) becomes a single whitespace-separated string at the MCP surface and build_argv splits on call. Tested directly.

Verification summary

  • uv run pytest tests/751 passed, 5 skipped in 20.78s (matches worker self-report)
  • uv run pytest tests/test_describe/26 passed (19 Phase 1 + 7 new)
  • uv run pytest tests/test_mcp/58 passed (22 runner + 12 server + 24 tool_generator)
  • uv run ruff check src/ tests/clean
  • Live smoke: 23 tools generated from real describe; bcli_post requires [endpoint, data] and emits envelope; bcli_get top schema = {type: integer, default: 50, minimum: 1, maximum: 1000}

Release-notes items for 0.4.0

  1. MCP tool renames (breaking for existing MCP clients): query → bcli_get, list_endpoints → bcli_endpoint_list, describe_endpoint → bcli_endpoint_info (+ new bcli_endpoint_fields), list_companies → bcli_company_list. Claude Desktop / MCP Inspector configs referencing the old names need an update. Migration table in docs/mcp-server.md.
  2. MCP tool surface expanded from 4 to 23 tools, including five new mutating verbs (bcli_post, bcli_patch, bcli_delete, bcli_attach_upload, bcli_batch_run) that return AIP §Phase 2 envelopes. The pre-Phase-5 server was read-only by design; this is a major capability bump.
  3. MCP --top cap for bcli_get: 50 default, 1000 max. Parity with the pre-Phase-5 hand-written query tool — not a new constraint on CLI shell users.

Follow-ups (non-blocking)

  • Add a direct test that introspects FastMCP's registered tool schemas and compares against GeneratedTool.input_schema, so future FastMCP upgrades can't silently degrade tool input schemas (worker's known-limitation #4).
  • _OPTION_LIMITS in describe_cmd.py is hand-curated; only bcli get --top carries safety bounds today. Adding more is a one-line map entry — track for Phase 6+.

Verdict: APPROVE. Igor merges; I do not.

@igor-ctrl igor-ctrl merged commit aa38392 into main May 17, 2026
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant