feat: add agent-friendly Modly CLI#144
Conversation
|
I think the idea is useful: Modly does need an agent-friendly, JSON-first way to automate common operations without driving the UI manually. After reviewing this against the current dev branch and the existing external MCP/CLI direction, my main concern is not the usefulness of the tool, but the contract it creates. Right now this PR adds a new Python CLI surface that talks mostly to existing REST endpoints such as:
That is practical, but it also risks becoming a third parallel automation path next to:
I would suggest keeping the useful CLI ergonomics, but aligning the contract before treating this as an official automation interface. Concretely, I think the CLI should be shaped around canonical primitives: health
model
workflow-run
process-run
capabilityAnd keep the current /generate/* flow as legacy compatibility rather than the primary path. 1. Prefer workflow runs for generationInstead of making generate primarily wrap: POST /generate/from-image
GET /generate/status/{job_id}prefer: POST /workflow-runs/from-image
GET /workflow-runs/{run_id}
POST /workflow-runs/{run_id}/cancelThe JSON output should include recovery metadata so agents can resume/poll safely: {
"ok": true,
"run": {
"kind": "workflowRun",
"id": "..."
},
"meta": {
"status_command": "workflow-run status ...",
"legacy": false
}
}generate can still exist as a friendly alias, but it should be clearly documented as either canonical-through-workflow-run or legacy if it falls back to /generate/*. 2. Use process/capability discovery instead of heuristicsThe current approach appears to infer some behavior from model names / labels, especially around texture/refine-style flows. For agents, that is fragile. The CLI should avoid guessing hidden capabilities from strings and instead use explicit discovery when available, e.g.: GET /automation/capabilities
POST /process-runs
GET /process-runs/{run_id}If a process is not available through a canonical contract, it should fail closed: {
"ok": false,
"code": "UNSUPPORTED_PROCESS",
"message": "This process is not available through the canonical process-run contract."
}3. Validate model IDs canonicallyAny model_id should be validated against: GET /model/allIf --model auto remains, it should use verified defaults/capability metadata, not string matching or first available model behavior. 4. Clarify readiness boundariesGET /health should happen before business operations. Also, starting FastAPI through serve / ensure-server should be clearly marked as developer-only. A FastAPI-only process does not imply Electron/Desktop bridge readiness, scene operations, extension process execution, or full workflow support. I would suggest moving these commands under something like: dev serve-api
dev ensure-serveror documenting them explicitly as non-canonical dev helpers. 5. Keep ComfyUI out of the canonical Modly agent contractThe ComfyUI integration may be useful, but it feels like an external orchestration helper rather than part of the core Modly automation contract. I would suggest moving it under: experimental comfy-imageor splitting it from the main supported CLI surface. 6. Align with existing MCP directionSettings > Agent already points users toward modly-cli-mcp / modly-mcp for external agents. To avoid confusing users and future maintainers, this CLI should either:
The important part is avoiding a second canonical-looking automation contract with different semantics. 7. PR hygieneSince the feature is Python-only, the package-lock.json changes look unrelated unless there is a specific reason for them. I would recommend removing that noise from this PR. Overall: I like the goal and I think the CLI can be valuable. I would just prefer to land it with a clearer contract boundary:
That would preserve the useful agent ergonomics while keeping Modly automation aligned with Agent Chat, MCP, and future workflow/process execution. |
|
Sounds good! I’m going to keep the CLI ergonomics but narrow the contract so it aligns with Modly’s canonical automation direction. I’ll make workflow runs the primary generation path, include recovery metadata in JSON output, validate explicit models through /model/all, remove texture/refine string heuristics, make process/capability commands fail closed when the canonical server contract is unavailable, move headless API startup under dev, move ComfyUI helpers under experimental, and keep /generate/* only as explicit legacy compatibility. I’ll also remove the unrelated package-lock.json changes and ensure generated E2E artifacts stay ignored and out of history. |
|
Fixes applied @DrHepa! Let me know what you think :) |
|
Hey ! Good work, but I still have a few points to ask you about before merging. Required before approval
A single choke point in _export_workspace_path (called by cmd_export, _generate_one, and cmd_legacy_generate) covers
Suggest a final except Exception as exc: in main() emitting {"ok": false, "code": "UNEXPECTED_ERROR", "message": Small fix
Please confirm
|
|
Addressed these in 9c92bca:
Tests pass: python tools/modly-cli/test_agent.py now runs 30 tests successfully Let me know if there's anything else :) |
|
Almost there ! One last consistency gap before I approve. In _workflow_workspace_path(), the output_url branch is validated but the def _workflow_workspace_path(status: dict[str, Any]) -> str:
scene_candidate = status.get("scene_candidate")
if isinstance(scene_candidate, dict) and scene_candidate.get("workspace_path"):
return str(scene_candidate["workspace_path"]) # not validated
output_url = str(status.get("output_url") or "")
if output_url:
return _workspace_relative_path(output_url) # validatedWhy it matters: _export_workspace_path re-validates downstream, so a real export still rejects a malicious value. But Suggested fix — validate at the source so both branches behave consistently: return _validate_workspace_path(str(scene_candidate["workspace_path"]))A small test covering a scene_candidate.workspace_path containing .. (expecting INVALID_WORKSPACE_PATH) would round it |
|
Alright, covered it! I appreciate your patience with me! |
Adds a simple CLI tool so AI agents can have an easier time interacting with modly. Includes a SKILL.md file as well.