feat: add agent-friendly Modly CLI by jokrasno · Pull Request #144 · lightningpixel/modly

jokrasno · 2026-05-12T00:33:00Z

Adds a simple CLI tool so AI agents can have an easier time interacting with modly. Includes a SKILL.md file as well.

DrHepa · 2026-05-22T15:20:28Z

I think the idea is useful: Modly does need an agent-friendly, JSON-first way to automate common operations without driving the UI manually.

After reviewing this against the current dev branch and the existing external MCP/CLI direction, my main concern is not the usefulness of the tool, but the contract it creates.

Right now this PR adds a new Python CLI surface that talks mostly to existing REST endpoints such as:

/health
/model/*
/generate/from-image
/generate/status/{job_id}
/generate/cancel/{job_id}
/export/{format}

That is practical, but it also risks becoming a third parallel automation path next to:

the in-app Agent Chat (/agent/chat + Ollama tool-calling),
the existing MCP surfaces / modly-cli-mcp integration shown in Settings,
the newer workflow-runs / process/capability direction.

I would suggest keeping the useful CLI ergonomics, but aligning the contract before treating this as an official automation interface.

Concretely, I think the CLI should be shaped around canonical primitives:

health
model
workflow-run
process-run
capability

And keep the current /generate/* flow as legacy compatibility rather than the primary path.

1. Prefer workflow runs for generation

Instead of making generate primarily wrap:

POST /generate/from-image
GET /generate/status/{job_id}

prefer:

POST /workflow-runs/from-image
GET /workflow-runs/{run_id}
POST /workflow-runs/{run_id}/cancel

The JSON output should include recovery metadata so agents can resume/poll safely:

{
  "ok": true,
  "run": {
    "kind": "workflowRun",
    "id": "..."
  },
  "meta": {
    "status_command": "workflow-run status ...",
    "legacy": false
  }
}

generate can still exist as a friendly alias, but it should be clearly documented as either canonical-through-workflow-run or legacy if it falls back to /generate/*.

2. Use process/capability discovery instead of heuristics

The current approach appears to infer some behavior from model names / labels, especially around texture/refine-style flows.

For agents, that is fragile. The CLI should avoid guessing hidden capabilities from strings and instead use explicit discovery when available, e.g.:

GET /automation/capabilities
POST /process-runs
GET /process-runs/{run_id}

If a process is not available through a canonical contract, it should fail closed:

{
  "ok": false,
  "code": "UNSUPPORTED_PROCESS",
  "message": "This process is not available through the canonical process-run contract."
}

3. Validate model IDs canonically

Any model_id should be validated against:

GET /model/all

If --model auto remains, it should use verified defaults/capability metadata, not string matching or first available model behavior.

4. Clarify readiness boundaries

GET /health should happen before business operations.

Also, starting FastAPI through serve / ensure-server should be clearly marked as developer-only. A FastAPI-only process does not imply Electron/Desktop bridge readiness, scene operations, extension process execution, or full workflow support.

I would suggest moving these commands under something like:

dev serve-api
dev ensure-server

or documenting them explicitly as non-canonical dev helpers.

5. Keep ComfyUI out of the canonical Modly agent contract

The ComfyUI integration may be useful, but it feels like an external orchestration helper rather than part of the core Modly automation contract.

I would suggest moving it under:

experimental comfy-image

or splitting it from the main supported CLI surface.

6. Align with existing MCP direction

Settings > Agent already points users toward modly-cli-mcp / modly-mcp for external agents. To avoid confusing users and future maintainers, this CLI should either:

explicitly stay as a repo-local/dev helper, or
align its naming and JSON semantics with the existing MCP concepts: workflowRun, processRun, capability, recovery metadata, and fail-closed unsupported operations.

The important part is avoiding a second canonical-looking automation contract with different semantics.

7. PR hygiene

Since the feature is Python-only, the package-lock.json changes look unrelated unless there is a specific reason for them. I would recommend removing that noise from this PR.

Overall: I like the goal and I think the CLI can be valuable. I would just prefer to land it with a clearer contract boundary:

canonical: workflow-run, process-run, capability
legacy: /generate/* and old job polling
dev-only: server startup helpers
experimental: ComfyUI orchestration

That would preserve the useful agent ergonomics while keeping Modly automation aligned with Agent Chat, MCP, and future workflow/process execution.

jokrasno · 2026-05-29T18:26:33Z

Sounds good! I’m going to keep the CLI ergonomics but narrow the contract so it aligns with Modly’s canonical automation direction. I’ll make workflow runs the primary generation path, include recovery metadata in JSON output, validate explicit models through /model/all, remove texture/refine string heuristics, make process/capability commands fail closed when the canonical server contract is unavailable, move headless API startup under dev, move ComfyUI helpers under experimental, and keep /generate/* only as explicit legacy compatibility. I’ll also remove the unrelated package-lock.json changes and ensure generated E2E artifacts stay ignored and out of history.

jokrasno · 2026-06-02T03:41:01Z

Fixes applied @DrHepa! Let me know what you think :)

lightningpixel · 2026-06-02T17:01:15Z

Hey ! Good work, but I still have a few points to ask you about before merging.

Required before approval

Strict workspace-path validation (security).
Since this tool is designed to be driven by autonomous agents, any path the CLI accepts or derives must be rejected if
it's absolute, contains .. traversal, or is otherwise not workspace-relative. Right now nothing is validated:

cmd_export forwards --path as-is into /export/{fmt}?path=... — --path ../../etc/passwd or an absolute path goes
straight to the server.
_workspace_relative_path derives the path from the server-returned output_url (incl. unquote), so a malicious/buggy
server response would be re-requested verbatim.

A single choke point in _export_workspace_path (called by cmd_export, _generate_one, and cmd_legacy_generate) covers
all cases; adding the same guard in _workspace_relative_path gives defense-in-depth. Please reject absolute paths,
leading / or , Windows drive letters (C:), and any .. component, with a clear INVALID_WORKSPACE_PATH JSON error —
and add a couple of tests for those cases.

Guarantee the "always parsable JSON" contract on unexpected errors.
main() only catches ModlyCliError and KeyboardInterrupt. A few paths raise un-wrapped exceptions that produce a
traceback instead of JSON, which breaks the promise made to agents:

_multipart_form → file_path.read_bytes(): OSError/PermissionError on the input image.
_parse_params → Path(params_file).read_text(): OSError on an unreadable params file.
Module load: int(os.environ.get("MODLY_CLI_TIMEOUT", "1800")) raises ValueError at import if the env var is
malformed.

Suggest a final except Exception as exc: in main() emitting {"ok": false, "code": "UNEXPECTED_ERROR", "message":
str(exc)} and returning 1.

Small fix

process-run recovery metadata is inconsistent. _recovery_meta injects --base-url into the status/cancel
commands when the URL differs from the default, but cmd_process_run_start builds those commands hardcoded without
--base-url, so an agent on a custom base URL gets non-replayable recovery commands. Please route it through
_recovery_meta.

Please confirm

The README rename launcher.bat → launch.bat is bundled into this feature PR — is it intentional and does it match
the actual filename? If unrelated, it'd be cleaner as its own commit.

Hidden aliases (comfy-image, generate-from-workflow, serve, ensure-server) re-declare their arguments manually — a
shared registration helper would prevent the two copies from drifting.
sub._choices_actions = [...] relies on an argparse private attribute; a short # relies on argparse internals comment
would help future readers.
cmd_batch restores args.format without try/finally (dirty namespace if a non-ModlyCliError is raised).

jokrasno · 2026-06-02T17:18:05Z

Addressed these in 9c92bca:

Added strict workspace-relative path validation with INVALID_WORKSPACE_PATH for export paths and server-derived output paths
Added JSON fallback for unexpected exceptions via UNEXPECTED_ERROR.
Hardened malformed timeout/poll env var handling at import
Fixed process-run recovery metadata to preserve custom --base-url.
Added tests for the security/error-handling cases
Confirmed launch.bat is the actual filename in the repo
Also added the argparse internals comment and cmd_batch try/finally cleanup

Tests pass: python tools/modly-cli/test_agent.py now runs 30 tests successfully

Let me know if there's anything else :)

lightningpixel · 2026-06-02T22:05:16Z

Almost there !

One last consistency gap before I approve. In _workflow_workspace_path(), the output_url branch is validated but the
scene_candidate.workspace_path branch is returned raw:

 def _workflow_workspace_path(status: dict[str, Any]) -> str:
      scene_candidate = status.get("scene_candidate")
      if isinstance(scene_candidate, dict) and scene_candidate.get("workspace_path"):
          return str(scene_candidate["workspace_path"])        # not validated
      output_url = str(status.get("output_url") or "")
      if output_url:
          return _workspace_relative_path(output_url)          # validated

Why it matters: _export_workspace_path re-validates downstream, so a real export still rejects a malicious value. But
when export is skipped (--no-export, or workflow-run start --wait), this raw path is surfaced in the output JSON
workspace_path field as if trusted, and could be reused by an agent in a later call.

Suggested fix — validate at the source so both branches behave consistently:

return _validate_workspace_path(str(scene_candidate["workspace_path"]))

A small test covering a scene_candidate.workspace_path containing .. (expecting INVALID_WORKSPACE_PATH) would round it
out

jokrasno · 2026-06-02T23:38:48Z

Alright, covered it!

I appreciate your patience with me!

feat: add agent-friendly Modly CLI

97b75d4

jokrasno added 3 commits May 29, 2026 12:28

fix: align Modly CLI automation contract

d351e95

fix: support direct workflow asset outputs

a8b0ceb

fix: resolve gitignore conflict

019b044

fix: harden Modly CLI agent safety

9c92bca

fix: validate workflow scene workspace paths

7f85bcd

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat: add agent-friendly Modly CLI#144

feat: add agent-friendly Modly CLI#144
jokrasno wants to merge 6 commits into
lightningpixel:devfrom
jokrasno:cli-tool-dev-pr

jokrasno commented May 12, 2026

Uh oh!

DrHepa commented May 22, 2026 •

edited

Loading

Uh oh!

jokrasno commented May 29, 2026

Uh oh!

jokrasno commented Jun 2, 2026

Uh oh!

lightningpixel commented Jun 2, 2026 •

edited

Loading

Uh oh!

jokrasno commented Jun 2, 2026

Uh oh!

lightningpixel commented Jun 2, 2026

Uh oh!

jokrasno commented Jun 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Uh oh!

Conversation

jokrasno commented May 12, 2026

Uh oh!

DrHepa commented May 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

1. Prefer workflow runs for generation

2. Use process/capability discovery instead of heuristics

3. Validate model IDs canonically

4. Clarify readiness boundaries

5. Keep ComfyUI out of the canonical Modly agent contract

6. Align with existing MCP direction

7. PR hygiene

Uh oh!

jokrasno commented May 29, 2026

Uh oh!

jokrasno commented Jun 2, 2026

Uh oh!

lightningpixel commented Jun 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jokrasno commented Jun 2, 2026

Uh oh!

lightningpixel commented Jun 2, 2026

Uh oh!

jokrasno commented Jun 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

DrHepa commented May 22, 2026 •

edited

Loading

lightningpixel commented Jun 2, 2026 •

edited

Loading