
Future: stricter plugin isolation (Terraform-style subprocess+RPC model) #1

Description

@kreneskyp

Context

In the StR-005/006 shared config + secrets design (FR-010..019, NFR-003..006, branch feat/shared-config-secrets), we deliberately accepted a soft contract for cross-plugin config/secret isolation:

  • Per-plugin file isolation in ~/.config/ix/config.d/<plugin-id>.yaml and ~/.config/ix/secrets.d/<plugin-id>.age.
  • Per-plugin Zod schemas, atomic writes, advisory locks.
  • String-id API (ConfigService.forPlugin(id, schema)) — not runtime-enforced; backed by static-check lint and naming convention.
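
As a rough illustration of the soft contract above, the following is a minimal sketch, not the implementation on the branch. The `Schema` interface is a hypothetical stand-in for a Zod schema, `PluginConfig` and the id regex are assumptions made for this example, and advisory locking is left out. The file body is written as JSON (which is also valid YAML) to keep the sketch dependency-free.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical stand-in for a Zod schema: parse() returns a typed value or throws.
interface Schema<T> {
  parse(input: unknown): T;
}

// Hypothetical per-plugin handle; the real API may look different.
class PluginConfig<T> {
  private readonly file: string;
  private readonly schema: Schema<T>;

  constructor(file: string, schema: Schema<T>) {
    this.file = file;
    this.schema = schema;
  }

  read(): T {
    return this.schema.parse(JSON.parse(fs.readFileSync(this.file, "utf8")));
  }

  write(value: T): void {
    const validated = this.schema.parse(value);
    // Atomic write: write a sibling temp file, then rename it over the target.
    const tmp = `${this.file}.${process.pid}.tmp`;
    fs.writeFileSync(tmp, JSON.stringify(validated), { mode: 0o600 });
    fs.renameSync(tmp, this.file);
  }
}

class ConfigService {
  private readonly root: string;

  constructor(root: string) {
    this.root = root;
  }

  forPlugin<T>(id: string, schema: Schema<T>): PluginConfig<T> {
    // Naming convention only (assumed pattern): reject ids that could escape config.d/.
    if (!/^[a-z0-9][a-z0-9-]*$/.test(id)) throw new Error(`invalid plugin id: ${id}`);
    const dir = path.join(this.root, "config.d");
    fs.mkdirSync(dir, { recursive: true });
    return new PluginConfig(path.join(dir, `${id}.yaml`), schema);
  }
}

// Example schema for a plugin whose config is { port: number }.
const portSchema: Schema<{ port: number }> = {
  parse(input) {
    const port = (input as { port?: unknown } | null)?.port;
    if (typeof port !== "number") throw new Error("port must be a number");
    return { port };
  },
};

// Usage against a throwaway directory instead of ~/.config/ix.
const root = fs.mkdtempSync(path.join(os.tmpdir(), "ix-config-"));
const local = new ConfigService(root).forPlugin("local", portSchema);
local.write({ port: 8080 });
console.log(local.read());
```

Nothing here stops a plugin from passing another plugin's id to `forPlugin`; that is the part left to lint and convention.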

This matches the threat model we explicitly chose: defend against accidental corruption from buggy plugins, not adversarial exfiltration. Plugins run in-process with full Node.js privileges (node:fs, env vars, process.binding), so any in-process API check is bypassable in two lines of code.
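
To make the "two lines" point concrete, here is a small sketch of what any in-process plugin could do without going through ConfigService at all. The temp directory stands in for ~/.config/ix, and the file name `spec.age` and its contents are made up for the example.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Stand-in for ~/.config/ix so the sketch does not touch real files.
const ixRoot = fs.mkdtempSync(path.join(os.tmpdir(), "ix-bypass-"));
fs.mkdirSync(path.join(ixRoot, "secrets.d"));
fs.writeFileSync(path.join(ixRoot, "secrets.d", "spec.age"), "ciphertext-for-spec");

// The bypass: another plugin's secrets file and the host's env vars,
// read directly with the privileges every in-process plugin already has.
const otherPluginSecret = fs.readFileSync(path.join(ixRoot, "secrets.d", "spec.age"), "utf8");
const hostEnvKeys = Object.keys(process.env);

console.log(otherPluginSecret.length, hostEnvKeys.length);
```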

This is the same posture as gh, kubectl, aws-cli, oclif, helm — string namespacing + trust. The one major dev CLI that does enforce real isolation is Terraform (subprocess-per-provider over gRPC).

Why a future issue

If we ever want adversarial isolation — for example to support a marketplace of third-party plugins where users install plugins they don't fully audit — we have to move to a Terraform-style model:

  • Each plugin runs as a subprocess.
  • It communicates with apps/ix over a typed RPC channel (gRPC, JSON-RPC, or MessagePack-RPC).
  • The RPC surface defines exactly what plugins can do (run command X, read config Y, request secret Z); arbitrary fs/network access is severed.
  • Secrets are minted per call by the parent and never cross into the plugin's environment.
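
The parent-side half of that model could look roughly like the sketch below. It assumes JSON-RPC 2.0 framing. The method names (`config.read`, `secret.request`), the `Grants` shape, and the in-memory stores are all hypothetical. Only the dispatch logic is shown, with no subprocess or transport.

```typescript
// Hypothetical request/response shapes, loosely following JSON-RPC 2.0.
type RpcRequest = { jsonrpc: "2.0"; id: number; method: string; params: Record<string, unknown> };
type RpcResponse =
  | { jsonrpc: "2.0"; id: number; result: unknown }
  | { jsonrpc: "2.0"; id: number; error: { code: number; message: string } };

// What the parent has decided this particular plugin may touch.
interface Grants {
  configKeys: Set<string>;
  secretNames: Set<string>;
}

type Handler = (grants: Grants, params: Record<string, unknown>) => unknown;

// Parent-owned state; the plugin process never sees these maps directly.
const parentConfig = new Map<string, unknown>([["local.port", 8080]]);
const parentVault = new Map<string, string>([["deploy-token", "example-secret"]]);

// The whole RPC surface: anything not registered here does not exist for plugins.
const handlers = new Map<string, Handler>();

handlers.set("config.read", (grants, params) => {
  const key = String(params["key"]);
  if (!grants.configKeys.has(key)) throw new Error(`config key not granted: ${key}`);
  return parentConfig.get(key);
});

handlers.set("secret.request", (grants, params) => {
  const name = String(params["name"]);
  if (!grants.secretNames.has(name)) throw new Error(`secret not granted: ${name}`);
  // Handed out per call in the response, never placed in the child's env.
  return parentVault.get(name);
});

function handleRequest(grants: Grants, req: RpcRequest): RpcResponse {
  const handler = handlers.get(req.method);
  if (!handler) {
    // -32601 is the JSON-RPC 2.0 "Method not found" code.
    return { jsonrpc: "2.0", id: req.id, error: { code: -32601, message: `method not found: ${req.method}` } };
  }
  try {
    return { jsonrpc: "2.0", id: req.id, result: handler(grants, req.params) };
  } catch (err) {
    // -32000 sits in the implementation-defined server error range.
    const message = err instanceof Error ? err.message : String(err);
    return { jsonrpc: "2.0", id: req.id, error: { code: -32000, message } };
  }
}
```

A request for a method such as `fs.readFile` gets a "method not found" error because it was never registered. In this sketch, that closed registry is what "the RPC surface defines exactly what plugins can do" amounts to.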

That is a substantial architectural change — it touches plugin discovery, plugin SDK, command dispatch, error/streaming, and the UX contract (output piping, prompts). It's a v2+/v3 conversation, not a refinement of the current shared-config work.

Scope (when picked up)

  • ADR comparing in-process vs subprocess plugin models, with cost/benefit.
  • Define the RPC surface: command invocation, config read/write, secret fetch, UI streaming (this last one is the hard part — @agent-ix/ix-ui-cli framing is currently in-process).
  • Decide on transport (gRPC vs JSON-RPC over stdio vs Unix socket).
  • Plugin SDK in TypeScript that exposes the RPC surface as ergonomic local APIs.
  • Migration path for the in-tree plugins (local, elements, spec) — most likely they stay in-process as "first-party" while third-party plugins are sandboxed.
  • Security analysis: what can a malicious plugin do via the RPC surface (capability minimization).
  • Performance budget: subprocess startup overhead per command invocation.
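
For the performance-budget item, a first baseline could come from something like the sketch below. It measures only the cost of spawning a no-op Node.js child and waiting for it to exit. The function name and run count are arbitrary, and a real plugin would add module loading and RPC handshake time on top of this number.

```typescript
import { spawnSync } from "node:child_process";

// Spawn a no-op Node.js child `runs` times and return the median wall-clock time in ms.
function medianSpawnMs(runs: number): number {
  if (runs < 1) throw new Error("runs must be at least 1");
  const samples: number[] = [];
  for (let i = 0; i < runs; i++) {
    const start = process.hrtime.bigint();
    const child = spawnSync(process.execPath, ["-e", "0"], { stdio: "ignore" });
    if (child.status !== 0) throw new Error(`child exited with status ${child.status}`);
    samples.push(Number(process.hrtime.bigint() - start) / 1e6);
  }
  samples.sort((a, b) => a - b);
  const median = samples[Math.floor(samples.length / 2)];
  if (median === undefined) throw new Error("no samples collected");
  return median;
}

console.log(`median no-op subprocess startup: ${medianSpawnMs(5).toFixed(1)} ms`);
```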

Out of scope for this issue

  • Sandboxing in-process plugins (Node.js vm module, isolated-vm, worker threads with disabled fs) — these don't actually contain a determined plugin and have been shown insecure repeatedly. Real isolation needs OS-level process boundaries.
  • Replacing oclif. The plugin SDK can sit on top of oclif for the in-process portion; subprocess plugins won't go through oclif at all.

Design references

  • HashiCorp go-plugin — the actual Terraform provider plugin protocol.
  • VS Code's extension host — different model (single subprocess for many extensions); useful as a counter-example.
  • Buildkite Agent hooks — minimal RPC-like contract over env vars + stdio.

Related

  • Branch feat/shared-config-secrets (PR pending).
  • StR-005, StR-006, FR-010..019, NFR-003..006 in spec/.
  • spec.md §10 trust-model disclaimer (added in the same branch).
