fix: register a web-traversable kimaki binary for CLI dispatch (#198) #200

Merged: chubes4 merged 1 commit into main from fix-198-cli-binary-path on Jun 8, 2026

chubes4 (Member) commented Jun 8, 2026

Summary

Fixes #198. On a RUN_AS_ROOT install, the kimaki CLI-channel command was registered as /root/.kimaki/bin/kimaki. That command is shelled by the Data Machine Code CLI transport from agents/dispatch-message, which runs inside PHP-FPM as www-data on WP-cron / Action Scheduler fires — not as the kimaki.service user. www-data cannot traverse 0700 /root, so proc_open fails with EACCES and every scheduled dispatch dies as datamachine_code_cli_dispatch_spawn_failed, while manual-as-root runs succeed. The opencode service-user home (/home/opencode, 0750) is the same trap.

This patch makes the registered command web-traversable regardless of service user, by refusing any binary whose ancestor directories lack the world-execute bit (o+x) and preferring a reachable system-prefix binary (e.g. /usr/bin/kimaki, the npm-global symlink).

Changes

  • bridges/kimaki.sh
    • New _kimaki_path_is_web_traversable() — returns success only if every ancestor dir of a candidate binary carries o+x (resolves symlinks first via realpath/readlink).
    • _kimaki_register_cli_channel() now ignores a KIMAKI_BIN that is executable but trapped under a non-traversable home, falling back to PATH resolution.
    • _kimaki_find_native_binary() now skips executable-but-unreachable $PATH entries in favor of a later reachable candidate; only falls back to a trapped path (then the bare name) as a last resort.
  • tests/cli-channel-binary-path.sh (new) + CI job — covers 0700/0750 ancestor rejection, reachable-preference on a root-style $PATH, trapped-KIMAKI_BIN fallback, and a no-regression check that a reachable KIMAKI_BIN is still honored.
  • tests/kimaki-agent-fallback.sh: mktemp -d defaults to 0700, so its fixtures now chmod 0755 the temp root to simulate a normally installed, web-reachable binary (otherwise the new, correct reachability check would reject them).
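
The ancestor check described in the list above can be sketched roughly as follows. This is a minimal illustration, not the merged code: the function name path_is_web_traversable is a hypothetical stand-in for _kimaki_path_is_web_traversable(), and it assumes GNU coreutils (readlink -f, stat -c), which may not match what bridges/kimaki.sh actually uses.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for _kimaki_path_is_web_traversable(); not the merged code.
# Assumes GNU coreutils: readlink -f and stat -c '%a'.
path_is_web_traversable() {
    local dir mode
    # Resolve symlinks first so the check applies to the real location.
    dir=$(readlink -f -- "$1" 2>/dev/null) || return 1
    # Walk from the binary's parent directory up to and including "/".
    while [ "$dir" != "/" ]; do
        dir=$(dirname -- "$dir")
        mode=$(stat -c '%a' -- "$dir" 2>/dev/null) || return 1
        # The last octal digit holds the "other" bits; bit 1 is execute (search).
        (( 8#${mode: -1} & 1 )) || return 1
    done
    return 0
}
```

Checking the mode bits rather than probing access as the current user keeps the result the same whether the script runs as root or as the service user, which matters because the caller that ultimately spawns the binary is a different account (www-data).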

Verification against the live production scenario

With root's actual $PATH ordering (/root/.kimaki/bin first), the fixed resolver now returns the reachable binary:

$ PATH="/root/.kimaki/bin:/usr/bin:/bin" _kimaki_find_native_binary
/usr/bin/kimaki                      # was /root/.kimaki/bin/kimaki

$ _kimaki_path_is_web_traversable /root/.kimaki/bin/kimaki  ; echo $?
1                                    # correctly rejected (/root is 0700)
$ _kimaki_path_is_web_traversable /usr/bin/kimaki           ; echo $?
0                                    # accepted (/usr/bin is 0755)

/usr/bin/kimaki is confirmed www-data-executable on the affected host, so this un-breaks the scheduled EC Agent Progress Ping flow (flow_id 4).
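
The resolver behavior shown in the transcript above (skip an executable-but-trapped PATH entry, prefer a later reachable one, fall back to the trapped path and then the bare name) can be sketched as below. The names is_web_traversable and find_reachable_binary are hypothetical, the code assumes GNU readlink and stat, and the real _kimaki_find_native_binary may differ in detail.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; names are illustrative, not the merged bridges/kimaki.sh code.

# Compact o+x ancestor check (assumes GNU readlink -f and stat -c).
is_web_traversable() {
    local dir mode
    dir=$(readlink -f -- "$1" 2>/dev/null) || return 1
    while [ "$dir" != "/" ]; do
        dir=$(dirname -- "$dir")
        mode=$(stat -c '%a' -- "$dir" 2>/dev/null) || return 1
        (( 8#${mode: -1} & 1 )) || return 1
    done
    return 0
}

# Walk a PATH-style list; prefer the first executable that is also reachable.
find_reachable_binary() {
    local name=$1 search_path=${2:-$PATH} entry candidate trapped=""
    local -a entries
    IFS=: read -r -a entries <<< "$search_path"
    for entry in "${entries[@]}"; do
        [ -n "$entry" ] || continue
        candidate="$entry/$name"
        [ -f "$candidate" ] && [ -x "$candidate" ] || continue
        if is_web_traversable "$candidate"; then
            printf '%s\n' "$candidate"
            return 0
        fi
        # Remember the first trapped hit as a last-resort answer.
        [ -n "$trapped" ] || trapped=$candidate
    done
    # Last resort: a trapped path, then the bare name.
    printf '%s\n' "${trapped:-$name}"
}
```

Called as find_reachable_binary kimaki "/root/.kimaki/bin:/usr/bin:/bin" on a host laid out like the one described here, this sketch would be expected to pass over the entry under /root and print the /usr/bin candidate instead.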

Test results

tests/cli-channel-binary-path.sh   OK: all assertions passed (7)
tests/kimaki-agent-fallback.sh     OK
tests/cli-channel-perms.sh         OK
bash -n on all changed scripts     OK

tests/bridge-render.sh shows a pre-existing, host-dependent launchd-PATH snapshot drift that reproduces on clean main (unrelated to this change) — not touched here.
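
For context on the fixture change in tests/kimaki-agent-fallback.sh, the adjustment amounts to something like the sketch below. The helper name make_reachable_fixture and the directory layout are hypothetical; only the mktemp -d default of 0700 and the chmod 0755 on the temp root come from the description above.

```shell
#!/usr/bin/env bash
# Hypothetical fixture helper; the name and layout are illustrative only.
make_reachable_fixture() {
    local root
    root=$(mktemp -d) || return 1
    # mktemp -d creates the directory as 0700, which an o+x ancestor check rejects.
    chmod 0755 "$root"
    mkdir -p "$root/bin"
    chmod 0755 "$root/bin"
    # A stub binary standing in for an installed kimaki.
    printf '#!/bin/sh\nexit 0\n' > "$root/bin/kimaki"
    chmod 0755 "$root/bin/kimaki"
    printf '%s\n' "$root"
}
```

Note that this only opens up the temp root itself; whether the whole chain passes still depends on the modes of the directories above it (for example the system temp directory).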

Scope / relationship to the durable fix

This is the targeted stopgap. The durable fix is migrating off RUN_AS_ROOT to the opencode service user (#93, live remediation scoped in #199). Because /home/opencode (0750) is also non-traversable by www-data, making the registered command web-reachable regardless of service user is a prerequisite for that migration, not a duplicate of it.

Closes #198.

The kimaki CLI-channel command is shelled by the Data Machine Code CLI
transport from agents/dispatch-message, which runs inside PHP-FPM as the
www-data web user on WP-cron / Action Scheduler fires — not as the
kimaki.service user. On a RUN_AS_ROOT install the binary resolves under
/root/.kimaki/bin and the data dir under /root (mode 0700). www-data
cannot traverse 0700 /root, so proc_open fails with EACCES and every
scheduled dispatch dies as datamachine_code_cli_dispatch_spawn_failed,
while manual-as-root runs succeed. The opencode service-user home
(/home/opencode, 0750) is the same trap.

Add _kimaki_path_is_web_traversable() which asserts every ancestor dir
of a candidate binary carries the world-execute bit (o+x), and use it in
both resolution paths: _kimaki_register_cli_channel now ignores a
KIMAKI_BIN trapped under a private home, and _kimaki_find_native_binary
skips executable-but-unreachable PATH entries in favor of a reachable
system-prefix binary (e.g. /usr/bin/kimaki, the npm-global symlink).

This is the targeted stopgap; the durable fix is migrating off
RUN_AS_ROOT to the opencode service user (#93). Making the registered
command web-reachable regardless of service user is a prerequisite for
that migration, since /home/opencode is also non-traversable by www-data.

Adds tests/cli-channel-binary-path.sh (+ CI job) covering the 0700/0750
rejection, reachable-preference, and trapped-KIMAKI_BIN fallback. Updates
tests/kimaki-agent-fallback.sh fixtures (mktemp -d defaults to 0700) to
simulate web-reachable installs.

chubes4 merged commit b55a5b3 into main on Jun 8, 2026 (6 checks passed)

Linked issue: fix: agent CLI binary registered under /root is unreachable by www-data WP-cron, causing recurring datamachine_code_cli_dispatch_spawn_failed