This is the concrete, live-VPS follow-through for #93 (and the durable fix that makes #198 moot): migrate the running extrachill.com install off RUN_AS_ROOT (kimaki.serviceUser=root) to the dedicated opencode service user that is already provisioned and already a member of the www-data group.
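Running the agent runtime as root is the root cause of two distinct, recurring breakages:
- Every file the runtime writes is root:root (#93), so www-data WordPress auto-update fails with copy_failed_*.
- The agent CLI binary is registered under /root (0700) (#198), unreachable by the www-data WP-cron dispatch path, so scheduled agents/dispatch-message flows fail with datamachine_code_cli_dispatch_spawn_failed.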
Both vanish if the runtime runs as opencode (group www-data) with state in a www-data-traversable location. The installer already defines this path (lib/detect.sh else-branch) but this box was installed with RUN_AS_ROOT=true and never migrated.
Current live state (extrachill.com, verified)
The opencode user already exists and is correctly grouped — the migration is half-done and inconsistent:
$ id opencode
uid=1000(opencode) gid=1000(opencode) groups=1000(opencode),4(adm),33(www-data) # already in www-data ✓
$ grep -E '^User=|HOME=|--data-dir' /etc/systemd/system/kimaki.service
User=root # <-- still root (the trap)
Environment=HOME=/home/opencode # <-- but HOME points at opencode (!)
ExecStart=/root/.kimaki/bin/kimaki --data-dir /root/.kimaki ... # <-- and data-dir is /root
$ ls -ld /home/opencode /home/opencode/.kimaki
drwxr-x--- opencode opencode /home/opencode # 0750
drwxr-xr-x opencode opencode /home/opencode/.kimaki # already exists, opencode-owned
So the unit is a Frankenstein: runs as root, HOME says opencode, data-dir says /root. The opencode user, its home, and /home/opencode/.kimaki are all already provisioned — only User= and the --data-dir/binary path are wrong.
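The inconsistency described above can be detected mechanically. The following is a minimal sketch, not part of the installer: `check_unit_consistency` is a hypothetical helper name, and it assumes the convention that a non-root service user's home is /home/<user> and that the unit carries `User=`, `Environment=HOME=`, and a `--data-dir` argument on the `ExecStart=` line.

```shell
# Hypothetical helper: read a systemd unit on stdin and report whether
# User=, HOME, and --data-dir agree with each other.
# Assumption: root's home is /root, everyone else's is /home/<user>.
check_unit_consistency() {
  awk '
    /^User=/             { user = substr($0, 6) }    # text after "User="
    /^Environment=HOME=/ { home = substr($0, 18) }   # text after "Environment=HOME="
    /^ExecStart=/ && match($0, /--data-dir [^ ]+/) {
      # skip the 11 characters of "--data-dir " to keep only the path
      datadir = substr($0, RSTART + 11, RLENGTH - 11)
    }
    END {
      expected = (user == "root") ? "/root" : "/home/" user
      bad = 0
      if (home != expected) {
        print "HOME=" home " does not match User=" user
        bad = 1
      }
      if (index(datadir "/", expected "/") != 1) {
        print "--data-dir " datadir " is outside " expected
        bad = 1
      }
      exit bad
    }
  '
}
```

Run as `check_unit_consistency < /etc/systemd/system/kimaki.service`; a non-zero exit status means at least one of the three settings disagrees with the others.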
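Live cascade damage from running as root (#93 evidence)
Root-owned files left behind by the runtime (191 wp-content + 147 uploads) must be re-chowned as part of the migration.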
Critical gotcha the migration MUST handle (else it just re-creates #198)
/home/opencode is mode 0750, so www-data cannot traverse it:
$ sudo -u www-data test -x /home/opencode && echo ok || echo BLOCKED
BLOCKED
$ sudo -u www-data test -r /home/opencode/.kimaki/bin/kimaki && echo ok || echo BLOCKED
BLOCKED
This means a naive User=root → User=opencode flip that registers /home/opencode/.kimaki/bin/kimaki as the CLI-dispatch command would reproduce exactly the #198 spawn failure at a new path — www-data still can't traverse /home/opencode (0750). The migration is only complete if the dispatch command resolves to a system-PATH binary every relevant user can reach (/usr/bin/kimaki, the npm-global symlink — already present on this host and confirmed www-data-executable in #198).
This is why #198 option (a) and this migration are complementary, not duplicate: (a) ensures the registered command is system-reachable regardless of service user; this migration removes the root-write cascade. Land (a) first, then this.
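The traversal problem can also be spotted without sudo by inspecting mode bits. This is a rough sketch under stated assumptions: `world_traversable` and `report_blocked_ancestors` are hypothetical names, GNU `stat -c` is assumed (as on a typical Linux VPS), and only the world-execute bit is checked, so it approximates a user who is neither the owner nor in the directory's group. The `sudo -u www-data test -x` probes above remain the authoritative check.

```shell
# Hypothetical helper: succeed if directory $1 has the world-execute bit set.
# Assumption: GNU stat; the last octal digit of %a is the "other" permission set.
world_traversable() {
  mode=$(stat -c '%a' -- "$1") || return 2
  last=${mode#"${mode%?}"}        # keep only the final digit, e.g. 750 -> 0
  [ $(( last & 1 )) -eq 1 ]
}

# Hypothetical helper: walk up from the file at $1 and print every ancestor
# directory (excluding /) that an unrelated user could not traverse.
report_blocked_ancestors() {
  dir=$(dirname -- "$1")
  while [ "$dir" != "/" ]; do
    world_traversable "$dir" || printf 'BLOCKED %s\n' "$dir"
    dir=$(dirname -- "$dir")
  done
}
```

Calling `report_blocked_ancestors /home/opencode/.kimaki/bin/kimaki` on a host where /home/opencode is 0750 should flag that directory, which is the reason the dispatch command has to resolve to a system-prefix binary instead.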
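Proposed migration steps (live VPS)
1. Pre-flight: opencode ∈ www-data (done); /home/opencode and /home/opencode/.kimaki exist (done). Stop kimaki.service.
2. Rewrite the kimaki.service unit:
   - User=opencode
   - Environment=HOME=/home/opencode
   - --data-dir /home/opencode/.kimaki
   - ExecStart binary → /usr/bin/kimaki (or which kimaki resolved to the system prefix, not $SERVICE_HOME/.kimaki/bin)
   - UMask=0002 so opencode's writes are group-writable by www-data (ties in #125).
3. Re-register the CLI channel so the dispatch command is /usr/bin/kimaki (the #198 (a) resolver fix; verify the mu-plugin block updates).
4. Re-own the existing root-owned files: chown -R www-data:www-data wp-content/uploads (and the 191 wp-content + 147 uploads root-owned files), or more surgically find wp-content -user root -exec chown www-data:www-data {} +.
5. Confirm the wp-coding-agents-*.php mu-plugins end up www-data-readable (they are, at 0644 per #133, but should be re-chowned for hygiene).
6. Confirm wp no longer needs --allow-root in the agent context, and that data-machine-code's datamachine_code_resolve_wp_cli_cmd() stops baking --allow-root into AGENTS.md (it keys on posix_geteuid() === 0). Running as opencode makes that detection correctly drop --allow-root, ending the loop described in #93 step 2.
Verification
- systemctl status kimaki → active as opencode.
- Trigger a dispatch flow manually (EC Agent Progress Ping) AND wait for a scheduled fire; both should complete (previously only manual-as-root worked, see #198).
- sudo -u www-data /usr/bin/kimaki --version works (already confirmed).
- New agent file writes land opencode:opencode or www-data-group-writable, not root:root.
- A subsequent wp plugin update / WP auto-update no longer hits copy_failed_*.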
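The unit changes this migration calls for (service user, system-prefix binary, data directory, UMask=0002) can be sketched as a filter over the existing unit text. This is an illustration only: `rewrite_kimaki_unit` is a hypothetical name, the sketch assumes the unit already has a `User=` line and a single `ExecStart=` line, and it leaves every other argument on `ExecStart=` untouched because the full command line is not shown in this issue.

```shell
# Hypothetical helper: read the current kimaki.service on stdin and print a
# migrated version on stdout. Nothing is written in place.
# Assumptions: a User= line exists; ExecStart= is one line with --data-dir.
rewrite_kimaki_unit() {
  awk '
    /^User=/ {
      print "User=opencode"
      print "UMask=0002"            # keep writes group-writable for www-data
      next
    }
    /^UMask=/ { next }              # drop any older UMask so only one remains
    /^ExecStart=/ {
      # swap the binary for the system-prefix one, keep all other arguments
      sub(/^ExecStart=[^ ]+/, "ExecStart=/usr/bin/kimaki")
      sub(/--data-dir [^ ]+/, "--data-dir /home/opencode/.kimaki")
    }
    { print }
  '
}
```

A cautious way to use it is to write the result to a scratch file, for example `rewrite_kimaki_unit < /etc/systemd/system/kimaki.service > /tmp/kimaki.service.new`, diff it against the live unit, and only then install it and run `systemctl daemon-reload`.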
Installer-side durability (so this doesn't regress)
The live remediation above is one-time, but the installer should stop offering/defaulting to RUN_AS_ROOT on VPS, or at minimum:
- lib/detect.sh: make the opencode non-root branch the VPS default; treat RUN_AS_ROOT=true as an explicit opt-in footgun with a loud warning (it currently triggers whenever the elif is reached).
- Ensure _kimaki_find_native_binary / _kimaki_register_cli_channel (#198 (a)) never register a path under a 0700/0750 home; pick a system-prefix binary.
- Set UMask=0002 on the kimaki + php-fpm units (#125).
- lib/infrastructure.sh already does chown -R www-data:www-data $SITE_PATH and useradd -G www-data opencode for the non-root path. That is the correct durable baseline; the bug is only that this VPS took the root branch.
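One way the installer could avoid registering a binary under a private home is to filter candidates by prefix before checking executability. The sketch below is an assumption about shape, not the existing resolver: `pick_system_binary` is a hypothetical name, and treating everything under /root and /home as off-limits is a deliberately blunt stand-in for checking the 0700/0750 modes directly.

```shell
# Hypothetical helper: print the first candidate path that is executable and
# does not live under a per-user home. Returns 1 if no candidate qualifies.
# Assumption: /root/* and /home/* are never acceptable dispatch targets.
pick_system_binary() {
  for candidate in "$@"; do
    case $candidate in
      /root/*|/home/*) continue ;;   # likely untraversable for www-data
    esac
    if [ -x "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}
```

For example, `pick_system_binary /root/.kimaki/bin/kimaki /home/opencode/.kimaki/bin/kimaki /usr/bin/kimaki` skips the first two candidates regardless of whether they exist and selects /usr/bin/kimaki only if it is present and executable.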
Why NOT "make root safe instead"
Keeping User=root and trying to make it WordPress-safe requires chmod 0701 /root (defeats /root privacy), UMask=0002 everywhere, and forcing wp/agent writes to drop to www-data — strictly worse than running as the opencode user that is already built and already group-aligned. The platform's own design intent (the lib/detect.sh default branch) is a dedicated non-root service user. This issue just makes the live box match that intent.
Cross-references
- UMask=0002 on php-fpm + kimaki services (folded into step 2/installer durability).