
migrate live extrachill.com off RUN_AS_ROOT to the opencode service user (durable fix for #93 + #198 root-ownership cascade) #199

@chubes4

Summary

This is the concrete, live-VPS follow-through for #93 (and the durable fix that makes #198 moot): migrate the running extrachill.com install off RUN_AS_ROOT (kimaki.service User=root) to the dedicated opencode service user that is already provisioned and already a member of the www-data group.

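Running the agent runtime as root is the root cause of two distinct, recurring breakages: the root-ownership cascade tracked in #93 (kimaki.service runs as root, every wp-cli call inherits root, every file written is root-owned) and the CLI-dispatch spawn failure tracked in #198 (the agent CLI binary registered under /root is unreachable by www-data WP-cron, causing recurring datamachine_code_cli_dispatch_spawn_failed).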

Both vanish if the runtime runs as opencode (group www-data) with state in a www-data-traversable location. The installer already defines this path (lib/detect.sh else-branch) but this box was installed with RUN_AS_ROOT=true and never migrated.

Current live state (extrachill.com, verified)

The opencode user already exists and is correctly grouped — the migration is half-done and inconsistent:

$ id opencode
uid=1000(opencode) gid=1000(opencode) groups=1000(opencode),4(adm),33(www-data)   # already in www-data ✓

$ grep -E '^User=|HOME=|--data-dir' /etc/systemd/system/kimaki.service
User=root                                  # <-- still root (the trap)
Environment=HOME=/home/opencode            # <-- but HOME points at opencode (!)
ExecStart=/root/.kimaki/bin/kimaki --data-dir /root/.kimaki ...   # <-- and data-dir is /root

$ ls -ld /home/opencode /home/opencode/.kimaki
drwxr-x--- opencode opencode  /home/opencode            # 0750
drwxr-xr-x opencode opencode  /home/opencode/.kimaki    # already exists, opencode-owned

So the unit is a Frankenstein: runs as root, HOME says opencode, data-dir says /root. The opencode user, its home, and /home/opencode/.kimaki are all already provisioned — only User= and the --data-dir/binary path are wrong.
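The inconsistency above can be checked mechanically. The following is a minimal sketch, assuming the unit keeps User=, Environment=HOME=, and --data-dir each on a single line, as in the grep output. audit_unit is a hypothetical helper name, not something the installer ships.

```shell
#!/usr/bin/env bash
# audit_unit UNIT_FILE
# Hypothetical helper: prints the identity-related settings of a unit file
# and warns about the mismatches described above. Returns 1 if any are found.
audit_unit() {
  local unit=$1 user home data_dir rc=0

  # Take the last occurrence of each setting in the file.
  user=$(sed -n 's/^User=//p' "$unit" | tail -n 1)
  home=$(sed -n 's/^Environment=.*HOME=\([^[:space:]"]*\).*/\1/p' "$unit" | tail -n 1)
  data_dir=$(sed -n 's/^ExecStart=.*--data-dir[= ]\([^[:space:]]*\).*/\1/p' "$unit" | tail -n 1)

  printf 'user=%s home=%s data_dir=%s\n' "$user" "$home" "$data_dir"

  # A system service with no User= line runs as root.
  if [ -z "$user" ] || [ "$user" = root ]; then
    echo "WARN: unit runs as root"
    rc=1
  fi

  # The data dir is expected to live under the configured HOME.
  case $data_dir in
    "$home"/*) ;;
    *) echo "WARN: data-dir is not under HOME"; rc=1 ;;
  esac

  return "$rc"
}

# Example (on the live box):
#   audit_unit /etc/systemd/system/kimaki.service
```

Against the unit shown above, this would print both warnings. After a consistent migration it should print only the summary line and return 0.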

Live cascade damage from running as root (#93 evidence)

$ find wp-content     -user root | wc -l    # 191
$ find wp-content/uploads -user root | wc -l    # 147
$ find wp-admin wp-includes -user root | wc -l    #   0   (core is clean — good)

Sample root-owned files (must be re-chowned as part of migration):

wp-content/mu-plugins/wp-coding-agents-channels.php
wp-content/mu-plugins/wp-coding-agents-runtimes.php
wp-content/uploads/sites/7/datamachine-files/pipeline-32/flow-732
wp-content/uploads/2026/05/IMG_1655-scaled-1-1196x1536.jpg
wp-content/uploads/sites/12/backup/...
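The one-time re-chown could be sketched as below. This is an illustration under assumptions: rechown_from is a hypothetical helper, and the target owner is left as a parameter because the right owner (for example www-data:www-data versus opencode:www-data) is a decision this issue does not spell out. It lists matches by default and only changes ownership when --apply is passed.

```shell
#!/usr/bin/env bash
# rechown_from OLD_USER NEW_OWNER DIR [--apply]
# Hypothetical helper. Without --apply it only lists what would change.
# OLD_USER may be a name or a numeric UID; NEW_OWNER is passed to chown as-is
# (so "user" and "user:group" both work).
rechown_from() {
  local old=$1 new=$2 dir=$3 mode=${4:-}

  if [ "$mode" = --apply ]; then
    # -h changes a symlink itself rather than the file it points at.
    find "$dir" -user "$old" -exec chown -h "$new" {} +
  else
    find "$dir" -user "$old" -print
  fi
}

# Example (run from the WordPress root; the owner here is an assumption):
#   rechown_from root www-data:www-data wp-content            # dry run
#   rechown_from root www-data:www-data wp-content --apply    # needs root
#   find wp-content -user root | wc -l                        # expect 0 afterwards
```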

Critical gotcha the migration MUST handle (else it just re-creates #198)

/home/opencode is mode 0750, so www-data cannot traverse it:

$ sudo -u www-data test -x /home/opencode && echo ok || echo BLOCKED
BLOCKED
$ sudo -u www-data test -r /home/opencode/.kimaki/bin/kimaki && echo ok || echo BLOCKED
BLOCKED

This means a naive User=root → User=opencode flip that registers /home/opencode/.kimaki/bin/kimaki as the CLI-dispatch command would reproduce exactly the #198 spawn failure at a new path: www-data still can't traverse /home/opencode (0750). The migration is only complete if the dispatch command resolves to a system-PATH binary every relevant user can reach: /usr/bin/kimaki, the npm-global symlink, which is already present on this host and was confirmed www-data-executable in #198.

This is why #198 option (a) and this migration are complementary, not duplicate: (a) ensures the registered command is system-reachable regardless of service user; this migration removes the root-write cascade. Land (a) first, then this.
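To see where a candidate dispatch path gets blocked, one can walk its parent directories and look for a missing search bit. The sketch below is a heuristic, not a full permission check: it only inspects the "others" execute bit, which is what matters for a user such as www-data that is neither the owner nor in the owning group of /home/opencode. non_traversable_dirs is a hypothetical helper name; it relies on GNU stat and realpath.

```shell
#!/usr/bin/env bash
# non_traversable_dirs PATH
# Hypothetical helper: prints every ancestor directory of PATH whose mode
# lacks the "others" execute (search) bit. Prints nothing if all are open.
non_traversable_dirs() {
  local target dir mode

  # Resolve symlinks so the real location of the binary is checked.
  target=$(realpath -- "$1") || return 2
  dir=$(dirname -- "$target")

  while :; do
    mode=$(stat -c '%a' -- "$dir") || return 2
    # The last octal digit holds the "others" bits; odd means execute is set.
    case $mode in
      *[1357]) ;;
      *) printf '%s\n' "$dir" ;;
    esac
    [ "$dir" = / ] && break
    dir=$(dirname -- "$dir")
  done
}

# Example (needs enough privilege to stat each directory):
#   non_traversable_dirs /home/opencode/.kimaki/bin/kimaki
#   non_traversable_dirs /usr/bin/kimaki
```

With /home/opencode at 0750 as shown above, the first call would be expected to list /home/opencode. The `sudo -u www-data test -x` probes remain the authoritative check, since they exercise the real user and group memberships.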

Proposed migration steps (live VPS)

  1. Pre-flight: confirm opencode is in www-data (done), /home/opencode + /home/opencode/.kimaki exist (done). Stop kimaki.service.
  2. Rewrite the unit to the canonical non-root template the installer already knows how to emit:
  3. Re-register the CLI channel so the dispatch command is /usr/bin/kimaki (the option (a) resolver fix from #198, "agent CLI binary registered under /root is unreachable by www-data WP-cron, causing recurring datamachine_code_cli_dispatch_spawn_failed"; verify that the mu-plugin block updates).
  4. Remediate existing root-owned files (one-time):
  5. Confirm wp no longer needs --allow-root in the agent context, and that data-machine-code's datamachine_code_resolve_wp_cli_cmd() stops baking --allow-root into AGENTS.md (it keys on posix_geteuid() === 0). Running as opencode makes that detection correctly drop --allow-root, ending the loop described in step 2 of #93 ("RUN_AS_ROOT path leads to root-owned files cascading through site": kimaki.service runs as root, every wp-cli inherits root, every file written is root-owned).
  6. Verify:
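For step 2, a rough sketch of the unit rewrite follows. It is not the installer's canonical template (that lives in lib/detect.sh and is not reproduced here); it only patches the three values identified above as wrong. rewrite_unit is a hypothetical helper, and using /usr/bin/kimaki as the ExecStart binary is an assumption based on that path being the system-reachable one.

```shell
#!/usr/bin/env bash
# rewrite_unit SRC DST USER DATA_DIR BIN
# Hypothetical helper: writes a copy of SRC to DST with User=, the ExecStart
# binary, and the --data-dir value replaced. SRC itself is left untouched.
# Assumes none of the values contain "|" or "&" (special to sed here).
rewrite_unit() {
  local src=$1 dst=$2 user=$3 data_dir=$4 bin=$5

  sed -E \
    -e "s|^User=.*|User=${user}|" \
    -e "s|^(ExecStart=)[^[:space:]]+|\1${bin}|" \
    -e "s|(--data-dir[= ])[^[:space:]]+|\1${data_dir}|" \
    "$src" > "$dst"
}

# Example (review the diff before installing anything):
#   rewrite_unit /etc/systemd/system/kimaki.service /tmp/kimaki.service.new \
#     opencode /home/opencode/.kimaki /usr/bin/kimaki
#   diff -u /etc/systemd/system/kimaki.service /tmp/kimaki.service.new
#   # after installing the new file:
#   #   systemctl daemon-reload && systemctl start kimaki.service
```

Writing to a separate file keeps the running unit intact until the diff has been reviewed, which matters on a live box.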

Installer-side durability (so this doesn't regress)

The live remediation above is one-time, but the installer should stop offering/defaulting to RUN_AS_ROOT on VPS, or at minimum:

Why NOT "make root safe instead"

Keeping User=root and trying to make it WordPress-safe requires chmod 0701 /root (defeats /root privacy), UMask=0002 everywhere, and forcing wp/agent writes to drop to www-data — strictly worse than running as the opencode user that is already built and already group-aligned. The platform's own design intent (the lib/detect.sh default branch) is a dedicated non-root service user. This issue just makes the live box match that intent.

Cross-references
