Add daemon process watchdog with crash rate-limiting by yunsmall · Pull Request #770 · JingMatrix/Vector

yunsmall · 2026-06-23T19:14:13Z

When the Vector daemon process crashes, all existing Binder connections
are permanently severed:

- The manager app calls System.exit(0) in its DeathRecipient.
- Injected app processes clear their VectorServiceClient service
  reference and never attempt to reconnect.
- Newly forked apps fail to obtain the IPC binder in postAppSpecialize
  and skip Xposed injection entirely.
- Only system_server can recover, because VectorDaemon re-injects
  into it on startup.

However, all persistent state (module configs, scopes, preferences, logs)
lives on disk and survives the crash. If the daemon is restarted, new
processes can once again get a fresh binder and be injected — so an
automatic restart still restores meaningful functionality.

Changes

`zygisk/module/service.sh`

Wrap the unshare launch in a rate-limited watchdog loop:

- If the daemon crashes more than 3 times within 60 seconds, the
  watchdog enters a 3-minute cooldown before retrying, then resets
  the counter. This prevents tight crash-loops from burning CPU.
- Normal (infrequent) crashes use a 5-second restart delay.
- The entire loop runs in the background (done &), so Magisk's
  late_start service handler returns immediately and is never blocked.
- Each iteration calls unshare -m, creating a fresh mount namespace;
  stale mounts from a previous crashed daemon do not accumulate.
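
The rate-limiting rules above can be sketched roughly as follows. This is a minimal illustration, not the script from this PR: the names `on_crash` and `watchdog`, the variable names, and the fixed-window bookkeeping are assumptions. Only the thresholds (3 crashes, 60 seconds, 3-minute cooldown, 5-second delay) come from the description.

```shell
# Hypothetical sketch of a rate-limited restart policy (names are assumptions).
MAX_CRASHES=3      # crashes tolerated inside one window
WINDOW=60          # window length in seconds
COOLDOWN=180       # 3-minute cooldown after a crash burst
RESTART_DELAY=5    # normal delay before restarting

crash_count=0
window_start=0

# Record a crash at time $1 (epoch seconds) and set $delay to the
# number of seconds to wait before the next restart.
on_crash() {
    now=$1
    # Open a new counting window once the previous one has expired.
    if [ $((now - window_start)) -gt "$WINDOW" ]; then
        window_start=$now
        crash_count=0
    fi
    crash_count=$((crash_count + 1))
    if [ "$crash_count" -gt "$MAX_CRASHES" ]; then
        # Too many crashes in the window: cool down, then start counting afresh.
        delay=$COOLDOWN
        crash_count=0
        window_start=$((now + COOLDOWN))
    else
        delay=$RESTART_DELAY
    fi
}

# Run the given command forever, restarting it according to on_crash.
watchdog() {
    while true; do
        "$@"                      # blocks until the command exits
        on_crash "$(date +%s)"
        sleep "$delay"
    done
}
```

Under these assumptions, service.sh would call something like `watchdog unshare -m <daemon launch command> &`, where the launch command is whatever the script already runs. The trailing `&` keeps the loop in the background so the caller returns immediately.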

`zygisk/module/daemon`

Drop the exec keyword from the app_process invocation. Without
exec, the app_process process runs as a child of the daemon script.
When it exits, control returns to the shell script, which finishes,
allowing the watchdog loop in service.sh to iterate. (Previously
exec replaced the shell process, so when the daemon died there was
nothing left to restart it.)
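
The general shell behavior this change relies on can be seen in a small, self-contained demonstration. It does not use app_process or the actual daemon script; the inner `sh -c "exit 7"` is a stand-in for a process that dies, and the variable names are illustrative.

```shell
# Without exec: the child runs, exits, and the wrapper shell carries on.
without_exec=$(sh -c 'sh -c "exit 7"; echo "resumed, child status $?"')

# With exec: the wrapper shell is replaced by the child, so the echo
# after it never runs and the wrapper's exit status is the child's.
exec_status=0
with_exec=$(sh -c 'exec sh -c "exit 7"; echo "never printed"') || exec_status=$?

echo "without exec: $without_exec"
echo "with exec:    ${with_exec:-<no output>} (status $exec_status)"
```

The first case prints the "resumed" line because control returns to the wrapper. The second produces no output at all, since nothing of the wrapper remains once `exec` has run.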

Compatibility

No effect on the daemon's normal operation path. When the daemon does
not crash, the watchdog loop is parked at unshare waiting for it to
exit — identical to the original behavior.

When the daemon crashes, all existing app connections are permanently lost (Binder death triggers service reference clears and manager suicide). However, persistent state lives on disk and new processes can still get a fresh binder, so restarting the daemon still recovers meaningful functionality.

- service.sh: wrap the unshare launch in a rate-limited watchdog loop. If the daemon crashes more than 3 times within 60 seconds, the watchdog enters a 3-minute cooldown before retrying and resets the counter. Normal restarts use a 5-second delay. The loop runs in background (&) so Magisk late_start is never blocked.
- daemon: drop 'exec' so that when app_process exits, control returns to the daemon script, which finishes, allowing the watchdog in service.sh to iterate.

yunsmall wants to merge 1 commit into JingMatrix:master from yunsmall:daemon-watchdog-restart
