Skip to content

Wizard: proactively surface stale pre-MAL-57 VHID daemon plist with one-click repair#904

Merged
malpern merged 2 commits into
masterfrom
claude/sweet-bhabha-b62002
Jun 11, 2026
Merged

Wizard: proactively surface stale pre-MAL-57 VHID daemon plist with one-click repair#904
malpern merged 2 commits into
masterfrom
claude/sweet-bhabha-b62002

Conversation

@malpern

@malpern malpern commented Jun 11, 2026

Copy link
Copy Markdown
Owner

Summary

ServiceHealthChecker.isVHIDDaemonConfiguredCorrectly() (ProgramArguments path + ProcessType=Interactive, the MAL-57 starvation fix) was only consumed by the repair postflight, where it trivially passes because repair just rewrote the plist. Old installs with a pre-fix plist never migrated unless repair ran for some other reason, leaving them vulnerable to the MAL-57 stuck-key autorepeat under CPU load.

This wires plist-content validation into the wizard's health model:

  • ServiceHealthChecker.isVHIDDaemonPlistPresentButMisconfigured() — single-read, presence-gated content check (shared with isVHIDDaemonConfiguredCorrectly via one content validator using PlistGenerator.vhidDaemonPath). A missing plist stays attributed to service health ("not installed"); an existing-but-stale or unparseable plist counts as misconfigured.
  • ComponentStatus.vhidDaemonPlistMisconfigured — new fact, populated by SystemValidator inside the 15s-TTL component-facts cache (no per-cycle main-thread plist I/O; invalidateCaches() refreshes it after repairs). Deliberately excluded from hasAllRequired (documented): a stale plist still runs day-to-day.
  • SystemInspector emits the previously dormant .component(.vhidDaemonMisconfigured) issue — severity .warning (nudges migration without flipping the app to a failed state; wizard routing and the fix button are identifier-based and severity-agnostic), gated on otherwise-healthy services so it doesn't double-alarm with "VHID Services Unhealthy". Auto-fix: .installRequiredRuntimeServices. Existing consumers (KarabinerComponentsStatusEvaluator, WizardRouter, components-page repair) light up unchanged.
  • ActionDeterminer repair/install plans include the rewrite via a new shared ComponentStatus.vhidRuntimeServicesNeedRepair, so the trigger can't drift between repair and install.
  • Stale-helper guard (review finding, confirmed): a resident pre-fix helper rewrites the old plist and reports success, which would loop the fix forever. PrivilegedOperationsRouter.installRequiredRuntimeServices() now verifies the plist after the helper path and falls back to the in-app install (current PlistGenerator) so the repair converges.

Review gate findings addressed; notably severity was downgraded from .error after verifying MainAppStateController counts .installation+.error issues as blocking (would have shown an app-wide red/failed state for working installs).

Test plan

  • New ServiceHealthCheckerTests: misconfigured check for missing / correct / no-ProcessType / wrong-path / corrupt plists (hermetic via KEYPATH_LAUNCH_DAEMONS_DIR); fixtures now default to PlistGenerator.vhidDaemonPath; singleton health cache cleared in setUp (fixes a cross-suite flake).
  • New WizardPureLogicTests: inspector emits the issue (.installation, .warning, auto-fix, state stays .active); suppressed while services are unhealthy (single alarm, same fix); ActionDeterminer includes the rewrite when only the plist is stale.
  • Full safe gate: 4,378 tests pass.

Known coverage gap: the stale-helper guard in PrivilegedOperationsRouter.installRequiredRuntimeServices() (helper reports success → plist still stale → in-app sudo rewrite) has no unit test — it requires mocking a succeeding HelperManager alongside the injected ServiceHealthChecker. The pure-logic detection, issue gating, and action triggers around it are covered; cache freshness after the fallback is handled by MainAppStateController.invalidateValidationCooldown() on wizard close (15s TTL worst case mid-wizard).

@claude

claude Bot commented Jun 11, 2026

Copy link
Copy Markdown

Code Review — PR #904

Overall this is a well-scoped, well-reasoned PR. The MAL-57 migration path is carefully gated (severity, cache placement, double-alarm suppression) and the stale-helper guard closes a real loop that would otherwise repeat forever. A few findings below.


Bug: Foundation.FileManager() should be FileManager.default

ServiceHealthChecker.swiftisVHIDDaemonPlistPresentButMisconfigured()

return Foundation.FileManager().fileExists(atPath: plistPath)

This allocates a new FileManager instance instead of using the shared FileManager.default. The behavior is the same, but it's inconsistent with the rest of the codebase and wastes an allocation on every call to the fallback branch. Should be:

return FileManager.default.fileExists(atPath: plistPath)

Missing test: determineInstallActions with misconfigured plist

ActionDeterminer line 117 was also switched from !context.components.vhidServicesHealthy to context.components.vhidRuntimeServicesNeedRepair, but the only new ActionDeterminer test covers the repair path. The install path deserves a parallel test:

func test_determineInstallActions_vhidDaemonPlistMisconfigured_includesInstallRuntimeServices() {
    let components = ComponentStatus(
        kanataBinaryInstalled: true,
        karabinerDriverInstalled: true,
        karabinerDaemonRunning: true,
        vhidDeviceInstalled: true,
        vhidDeviceHealthy: true,
        vhidServicesHealthy: true,
        vhidDaemonPlistMisconfigured: true,
        vhidVersionMismatch: false
    )
    let context = makeContext(components: components)
    let actions = ActionDeterminer.determineInstallActions(context: context)
    XCTAssertTrue(actions.contains(.installRequiredRuntimeServices))
}

The repair path and install path changing together without a matching install-path test is the exact drift scenario vhidRuntimeServicesNeedRepair was introduced to prevent.


Nit: single-line do-catch compression

The reformatted do { ... } catch { ... } single-liners in PrivilegedOperationsRouter (e.g. installNewsyslogConfig, killAllKanataProcesses, etc.) are harder to read than the original two-line form, especially for the multi-clause catch is CancellationError case. Readability will suffer when these need future editing. Not a blocker, but the multi-line form was better.


Positive notes

  • vhidRuntimeServicesNeedRepair is the right abstraction — a single computed property that both the repair and install paths reference guarantees they stay in sync.
  • Severity reasoning is solid: .warning + identity-based routing means the wizard repair button works without turning the app red for a functional-but-stale install. The comment in SystemInspector documenting why .error was rejected is worth keeping.
  • Cache placement: putting the plist read inside ComponentInstallationFacts (15s TTL, invalidateCaches() post-repair) is exactly right — plist I/O doesn't land on the validation hot-path.
  • Stale-helper guard: falling back to the in-app sudoInstallRequiredRuntimeServices() after detecting a still-misconfigured plist is a clean convergence guarantee. The comment is also clear about why this can happen.
  • Test isolation fix: checker.invalidateHealthCache() in setUp is a good catch — cross-suite cache pollution with different launch-daemons dirs is exactly the kind of flake that's hard to debug later.
  • Test coverage: the five isVHIDDaemonPlistPresentButMisconfigured cases (missing, correct, no ProcessType, wrong path, corrupt) cover the full decision surface.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ca4c472c53

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

AppLogger.shared.warn(
"⚠️ [PrivilegedOperationsRouter] VHID daemon plist still misconfigured after helper install — stale helper template, retrying via in-app install"
)
try await sudoInstallRequiredRuntimeServices()

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Bypass the stale helper for the retry

When this guard detects a resident pre-MAL-57 helper, sudoInstallRequiredRuntimeServices() still does not force the in-app plist generator: it calls ServiceBootstrapper.installAllServices(), whose first step is InstallerEngine().runSingleAction(.repairVHIDDaemonServices, using: PrivilegeBroker()), and that routes back through PrivilegedOperationsRouter.repairVHIDDaemonServices() to the same responsive helper. In the exact stale-helper scenario described here, the retry can therefore rewrite the old plist again and return success, leaving ProcessType=Interactive missing and the repair loop unresolved until the helper is restarted or bypassed explicitly.

Useful? React with 👍 / 👎.

@malpern malpern force-pushed the claude/sweet-bhabha-b62002 branch from ca4c472 to 04de9f5 Compare June 11, 2026 18:49
@malpern

malpern commented Jun 11, 2026

Copy link
Copy Markdown
Owner Author

Review addressed (picked up from a parallel session since this PR had stalled at BEHIND):

  • Install-path test added: test_determineInstallActions_vhidDaemonPlistMisconfigured_includesInstallRuntimeServices mirrors the repair-path test, guarding the exact drift vhidRuntimeServicesNeedRepair exists to prevent.
  • FileManager.default used in the new isVHIDDaemonPlistPresentButMisconfigured(). One note for the record: the rest of ServiceHealthChecker consistently uses Foundation.FileManager() (lines 197, 253, 327, 873), so the 'inconsistent with the codebase' premise was inverted — but .default is the better API and the right direction for new code.
  • do-catch nit: left as-is; that formatting came from the pinned swiftformat config, so hand-reverting would churn on the next format pass.

Rebased onto master (now includes #897/#898/#902). WizardPureLogicTests + ServiceHealthCheckerTests green locally (117 tests).

@malpern malpern enabled auto-merge June 11, 2026 18:49
@malpern malpern force-pushed the claude/sweet-bhabha-b62002 branch from 04de9f5 to db59240 Compare June 11, 2026 18:50
@malpern

malpern commented Jun 11, 2026

Copy link
Copy Markdown
Owner Author

Review findings addressed in the updated commit:

  • Codex P1 (stale-helper retry loops back through the helper) — confirmed and fixed. The guard now calls sudoRepairVHIDServices() (ServiceBootstrapper + current PlistGenerator, with its own configured-postflight) instead of sudoInstallRequiredRuntimeServices(), which would have routed back through InstallerEngine → PrivilegeBroker → this router's helper-first repairVHIDDaemonServices. The comment in the code documents the trap.
  • Missing determineInstallActions test — added (test_determineInstallActions_vhidDaemonPlistMisconfigured_includesInstallRuntimeServices).
  • Single-line do-catch compression — reverted; the router keeps master's two-line form (that was lint-autofix churn, not intentional).
  • Foundation.FileManager() vs FileManager.default — kept as is: ServiceHealthChecker.swift consistently uses Foundation.FileManager() (see the test-mode plist-existence checks at ~197/253/327), so the file-local convention is the instance form.

@claude

claude Bot commented Jun 11, 2026

Copy link
Copy Markdown

Code Review — PR #904: Proactive MAL-57 plist migration

Overall this is a well-thought-out change that fills a real gap (existing installs that never hit a repair never migrated). The design decisions are sound and the test coverage is thorough. A few things worth examining before merge:


What's working well

DRY extraction of vhidDaemonPlistContentIsValid (ServiceHealthChecker.swift) — sharing the single content check between isVHIDDaemonConfiguredCorrectly and isVHIDDaemonPlistPresentButMisconfigured via a private static method is exactly right. Previously the path string was duplicated; now PlistGenerator.vhidDaemonPath is the single source.

vhidRuntimeServicesNeedRepair computed property (SystemSnapshot.swift) — unifying !vhidServicesHealthy || vhidDaemonPlistMisconfigured into one named property means repair and install plans can't drift independently. Clean use of the Single Source of Truth pattern called out in CLAUDE.md.

Stale-helper guard design (PrivilegedOperationsRouter.swift:107-115) — the reasoning is solid: a pre-fix helper binary can report success while silently writing the old plist. Verifying the result and falling back to the in-app installer is the right convergence strategy.

Severity .warning vs. .error — the PR description explains why .error would flip the app to a blocked state for healthy installs. The comment in SystemInspector.swift documents this constraint inline, which is appropriate for a non-obvious constraint.

Presence-gating in isVHIDDaemonPlistPresentButMisconfigured — returning false for a missing plist correctly delegates that case to service health. The test for the corrupt-plist case (testVHIDPlistMisconfiguredReturnsTrueForCorruptPlist) covers the important edge case where the file exists but NSDictionary(contentsOfFile:) returns nil.

invalidateHealthCache() in setUp — this fixes a real cross-suite flake where the singleton's TTL cache carries state from other tests using a different launch daemons dir.


Concerns

1. Double filesystem op for corrupt plists (ServiceHealthChecker.swift:912-916)

guard let dict = NSDictionary(contentsOfFile: plistPath) as? [String: Any] else {
    return FileManager.default.fileExists(atPath: plistPath)
}

When the plist exists but is corrupt, NSDictionary(contentsOfFile:) reads the file and fails, then fileExists does a second syscall to check presence. Since this runs inside the 15s component-facts cache it's not a hot path, but the logic can be clarified without the extra call:

guard let dict = NSDictionary(contentsOfFile: plistPath) as? [String: Any] else {
    // Plist missing → false (owned by service health); present but unparseable → true (repair rewrites it)
    return FileManager.default.fileExists(atPath: plistPath)
}

The behavior is correct as written; this is just a minor readability/efficiency note.

2. Component cache may stay stale for up to 15s after the stale-helper fallback (PrivilegedOperationsRouter.swift:107-115)

After sudoInstallRequiredRuntimeServices() succeeds in the stale-helper guard, cachedComponentFacts in SystemValidator still holds vhidDaemonPlistMisconfigured = true until the TTL expires. The PR description mentions invalidateCaches() refreshes it after repairs — if the caller of installRequiredRuntimeServices reliably calls invalidateCaches() afterward, this is fine. But if that call is only on the normal code path (not this fallback), the wizard could briefly show the warning again before the cache refreshes. Worth confirming this is covered end-to-end.

3. Incidental do-catch reformatting in unrelated methods

Several do { } catch { } blocks in PrivilegedOperationsRouter.swift were collapsed from two-line to one-line form (installNewsyslogConfig, repairVHIDDaemonServices, terminateProcess, killAllKanataProcesses, etc.). CLAUDE.md says SwiftFormat 0.61.1 is pinned and master is a formatted fixed-point. If these changes came from running swiftformat with a different version or config they could cause false diff noise on future PRs. If they're intentional style fixes, that's fine — just flagging for awareness.

4. No router-level test for the stale-helper guard path

The WizardPureLogicTests cover the pure-logic detection and action-inclusion paths, which is correct. But the stale-helper guard in PrivilegedOperationsRouter (helper reports success → plist still stale → falls back to sudo) has no unit coverage. This is probably acceptable given the helper is hard to mock, but worth a note in the test plan.


Minor notes

  • vhidDaemonPlistMisconfigured: Bool = false default on ComponentStatus.init is the right choice — existing callers that don't supply the field default to "not misconfigured," which is safe.
  • The log in vhidDaemonPlistContentIsValid correctly fires only on invalid content (not on every validation cycle), avoiding log noise.
  • The suppression gate in SystemInspector (vhidServicesHealthy must be true) is tested with test_inspect_vhidDaemonPlistMisconfiguredAndServicesUnhealthy_emitsSingleIssue. Good coverage of the single-alarm invariant.

Bottom line: The design is solid and the test coverage is good. Items 2 (cache staleness after stale-helper fallback) and 3 (incidental reformats) are the ones worth double-checking before merge. Items 1 and 4 are informational.

@claude

claude Bot commented Jun 11, 2026

Copy link
Copy Markdown

Code Review

Overview

This PR wires the pre-MAL-57 VHID daemon plist validation into the wizard's health model so existing installs with a stale ProcessType=Interactive-less plist get proactively surfaced with a one-click repair. Previously, isVHIDDaemonConfiguredCorrectly() was only consumed by the repair postflight, meaning old installs never migrated unless repair ran for an unrelated reason.

The change is well-scoped and the reasoning is sound. A few things worth flagging:


Code Quality

Solid decisions:

  • vhidRuntimeServicesNeedRepair computed property as a single source of truth for the installRequiredRuntimeServices trigger is exactly the right design — it ensures repair and install plans can't drift apart. Good.
  • Excluding vhidDaemonPlistMisconfigured from hasAllRequired is correct: a stale plist is a migration nudge, not a blocker. The comment explaining why makes this non-obvious exclusion maintainable.
  • The .warning-not-.error severity decision is well-reasoned: MainAppStateController counts .error issues as blocking, and a running-but-stale daemon shouldn't flip the app to a failed state.
  • Gating the new issue on vhidServicesHealthy to suppress double-alarms is clean — one root cause, one fix button.
  • invalidateHealthCache() in setUp() is a good fix for the cross-suite flake.

Potential Issue: Foundation.FileManager() allocations

isVHIDDaemonPlistPresentButMisconfigured() at line 908 allocates a new FileManager instance inline:

return Foundation.FileManager().fileExists(atPath: plistPath)

This pattern already exists elsewhere in the file (lines 197, 253, 327, 873), so this PR is at least consistent. That said, the conventional approach is FileManager.default, which avoids the allocation and is thread-safe. This is low priority but worth a cleanup pass across the file when touching it next.


Missing Test: Stale-Helper Guard in PrivilegedOperationsRouter

The new stale-helper path in installRequiredRuntimeServices() (helper succeeds → plist still misconfigured → fall back to sudoRepairVHIDServices) has no unit-test coverage. This is understandable — testing it requires mocking helperManager success AND an injected serviceHealthChecker — but the guard is the highest-risk path in this PR (it's the one place where the old infinite-loop scenario could recur if the routing changes). A comment acknowledging this gap in the PR description would be appropriate, or a future test tracking item.


Minor: Log Level Consistency

In PrivilegedOperationsRouter, the stale-helper fallback uses AppLogger.shared.warn(...). The other call sites in the file generally use .log(...). Consistent use of .warn here is actually correct and intentional (it's a degraded but recovered state), but it's worth confirming it surfaces appropriately in whatever log viewer is used for debugging.


Logic Verification

The fallback routing (sudoRepairVHIDServices() not sudoInstallRequiredRuntimeServices()) is correct and the inline comment explains why — routing back through sudoInstallRequiredRuntimeServices would re-enter the helper-first path and loop. sudoRepairVHIDServices() calls ServiceBootstrapper.shared.repairVHIDDaemonServices() directly, bypassing the helper. This is the right call.


Test Coverage

Coverage for the new isVHIDDaemonPlistPresentButMisconfigured() is comprehensive:

  • Missing plist → false
  • Correct plist → false
  • Present, wrong ProcessTypetrue
  • Present, wrong program path → true
  • Present but unparseable → true

Wizard logic tests cover the inspector and action determiner paths well.


Summary

The design is correct and the implementation is clean. The two things I'd flag before merge:

  1. Stale-helper guard has no test — worth a tracking note even if it's hard to unit-test now.
  2. Foundation.FileManager() allocation pattern — not introduced by this PR, but worth noting as a future cleanup.

Everything else looks good. The vhidRuntimeServicesNeedRepair consolidation is a genuine improvement to the invariant surface of the codebase.

@malpern malpern force-pushed the claude/sweet-bhabha-b62002 branch from db59240 to 967b6af Compare June 11, 2026 18:57
…ne-click repair

isVHIDDaemonConfiguredCorrectly() (ProgramArguments path + ProcessType=
Interactive, the MAL-57 starvation fix) was only consumed by the repair
postflight, where it trivially passes because repair just rewrote the
plist. Old installs with a pre-fix plist never migrated unless repair
ran for some other reason.

Wire plist-content validation into the wizard's health model:

- ServiceHealthChecker.isVHIDDaemonPlistPresentButMisconfigured():
  presence-gated content check — a missing plist stays attributed to
  service health ("not installed"), only an existing-but-stale plist
  counts as misconfigured
- ComponentStatus.vhidDaemonPlistMisconfigured (defaulted init param,
  existing construction sites unchanged), populated by
  SystemValidator.checkComponents()
- SystemInspector emits the previously dormant
  .component(.vhidDaemonMisconfigured) issue (category .installation,
  autoFix .installRequiredRuntimeServices) — already consumed by
  KarabinerComponentsStatusEvaluator, WizardRouter, and the Karabiner
  components page repair flow
- ActionDeterminer includes .installRequiredRuntimeServices in repair/
  install plans when the plist is stale even if the daemon is healthy

Health semantics (vhidDaemonServiceHealthy, test-env plist-existence
path) are untouched; state stays .active while the issue drives routing
per the issue-driven WizardRouter, honoring ADR-026 ordering.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@malpern malpern force-pushed the claude/sweet-bhabha-b62002 branch from 967b6af to 7bf7f82 Compare June 11, 2026 18:58
@claude

claude Bot commented Jun 11, 2026

Copy link
Copy Markdown

PR Review — Wizard: proactively surface stale pre-MAL-57 VHID daemon plist

Overall this is a well-structured PR. The motivation is clear, the severity choice is correct, the double-alarm prevention is thoughtful, and vhidRuntimeServicesNeedRepair as a single computed property prevents the repair/install trigger from drifting. Good work. A few things worth looking at:


Bug: Foundation.FileManager() should be FileManager.default

ServiceHealthChecker.swiftisVHIDDaemonPlistPresentButMisconfigured()

guard let dict = NSDictionary(contentsOfFile: plistPath) as? [String: Any] else {
    return Foundation.FileManager().fileExists(atPath: plistPath)
}

Foundation.FileManager() allocates a new FileManager instance instead of using the shared singleton. This is harmless in practice (FileManager is stateless for path checks), but it is an unusual pattern that suggests either a naming collision or a copy-paste artefact. If there is a custom FileManager type in scope that motivated the Foundation. prefix, that is worth a comment. Otherwise, please use FileManager.default.


Acknowledged gap: stale-helper guard is untested

The PR description is honest about this, and the pure-logic coverage around it is solid. For what it is worth, the guard is already parameterized by serviceHealthChecker — that injection point is all you need for a unit test:

// Pseudocode
let mockChecker = MockServiceHealthChecker(isVHIDPlistMisconfigured: true)
let router = PrivilegedOperationsRouter(serviceHealthChecker: mockChecker, ...)
// stub helperManager to succeed, verify sudoRepairVHIDServices() is called

If the helperManager mock complexity is the blocker, a narrow integration test against a temp KEYPATH_LAUNCH_DAEMONS_DIR (same pattern as ServiceHealthCheckerTests) would cover the convergence property. Not a blocker for this PR, but worth a follow-up ticket.


Question: post-repair cache invalidation in the stale-helper path

After sudoRepairVHIDServices() runs inside the stale-helper guard, the 15s-TTL ComponentInstallationFacts in SystemValidator still holds vhidDaemonPlistMisconfigured = true. The PR description says the TTL is handled by MainAppStateController.invalidateValidationCooldown() on wizard close.

Two edge cases:

  1. User stays in the wizard. If they click Repair, it succeeds (stale-helper path), but the wizard UI may keep showing the misconfigured issue for up to 15 seconds. Is SystemValidator.invalidateCaches() called anywhere in the repair postflight for the helper path? Or is the assumption that the wizard closes and reopens after repair?

  2. Re-entry. If the user clicks Repair again within the 15s window, the cached fact still says misconfigured, so the repair action is included again. This is idempotent (rewriting the same plist twice is harmless), but it would be good to confirm the router handles that gracefully.

Neither is a correctness bug, but the first could be a confusing UX.


Nit: sudoRepairVHIDServices() vs sudoInstallRequiredRuntimeServices() in the guard

try await helperManager.installRequiredRuntimeServices()
if serviceHealthChecker.isVHIDDaemonPlistPresentButMisconfigured() {
    // ...
    try await sudoRepairVHIDServices()   // throws propagate up, no fallback
}

The comment correctly explains why sudoInstallRequiredRuntimeServices is off-limits here (loops back through the helper). Just confirming the intent: if sudoRepairVHIDServices() throws, the error propagates to the caller with no further fallback. That seems right — a sudo-repair failure should surface as an error rather than silently swallowed — but it is worth making the comment say so explicitly so a future reader does not add a fallback that re-introduces the loop.


Praise

  • Severity .warning is exactly right here. Good call downgrading from .error; working installs should not see a red/failed app state.
  • Gating the wizard issue on vhidServicesHealthy to prevent double-alarming is clean.
  • vhidRuntimeServicesNeedRepair as the single source of truth for ActionDeterminer is a nice guard against future drift.
  • invalidateHealthCache() in setUp is a good fix; singleton state bleed is a real source of flakiness.
  • The vhidDaemonPlistMisconfigured = false default parameter keeps all existing ComponentStatus call sites unmodified — good backward-compatible API hygiene.
  • Test descriptions/comments explain the why (MAL-57 migration path), not just the assertion. That is appreciated.

@claude

claude Bot commented Jun 11, 2026

Copy link
Copy Markdown

Code Review

Overview

This PR closes a real migration gap: the MAL-57 starvation fix (adding ProcessType=Interactive to the VHID daemon plist) was only applied during repair, so existing installs running the old plist were never migrated. The solution is sound: detect the stale plist once per 15s cache cycle, surface a warning-severity issue in the wizard, and offer a one-click repair. The stale-helper guard in PrivilegedOperationsRouter handles the edge case where a resident old helper reports success but actually rewrites the old template.

Overall this is well-structured work. A few items worth discussing:


Finding 1 — Foundation.FileManager() should be FileManager.default

File: ServiceHealthChecker.swift, isVHIDDaemonPlistPresentButMisconfigured()

return Foundation.FileManager().fileExists(atPath: plistPath)

This creates a new FileManager instance on every call. FileManager.default is the conventional singleton and what the rest of the codebase uses. The Foundation. prefix looks like it was added to resolve a naming conflict that may no longer exist. Unless there is local shadowing, this should be FileManager.default.fileExists(atPath: plistPath).


Finding 2 — Stale-helper guard: confirm sudoRepairVHIDServices() runs the configured-postflight

File: PrivilegedOperationsRouter.swift, stale-helper guard

The comment explains why sudoRepairVHIDServices() is chosen over sudoInstallRequiredRuntimeServices() (the latter re-enters the helper-first path and would loop). This is the right call if sudoRepairVHIDServices() includes the same postflight validation that normally confirms the plist is correct after repair. If the postflight is absent in the fallback path, a partially-updated plist could leave the guard in a silently wrong state. The PR description mentions "with its own configured-postflight" for ServiceBootstrapper, but it's not obvious from the diff that the postflight actually runs in this fallback.


Finding 3 — Log verbosity for misconfigured plists is bounded but worth a note (no action required)

File: ServiceHealthChecker.swift, vhidDaemonPlistContentIsValid()

This runs inside the 15s-TTL facts cache, so an affected user sees at most one log line per 15 seconds until they repair. The existing comment says "Logs only when the content is invalid," which is accurate. Noting the TTL bound would help future readers not mistake this for a hot-path log.


Finding 4 — Acknowledged gap: stale-helper guard has no unit test

The PR description calls this out explicitly and the reasoning is sound. Accepted as-is. A // TODO: add integration test for stale-helper fallback (MAL-57) comment in the guard block would signal this was a deliberate gap, not an oversight.


What is well done

  • vhidRuntimeServicesNeedRepair on ComponentStatus is an excellent centralization — repair and install plans can no longer drift, and the property is self-documenting.
  • Gating in SystemInspector (vhidServicesHealthy must be true) correctly prevents double-alarming from one root cause.
  • Default vhidDaemonPlistMisconfigured: Bool = false maintains backward compatibility for all existing callers.
  • Excluding vhidDaemonPlistMisconfigured from hasAllRequired is well-justified: a stale plist still runs day-to-day and should not fail app-wide readiness.
  • Test coverage is thorough: missing plist, correct plist, no-ProcessType, wrong path, corrupt plist, suppressed-when-unhealthy, state-stays-active.
  • Severity .warning (not .error) is clearly reasoned and avoids falsely blocking working installs in MainAppStateController.
  • invalidateHealthCache() in setUp is a good catch for singleton-state cross-suite flakes.

Summary: One minor cleanup (Finding 1: FileManager.default), one confirmation question (Finding 2: postflight in the stale-helper fallback). Everything else is clean. Architecture choices — caching, gating, severity, centralized trigger — are defensible and well-documented.

@malpern malpern merged commit b2f3876 into master Jun 11, 2026
3 checks passed
@malpern malpern deleted the claude/sweet-bhabha-b62002 branch June 11, 2026 19:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant