Merged
4 changes: 4 additions & 0 deletions Sources/KeyPathHelper/HelperService.swift
@@ -879,6 +879,8 @@ extension HelperService {
<true/>
<key>KeepAlive</key>
<false/>
<key>ProcessType</key>
<string>Interactive</string>
<key>StandardOutPath</key>
<string>/var/log/com.keypath.kanata.stdout.log</string>
<key>StandardErrorPath</key>
@@ -914,6 +916,8 @@ extension HelperService {
<true/>
<key>KeepAlive</key>
<true/>
<key>ProcessType</key>
<string>Interactive</string>
<key>StandardOutPath</key>
<string>/var/log/karabiner-vhid-daemon.log</string>
<key>StandardErrorPath</key>
11 changes: 11 additions & 0 deletions Sources/KeyPathInstallationWizard/Core/PlistGenerator.swift
@@ -105,6 +105,8 @@ public enum PlistGenerator {
<true/>
<key>KeepAlive</key>
<false/>
<key>ProcessType</key>
<string>Interactive</string>
<key>StandardOutPath</key>
<string>/var/log/com.keypath.kanata.stdout.log</string>
<key>StandardErrorPath</key>
@@ -138,9 +140,16 @@ public enum PlistGenerator {
/// - Automatic restart (KeepAlive)
/// - Logging to /var/log/karabiner-vhid-daemon.log
/// - Throttle protection to prevent rapid restart loops
/// - Interactive process type so the daemon is not starved under CPU load
///
/// This daemon is required for Kanata to access the virtual keyboard device.
///
/// ProcessType=Interactive matches the Kanata daemon plist. Without it,
/// heavy system load (e.g. parallel builds) can starve the daemon long
/// enough to miss the pqrs client's 3s heartbeat, dropping Kanata's output
/// connection mid-keystroke and leaving a key stuck down (autorepeat
/// bursts). See docs/bugs/MAL-57-duplicate-keypresses.md.
///
/// - Returns: Complete plist XML string ready to write to disk
public static func generateVHIDDaemonPlist() -> String {
"""
@@ -158,6 +167,8 @@ public enum PlistGenerator {
<true/>
<key>KeepAlive</key>
<true/>
<key>ProcessType</key>
<string>Interactive</string>
<key>StandardOutPath</key>
<string>/var/log/karabiner-vhid-daemon.log</string>
<key>StandardErrorPath</key>
14 changes: 10 additions & 4 deletions Sources/KeyPathInstallationWizard/Core/ServiceHealthChecker.swift
@@ -5,7 +5,7 @@
import Network
import os.lock

/// Handles health checking and status reporting for LaunchDaemon services.

Check warning on line 8 (GitHub Actions / code-quality): A doc comment should be attached to a declaration (orphaned_doc_comment)
///
/// This service provides comprehensive health checking capabilities extracted from
/// supporting both LaunchDaemon and SMAppService paths.
@@ -316,7 +316,7 @@
let runtimeSnapshot = await checkKanataServiceRuntimeSnapshot()
let decision = Self.decideKanataHealth(for: runtimeSnapshot)
AppLogger.shared.log(
"🔍 [ServiceHealthChecker] Kanata SMAppService decision: \(decision), running=\(runtimeSnapshot.isRunning), responding=\(runtimeSnapshot.isResponding), stale=\(runtimeSnapshot.staleEnabledRegistration), state=\(state.description)"

Check warning on line 319 (GitHub Actions / code-quality): Line should be 200 characters or less; currently it has 249 characters (line_length)
)
return decision.isHealthy
}
@@ -440,7 +440,7 @@
&& !runtimeSnapshot.staleEnabledRegistration
kanataHealthy = Self.decideKanataHealth(for: runtimeSnapshot).isHealthy
AppLogger.shared.log(
"🔍 [ServiceHealthChecker] Kanata SMAppService-managed: loaded=\(kanataLoaded), healthy=\(kanataHealthy), stale=\(runtimeSnapshot.staleEnabledRegistration), launchctlExit=\(runtimeSnapshot.launchctlExitCode?.description ?? "nil")"

Check warning on line 443 (GitHub Actions / code-quality): Line should be 200 characters or less; currently it has 245 characters (line_length)
)
} else {
// Legacy or unknown - use full checks
@@ -846,7 +846,7 @@

// Fallback: use the [ERROR] line
for line in lines.reversed() {
if line.contains("[ERROR]") {

Check warning on line 849 (GitHub Actions / code-quality): `where` clauses are preferred over a single `if` inside a `for` (for_where)
if let range = line.range(of: "[ERROR]") {
let raw = String(line[range.upperBound...])
.trimmingCharacters(in: .whitespaces)
@@ -873,7 +873,12 @@
return Foundation.FileManager().fileExists(atPath: plistPath)
}

-/// Verifies that the installed VHID LaunchDaemon plist points to the DriverKit daemon path.
+/// Verifies that the installed VHID LaunchDaemon plist points to the DriverKit daemon path
+/// and carries the required ProcessType=Interactive key.
+///
+/// ProcessType=Interactive keeps the daemon from being starved under CPU
+/// load (MAL-57 stuck-key autorepeat). A plist without it predates the fix
+/// and should be treated as misconfigured so repair rewrites it.
///
/// - Returns: `true` if the plist is correctly configured
public func isVHIDDaemonConfiguredCorrectly() -> Bool {
@@ -889,11 +894,12 @@
"/Library/Application Support/org.pqrs/Karabiner-DriverKit-VirtualHIDDevice/Applications/Karabiner-VirtualHIDDevice-Daemon.app/Contents/MacOS/Karabiner-VirtualHIDDevice-Daemon"

if let args = dict["ProgramArguments"] as? [String], let first = args.first {
-let ok = first == expectedPath
+let pathOK = first == expectedPath
+let processTypeOK = (dict["ProcessType"] as? String) == "Interactive"
AppLogger.shared.log(
-"🔍 [ServiceHealthChecker] VHID plist ProgramArguments[0]=\(first) | expected=\(expectedPath) | ok=\(ok)"
+"🔍 [ServiceHealthChecker] VHID plist ProgramArguments[0]=\(first) | pathOK=\(pathOK) | processTypeOK=\(processTypeOK)"
)
-return ok
+return pathOK && processTypeOK
}
AppLogger.shared.log(
"🔍 [ServiceHealthChecker] VHID plist ProgramArguments missing or malformed"
@@ -928,7 +934,7 @@
guard let data = try? fileHandle.readToEnd(), !data.isEmpty else {
return nil
}
return String(decoding: data, as: UTF8.self)

Check warning on line 937 (GitHub Actions / code-quality): Prefer failable `String(bytes:encoding:)` initializer when converting `Data` to `String` (optional_data_string_conversion)
}

/// Get the launchd daemons directory path
15 changes: 15 additions & 0 deletions Tests/KeyPathTests/Services/PlistGeneratorTests.swift
@@ -212,6 +212,21 @@ final class PlistGeneratorTests: XCTestCase {
XCTAssertTrue(plist.contains(PlistGenerator.vhidDaemonPath))
}

/// ProcessType=Interactive prevents the VHID daemon from being starved
/// under CPU load, which drops Kanata's output connection mid-keystroke
/// and causes stuck-key autorepeat bursts (MAL-57).
func testGenerateVHIDDaemonPlistUsesInteractiveProcessType() {
let plist = PlistGenerator.generateVHIDDaemonPlist()

guard let data = plist.data(using: .utf8),
let dict = try? PropertyListSerialization.propertyList(from: data, options: [], format: nil) as? [String: Any]
else {
XCTFail("Generated VHID daemon plist is not valid XML/plist format")
return
}
XCTAssertEqual(dict["ProcessType"] as? String, "Interactive")
}

func testGenerateVHIDDaemonPlistValidXML() {
let plist = PlistGenerator.generateVHIDDaemonPlist()

16 changes: 14 additions & 2 deletions Tests/KeyPathTests/Services/ServiceHealthCheckerTests.swift
@@ -77,10 +77,13 @@ final class ServiceHealthCheckerTests: XCTestCase {
FileManager.default.createFile(atPath: url.path, contents: Data(), attributes: nil)
}

-private func writeVHIDPlist(programPath: String) throws {
-let dict: [String: Any] = [
+private func writeVHIDPlist(programPath: String, processType: String? = "Interactive") throws {
+var dict: [String: Any] = [
"ProgramArguments": [programPath]
]
+if let processType {
+dict["ProcessType"] = processType
+}
let url = tempLaunchDaemonsDir.appendingPathComponent("\(ServiceHealthChecker.vhidDaemonServiceID).plist")
let data = try PropertyListSerialization.data(fromPropertyList: dict, format: .xml, options: 0)
try data.write(to: url)
@@ -269,4 +272,13 @@ final class ServiceHealthCheckerTests: XCTestCase {
try writeVHIDPlist(programPath: "/wrong/path")
XCTAssertFalse(checker.isVHIDDaemonConfiguredCorrectly())
}

/// Plists from before the MAL-57 starvation fix lack ProcessType=Interactive
/// and must report misconfigured so repair rewrites them.
func testIsVHIDDaemonConfiguredCorrectlyReturnsFalseWithoutProcessType() throws {
let expectedPath =
"/Library/Application Support/org.pqrs/Karabiner-DriverKit-VirtualHIDDevice/Applications/Karabiner-VirtualHIDDevice-Daemon.app/Contents/MacOS/Karabiner-VirtualHIDDevice-Daemon"
try writeVHIDPlist(programPath: expectedPath, processType: nil)
XCTAssertFalse(checker.isVHIDDaemonConfiguredCorrectly())
}
}
2 changes: 2 additions & 0 deletions dev-tools/debug/debug-plist-validation.sh
@@ -61,6 +61,8 @@ cat > "$VHID_DAEMON_TEMP" << 'EOF'
<true/>
<key>KeepAlive</key>
<true/>
<key>ProcessType</key>
<string>Interactive</string>
<key>StandardOutPath</key>
<string>/var/log/karabiner-vhid-daemon.log</string>
<key>StandardErrorPath</key>
98 changes: 98 additions & 0 deletions docs/bugs/MAL-57-duplicate-keypresses.md
@@ -10,6 +10,104 @@ as previous conclusions, not final truth.
Latest investigation summary:

- [`docs/analysis/2026-03-07-duplicate-key-under-load-investigation.md`](/Users/malpern/local-code/KeyPath/.worktrees/duplicate-key-investigation/docs/analysis/2026-03-07-duplicate-key-under-load-investigation.md)
- 2026-06-10 live incident evidence below (strongest capture so far; mechanism confirmed end to end)

## 2026-06-10 Incident Evidence (mechanism confirmed)

Two live incidents were captured with full logs on all three layers (kanata, the
Karabiner VHID daemon, and KeyPath). Both produced multi-second autorepeat bursts
of a single letter while typing ("togggggg…", "hteeeeee…"). Machine context: several
agent worktrees running parallel Swift builds and `quick-deploy.sh` at the time of
both incidents. The KeyPath GUI (and therefore the stuck-key detector/recovery)
was **not** running.

### Timeline (from `/var/log/com.keypath.kanata.stdout.log` and `/var/log/karabiner/virtual_hid_device_service.log`)

Incident 1 — 21:42:17 (the "g" burst):

```
driver connected: false / driver connected: true (repeated flapping for minutes prior)
21:42:17.3224 [WARN] output backend unavailable during write — releasing input devices
21:42:17.3265 [WARN] output backend suspended: recovery-entry
21:42:18.4464 [INFO] output backend and console session ready — re-grabbing input devices
21:42:18.5345 [KeyInput] sent key=backspace ... <- user deleting the repeated g's
21:42:18.7421 managed repeat BSpace (x8)
```

Incident 2 — 22:00:08-09 (the "e" burst). Kanata logged the user typing
"check hte logs … and", then:

```
22:00:08.5099 [KeyInput] sent key=e action=Press
22:00:08.6263 [KeyInput] sent key=e action=Release <- written into a dying socket; never took effect
22:00:09.6736 [KeyInput] sent key=d action=Press
connected <- pqrs client reconnected
22:00:09.7820 [WARN] output backend unavailable during write — releasing input devices
22:00:09.7821 [WARN] dropping KEY_D Release: output backend unavailable (will recover)
driver connected: false
driver connected: true
22:00:14.2039 [INFO] output backend and console session ready — re-grabbing input devices
```

Daemon side (`virtual_hid_device_service.log`): a new client socket connected at
22:00:10.150, but the old client (and with it the virtual keyboard state holding
the stuck key) was not torn down until 22:00:11.378-11.744. Measured from the lost
`e` release at 22:00:08.626, that is roughly 2.8-3.1 seconds of OS-level autorepeat.
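
As a rough check of the timing, the Swift sketch below recomputes the stuck-key window from the incident 2 timestamps quoted in this section. The helper `seconds(_:)` is illustrative only and is not part of KeyPath or kanata; it assumes the `HH:mm:ss.fff` timestamp format seen in the logs.

```swift
import Foundation

/// Converts an "HH:mm:ss.fff" log timestamp to seconds since midnight.
/// Hypothetical helper for this sketch; returns nil on malformed input.
func seconds(_ stamp: String) -> Double? {
    let parts = stamp.split(separator: ":")
    guard parts.count == 3,
          let h = Double(parts[0]), let m = Double(parts[1]), let s = Double(parts[2])
    else { return nil }
    return h * 3600 + m * 60 + s
}

// Timestamps copied from the incident 2 logs in this section.
let lostRelease = seconds("22:00:08.6263")!    // `e` Release written into the dying socket
let newClient = seconds("22:00:10.150")!       // new client socket connected
let teardownEarly = seconds("22:00:11.378")!   // earliest old-client teardown entry
let teardownLate = seconds("22:00:11.744")!    // latest old-client teardown entry

// How long the key stayed logically down, and how long teardown lagged the reconnect.
let stuckWindow = (teardownEarly - lostRelease)...(teardownLate - lostRelease)
let teardownLag = (teardownEarly - newClient)...(teardownLate - newClient)

print(String(format: "stuck window: %.2f-%.2fs", stuckWindow.lowerBound, stuckWindow.upperBound))
print(String(format: "teardown lag: %.2f-%.2fs", teardownLag.lowerBound, teardownLag.upperBound))
```

The two ranges separate the total damage window from the part of it that falls after the new client has already connected.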

Both the dext (pid 736) and the Karabiner-VirtualHIDDevice-Daemon (pid 814) ran
continuously through both incidents — no crashes. The failure is purely the
socket between kanata and the daemon.

### Confirmed causal chain

1. Kanata talks to the VHID daemon via the pqrs client in the `karabiner-driverkit`
crate. The client pings the daemon every 3s and declares the connection dead on a
missed heartbeat deadline (`virtual_hid_device_service/client.hpp`:
`set_server_check_interval(3000ms)`, `next_heartbeat_deadline_exceeded`).
2. Under CPU load the heartbeat misses and the connection drops. Key releases
in flight are lost: either written into the dying socket (the `e`) or
explicitly dropped by `drop_if_sink_disconnected` in
`External/kanata/src/oskbd/macos.rs` (the `d` — `dropping KEY_D Release`).
3. The virtual HID keyboard keeps the key logically down until the daemon tears
down the old client (about 1.2-1.6s after the new client connects in incident 2).
macOS autorepeats the stuck key for the whole window.
4. The fork's recovery (`release_tracked_output_keys` in
`External/kanata/src/kanata/macos.rs`) runs at re-grab — *after* the damage
window has already closed on its own, so it cannot shorten the burst.
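
The heartbeat step in this chain can be modelled in a few lines. This is a simplified Swift model written for this note, not the pqrs client code: the type `HeartbeatMonitor` and its methods are hypothetical, and the exact deadline rule in `client.hpp` may differ from the "no reply within one interval" rule assumed here.

```swift
import Foundation

/// Hypothetical model of a client-side heartbeat deadline (not the pqrs implementation).
struct HeartbeatMonitor {
    let interval: TimeInterval              // 3s per the description in this section
    private(set) var lastReply: TimeInterval

    init(interval: TimeInterval, startedAt: TimeInterval) {
        self.interval = interval
        self.lastReply = startedAt
    }

    /// Records a heartbeat reply from the daemon.
    mutating func recordReply(at time: TimeInterval) {
        lastReply = time
    }

    /// Assumed rule: the connection is declared dead once more than one
    /// interval has passed since the last reply.
    func deadlineExceeded(at now: TimeInterval) -> Bool {
        now - lastReply > interval
    }
}

var monitor = HeartbeatMonitor(interval: 3.0, startedAt: 0.0)
monitor.recordReply(at: 2.9)                        // daemon answered in time
let healthy = !monitor.deadlineExceeded(at: 5.0)    // 2.1s since the last reply
let dropped = monitor.deadlineExceeded(at: 6.5)     // 3.6s: a starved daemon missed the deadline
print("healthy=\(healthy) dropped=\(dropped)")
```

In this model the daemon does not have to crash for the connection to drop. It only has to be scheduled too late to answer once.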

### Why the daemon starves: launchd priority asymmetry

The kanata daemon plist sets `ProcessType=Interactive`; before this fix, the
KeyPath-installed `com.keypath.karabiner-vhiddaemon.plist` set **no** `ProcessType`,
so under load macOS could deprioritize exactly the process that must answer
heartbeats and process key reports.
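
One way to spot the asymmetry on a given plist is to parse it and read the key. The sketch below uses Foundation's `PropertyListSerialization`, the same API the tests in this change use. The helper names and the sample label `com.example.daemon` are hypothetical, and the sample plist is a reduced stand-in for the real generated one.

```swift
import Foundation

/// Returns the `ProcessType` value of a launchd plist given as XML text,
/// or nil when the key is absent or the text does not parse as a plist.
func processType(inPlistXML xml: String) -> String? {
    guard let data = xml.data(using: .utf8),
          let object = try? PropertyListSerialization.propertyList(from: data, options: [], format: nil),
          let dict = object as? [String: Any]
    else { return nil }
    return dict["ProcessType"] as? String
}

/// Builds a minimal sample plist; `com.example.daemon` is a placeholder label.
func samplePlist(processType value: String?) -> String {
    let entry = value.map { "<key>ProcessType</key><string>\($0)</string>" } ?? ""
    return """
    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
    <plist version="1.0">
    <dict>
    <key>Label</key><string>com.example.daemon</string>
    <key>KeepAlive</key><true/>
    \(entry)
    </dict>
    </plist>
    """
}

let preFix = processType(inPlistXML: samplePlist(processType: nil))
let postFix = processType(inPlistXML: samplePlist(processType: "Interactive"))
print("pre-fix: \(preFix ?? "unset"), post-fix: \(postFix ?? "unset")")
```

Parsing the plist instead of searching the text for `Interactive` avoids false positives from comments or unrelated string values.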

### Fix layers

- **Layer 1 (done, this repo):** `ProcessType=Interactive` is now set in the VHID
daemon plist from `PlistGenerator.generateVHIDDaemonPlist()`, the helper's
copy in `HelperService.swift`, the legacy kanata generators (the shipped
SMAppService plist `Sources/KeyPathApp/com.keypath.kanata.plist` already had
it), and the debug script template. `isVHIDDaemonConfiguredCorrectly()` now
also requires the key, so a pre-fix plist reports misconfigured and repair
rewrites it (`installOrRepairVHIDServices` / `repairVHIDDaemonServices`
rewrite unconditionally). Note: that check currently feeds only the repair
postflight. Wiring plist-content validation into `getServiceStatus()`, so the
wizard proactively flags stale plists on old installs, is a follow-up.
Expected effect: fewer heartbeat-miss disconnects, and therefore far fewer incidents.
- **Layer 2 (open, kanata fork):** shrink the autorepeat window. On reconnect the
client's `connected` callback only calls `virtual_hid_keyboard_initialize`;
also issuing `virtual_hid_keyboard_reset` would clear stuck keys immediately
(at about 22:00:10.2 in the timeline above instead of 22:00:11.7). Force-closing
the old client object as soon as `sink_ready` flips false would help further.
- **Layer 3 (open, kanata fork):** never silently drop a Release.
`drop_if_sink_disconnected` treats all events equally; dropped Releases are the
toxic case and should be remembered and replayed after reconnect
(`output_pressed_since` tracking already exists).
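
The Layer 3 idea can be sketched as a sink that remembers undeliverable Releases and replays them on reconnect. This is a Swift model of the behaviour only. The real change would live in the kanata fork's Rust code, and every type and method name below (`KeyEvent`, `OutputSink`, `reconnect()`) is hypothetical.

```swift
enum KeyAction { case press, release }

struct KeyEvent: Equatable {
    let key: String
    let action: KeyAction
}

/// Hypothetical output sink that never silently drops a Release.
struct OutputSink {
    var connected = true
    private(set) var delivered: [KeyEvent] = []
    private var pendingReleases: [String] = []   // keys whose Release could not be delivered

    init() {}

    mutating func send(_ event: KeyEvent) {
        guard connected else {
            // Dropping a Press is harmless; a dropped Release leaves the key stuck,
            // so remember it for replay.
            if event.action == .release, !pendingReleases.contains(event.key) {
                pendingReleases.append(event.key)
            }
            return
        }
        delivered.append(event)
    }

    /// Marks the sink connected again and replays every remembered Release.
    mutating func reconnect() {
        connected = true
        for key in pendingReleases {
            delivered.append(KeyEvent(key: key, action: .release))
        }
        pendingReleases.removeAll()
    }
}

var sink = OutputSink()
sink.send(KeyEvent(key: "d", action: .press))
sink.connected = false
sink.send(KeyEvent(key: "d", action: .release))   // would otherwise be dropped
sink.reconnect()
print(sink.delivered.map { "\($0.key):\($0.action)" })
```

Replaying a Release for a key whose Press was itself dropped is redundant but harmless, which is why the model treats the two event kinds asymmetrically.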

Residual gap no fix can fully close: a release written into a socket that is
dying but not yet declared dead is unrecoverable in transit; Layer 2 bounds the
damage to the reconnect latency.

## Problem Statement
