Skip to content

feat: thread device identity for per-device switch conditions#5

Open
malpern wants to merge 2 commits into
mainfrom
feat/device-identity-pipeline
Open

feat: thread device identity for per-device switch conditions#5
malpern wants to merge 2 commits into
mainfrom
feat/device-identity-pipeline

Conversation

@malpern

@malpern malpern commented Mar 13, 2026

Copy link
Copy Markdown
Owner

Summary

  • Adds device_index: u8 field to KeyEvent, threaded from Linux KbdIn through to keyberon switch evaluation
  • On Linux, each registered input device gets a monotonically increasing index assigned at registration time
  • New (device N) switch condition in the parser that matches events from a specific physical device
  • Stores current_device: Option<u8> on Layout (not in keyberon Event row) to preserve the row 0/1 real/virtual key invariant
  • macOS/Windows compile cleanly with device_index: 0 default (no multi-device support yet)

Usage Example

(switch
  ((device 0) X break)
  ((device 1) Y break))

Not in scope (follow-up PRs)

  • Name-based matching (device "Kinesis") / (defdevice ...) config
  • Per-device layer stacks
  • Windows/macOS multi-device support
  • Device hot-plug stable indexing

Test plan

  • cargo build succeeds on all targets
  • All existing keyberon tests pass (82 tests including 3 new device tests)
  • All existing parser tests pass (100 tests)
  • All existing kanata lib tests pass (28 tests)
  • CI green on Linux, macOS, Windows, Android
  • Local KeyPath integration testing

🤖 Generated with Claude Code

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@claude

claude Bot commented Mar 13, 2026

Copy link
Copy Markdown

Code Review — PR #5: Thread device identity for per-device switch conditions

Overall this is a well-structured PR that threads device identity cleanly through the stack. The two main features (device-indexed switch conditions and per-action require-prior-idle overrides) are both useful and well-tested. The changes below are mostly design and correctness concerns, ordered roughly by severity.


High Priority

1. Unexplained crate-type = ["rlib", "staticlib"] in Cargo.toml

# Cargo.toml
crate-type = ["rlib", "staticlib"]

Adding staticlib is not mentioned in the PR description, and it has significant implications:

  • LGPL compliance: kanata uses LGPL-3.0-only. Distributing a staticlib means users must be able to re-link — this is a license-compliance concern that should be deliberate and documented.
  • Build output: staticlib produces a large .a archive that's compiled even for normal cargo build runs.
  • Linking behavior: Downstream projects linking the staticlib will bundle all of kanata's transitive dependencies.

If this is needed for an external FFI use case, it should be in a dedicated PR with justification. If it's accidental, it should be reverted.


2. Two parallel HashMaps for device info can become inconsistent

// src/oskbd/linux.rs
pub struct KbdIn {
    devices: HashMap<Token, (Device, Path)>,
    device_indices: HashMap<Token, u8>,   // ← separate map for same key
    next_device_index: u8,
}

These two maps share the same Token key but are maintained independently. When a device is removed (e.g., unplugged), devices is cleaned up in rediscover(), but device_indices is never pruned. Over time (especially with hot-plug), stale token→index mappings accumulate in device_indices. Also, the unwrap_or(0) fallback in read() silently assigns device index 0 to events from tokens not in device_indices, which masks bugs:

let dev_idx = self
    .device_indices
    .get(&event.token())
    .copied()
    .unwrap_or(0);  // ← silently wrong, not just a default

Suggestion: Store the device index alongside the device in devices:

devices: HashMap<Token, (Device, Path, u8)>,

This eliminates the second map and the consistency risk. If the index must be separate for other reasons, the unwrap_or(0) should be replaced with a logged warning or an error.


Medium Priority

3. saturating_add silently aliases devices at index 255

// src/oskbd/linux.rs
self.next_device_index = self.next_device_index.saturating_add(1);

If more than 256 keyboards are registered (unlikely in practice, but possible in automated test setups or server environments), all devices registered after index 255 will silently share index 255. A user writing (device 255) would then match all of those devices inadvertently.

A more explicit approach would be to log a clear warning and assign a sentinel value (or refuse to register the device with an index):

match self.next_device_index.checked_add(1) {
    Some(next) => { self.next_device_index = next; }
    None => { log::warn!("device index overflow; device will use index 255"); }
}

4. assert! in OpCode::new_device can panic the keyboard daemon

pub fn new_device(device_idx: u16) -> (Self, Self) {
    assert!(device_idx <= u8::MAX as u16);
    (Self(DEVICE_VAL), Self(device_idx))
}

The parser already validates the range before calling this function, so the assert is defensive. However, panicking in a keyboard daemon is particularly disruptive — it will kill the user's keyboard remapping mid-session. Since the function already takes a u16, consider taking a u8 directly (matching the semantic constraint), or returning a Result:

pub fn new_device(device_idx: u8) -> (Self, Self) {
    (Self(DEVICE_VAL), Self(device_idx as u16))
}

This makes the type system enforce the constraint and removes the possibility of a panic.


5. #[allow(clippy::too_many_arguments)] on a hot-path function

#[allow(clippy::too_many_arguments)]
pub fn actions<A1, A2, H1, H2, L>(
    &self,
    active_keys: A1,
    historical_keys: A2,
    historical_positions: H2,
    layers: L,
    default_layer: u16,
    current_device: Option<u8>,   // ← 8th argument after this PR
) -> ...

And similarly evaluate_boolean. Eight generically-typed arguments is hard to read and maintain, and suppressing the lint is a sign the design could be improved. A context struct would be cleaner and make future additions (e.g., per-device layer stacks) much easier:

pub struct SwitchContext<'a, A1, A2, H1, H2, L> {
    pub active_keys: A1,
    pub historical_keys: A2,
    // ...
    pub current_device: Option<u8>,
}

That said, this is a refactor that could be a follow-up — the current approach works correctly.


6. macOS recovery changes appear out-of-scope

The macos.rs and oskbd/macos.rs changes are substantial:

  • Moving wait_until_ready from KbdIn initialization into the event loop
  • Adding output_ready() / wait_until_ready() to KbdOut
  • Adding output_pressed_since: HashMap<OsCode, Instant> key tracking
  • Adding release_tracked_output_keys()
  • Clearing PRESSED_KEYS after recovery

These changes are logically independent from device identity threading and deserve their own PR with dedicated testing and rationale. Mixing them here makes the review harder and the git history less navigable.

One specific concern: write_key tracks output keys via record_output_transition_after_write, but the lower-level write method does not. Code paths that call write directly (e.g., write_code) bypass key tracking, so release_tracked_output_keys will not release those keys during recovery.


Low Priority

7. DeviceInfo struct is dead code

// src/oskbd/linux.rs
pub struct DeviceInfo {
    pub name: String,
    pub index: u8,
}

pub fn device_info(&self) -> Vec<DeviceInfo> { ... }

This is defined and implemented but never called. If it's a hook for the planned name-based matching follow-up, a comment saying so would help. Otherwise, it should wait for the PR that actually uses it (YAGNI).


8. current_device may not reflect the originating device for deferred actions

// src/kanata/mod.rs
pub fn handle_input_event(&mut self, event: &KeyEvent) -> Result<()> {
    self.layout.bm().current_device = Some(event.device_index);
    // ...tap-hold state machine runs here, potentially resolving a
    // *previous* key event's hold timeout triggered by *this* event
}

When a hold-tap timeout resolves because another key was pressed (e.g., HoldOnOtherKeyPress), current_device reflects the triggering key, not the original tap-hold key's device. If a user writes:

(switch
  ((device 0) lctl break)
  (_ lctl break))

…and a tap-hold key on device 1 resolves due to a key press on device 0, the switch condition will evaluate against device 0. Whether this is correct depends on the semantic intent (which device "caused" the action), but it's a behavioral subtlety worth documenting or deciding explicitly.


Positive Observations

  • The with_device(idx: u8) -> Self builder pattern on KeyEvent is clean and ergonomic.
  • The invariant of not storing device info in the Event row (to preserve row 0/1 semantics) is a good architectural decision, and the current_device field on Layout is a reasonable place for it.
  • Test coverage for the new features is thorough — both unit tests for opcode encoding/evaluation and integration sim tests for per-action require-prior-idle.
  • The fallback behavior (device_index: 0 on non-Linux) is sensible and makes configs forward-compatible.
  • The count_trailing_options heuristic for detecting trailing option lists is clever and avoids breaking backward compatibility.

Summary

Issue File Severity
Unexplained staticlib crate type Cargo.toml High
Two parallel HashMaps; stale entries on unplug oskbd/linux.rs High
saturating_add silently aliases 256+ devices oskbd/linux.rs Medium
assert! can panic the daemon keyberon/src/action/switch.rs Medium
8-arg function suppresses clippy keyberon/src/action/switch.rs Medium
macOS changes out-of-scope macos.rs, oskbd/macos.rs Medium
DeviceInfo dead code oskbd/linux.rs Low
current_device timing for deferred actions kanata/mod.rs Low

🤖 Generated with Claude Code

@malpern

malpern commented Mar 13, 2026

Copy link
Copy Markdown
Owner Author

Parent feature: malpern/KeyPath#63 (per-keyboard rule targeting)

Follow-up issues: #6, #7, #8, #9

@claude

claude Bot commented Mar 13, 2026

Copy link
Copy Markdown

Code Review: feat: thread device identity for per-device switch conditions

This PR adds two distinct features plus macOS output-backend refactoring in a single diff (~1000 additions). The implementation is generally solid, but there are a few correctness concerns worth addressing before merge.


Critical / Correctness

1. Race condition: current_device can be stale when a switch is evaluated inside tap-hold

handle_input_event writes self.layout.bm().current_device = Some(event.device_index) immediately before processing, but switch conditions inside a WaitingState (tap-hold) are evaluated during a later tick() call — triggered by a different key event. By that time current_device already reflects the newer event's device.

Concrete scenario:

Device 0 presses a tap-hold key → enters WaitingState
Device 1 presses another key → `current_device` is now 1
Timeout fires → switch evaluating `(device 0)` incorrectly sees device 1

Storing current_device on the WaitingState at creation time would fix this, but you explicitly noted that threading it through the keyberon Event row would break the row-0/row-1 invariant. A practical alternative is storing it alongside the WaitingState in the kanata-layer wrapper, or accepting this limitation and documenting it clearly.


2. TryFrom<InputEvent> for KeyEvent always sets device_index: 0 on Linux

// src/oskbd/linux.rs
evdev::EventSummary::Key(...) => Ok(Self {
    code: ...,
    value: ...,
    device_index: 0,  // always 0
}),

The device index is correctly overwritten by .with_device(device_idx) in the event loop. However, any caller that uses TryFrom directly (e.g. in tests, or future platform ports) will silently get device 0. Consider removing the field from TryFrom and requiring callers to call .with_device(), or at minimum add a #[doc] note that the field is a stub.


Moderate Concerns

3. Cargo.toml: staticlib added without explanation

crate-type = ["rlib", "staticlib"]

Adding staticlib causes every cargo build (including CI) to link a C-compatible static library, meaningfully increasing build and link times for all contributors. This looks unrelated to the device-identity feature and doesn't appear in the PR description. If it's needed for an FFI consumer, please document the motivation; otherwise it should be a separate PR.

4. DeviceInfo struct is defined but never called

KbdIn::device_info() returns a Vec<DeviceInfo> but no code in the diff calls it. Dead public API can cause confusion. Either add #[allow(dead_code)] with a comment explaining future use, or remove it until it's needed (the device_index field on the tuple is sufficient for current functionality).

5. Device-index overflow silently aliases index 255

match self.next_device_index.checked_add(1) {
    Some(next) => self.next_device_index = next,
    None => log::warn!("device index overflow: device {path} shares index 255 ..."),
}

When overflow occurs, next_device_index stays at 255 and all subsequent devices get index 255. The warning is helpful, but consider logging the device path of whichever device already holds 255 so the user knows which two devices will be conflated.

6. count_trailing_options scan-from-end heuristic

The function stops at the first non-option expression scanning from the right. This works for the current single-option case, but it fails silently if an option list is accidentally placed before a positional argument — it would count 0 options, treating the option list as a positional param and likely producing a confusing error. A comment explaining this invariant (or a validation pass that checks for out-of-order options) would improve robustness.


Minor / Style

7. #[allow(clippy::too_many_arguments)] added to two call sites

Rather than suppressing the lint, the evaluate_boolean context (key_codes, historical_inputs, layers, default_layer, current_device…) could be collected into a small EvalContext<'_> struct. This is already a hot path and grouping related data would make future additions (e.g., timestamp, window info) cleaner.

8. parse_u8_with_range(&l[1], s, "device-index", 0, 255) — range is a no-op

The range 0–255 is the full u8 range; an invalid value would already fail the u8 parse. If the function always clamps to [0, 255], this is fine, but if it's doing extra bounds checking it's redundant. Minor nit.

9. (device N) negation not tested

The Not operator is architecturally supported (it's applied after evaluating any sub-expression), but there's no test for (not (device N)). Consider adding one to lock in the behaviour.

10. PR scope

This PR bundles three logically independent changes:

  1. Device identity threading + (device N) switch condition
  2. Per-action require-prior-idle override
  3. macOS DriverKit output-backend refactor (output_ready, wait_until_ready, release_tracked_output_keys)

Each of these is a self-contained, reviewable unit. Splitting them would make review easier, reduce risk of regressions, and improve git history. I understand this may be impractical at this stage, but it's worth noting for future work.


What's Well Done

  • The Option<u8> design for current_device on Layout cleanly preserves the row-0/row-1 invariant.
  • The with_device(idx) builder method is idiomatic and makes the threading explicit at each call site.
  • The opcode encoding roundtrip test (device_opcode_encoding_roundtrip) is exactly the right kind of unit test for a compact encoding scheme.
  • Per-action require_prior_idle is documented thoroughly and the test cases cover the important edge cases (disable-via-zero, per-action-without-global, mixed-actions).
  • macOS/Windows/WASM stubs all correctly default to device_index: 0 so the feature degrades gracefully.
  • The overflow warning in KbdIn::register_device is a good defensive measure.

Overall this is solid work. The main blocker I'd want addressed before merge is item 1 (stale current_device during tap-hold evaluation) — even if the fix is just a documentation note that (device N) inside tap-hold switch conditions has undefined behaviour when multiple devices are active simultaneously.

🤖 Generated with Claude Code

@claude

claude Bot commented Mar 13, 2026

Copy link
Copy Markdown

Code Review — PR #5: Thread device identity for per-device switch conditions

Overall this is a well-structured PR with good test coverage. The two features (per-device (switch (device N) …) and per-action (require-prior-idle …)) are cleanly separated. Below are issues ranging from correctness to minor style suggestions.


🔴 Correctness / Potential Issues

1. Stale current_device in WaitingState (known, but worth scoping clearly)

The doc comment in mod.rs notes that current_device is set per-event and may be stale when tick() resolves a tap-hold waiting state. This is a real hazard: if a user writes a switch inside a tap-hold's hold/tap action that uses (device N), it will evaluate against the device of whichever event was most recently processed, not the device that triggered the tap-hold. The comment is good, but the config docs (docs/config.adoc) should also warn about this, since users won't read Rust source before writing configs.

2. Device index silently saturates at 255

// src/oskbd/linux.rs
None => log::warn!(
    "device index overflow: device {path} shares index 255 with another device"
),

After overflow, next_device_index stays at 255 and every subsequent device also receives index 255 — silently, after the one-time warning. For a user with many HID devices (common on developer machines), two devices unexpectedly sharing index 255 will make (device 255) match both. Consider saturating and logging on every registration after overflow, or returning an error for the 256th device.

3. TryFrom<InputEvent> always sets device_index: 0, then it's overwritten

In src/oskbd/linux.rs:

impl TryFrom<InputEvent> for KeyEvent {
    // ...
    Ok(Self { code, value, device_index: 0 })  // ← always 0

Then immediately at the call site in src/kanata/linux.rs:

Ok(ev) => ev.with_device(device_idx),

The device_index: 0 in the TryFrom impl is a lie — it's always overridden. This is confusing and could lead to future callers of TryFrom getting a silently wrong device_index. Consider removing device_index from TryFrom<InputEvent> and having callers always explicitly set it, or making it a constructor parameter.


🟡 Design / API Concerns

4. crate-type = ["rlib", "staticlib"] changes compilation for all users

# Cargo.toml
crate-type = ["rlib", "staticlib"]

Adding staticlib means Cargo will build a C-compatible static archive every time the crate is compiled, even for users who just want the binary. This adds compile time and artifact size with no benefit to most users. If this is needed for a specific integration (iOS, Swift, FFI?), it should be behind a feature flag or a separate crate.

5. DeviceInfo is dead code / scaffolding

#[derive(Debug, Clone)]
#[allow(dead_code)]
pub struct DeviceInfo { pub name: String, pub index: u8 }

Prefer adding infrastructure when it's actually used. The #[allow(dead_code)] is a code smell here. The device_info() method is also dead code. Add both in the follow-up PR that needs them.

6. Switch::actions / evaluate_boolean parameter sprawl

#[allow(clippy::too_many_arguments)]
fn evaluate_boolean(, current_device: Option<u8>) -> bool {

Both functions now have 8+ parameters. The #[allow(clippy::too_many_arguments)] suppression is a sign that a refactor would help. A SwitchContext<'_> struct grouping active_keys, historical_keys, layers, default_layer, and current_device would make call sites readable and make future additions (e.g. device name, timestamp) free.


🟡 Hot-Path Performance

7. No measurable hot-path concern, but note the Option<u8> comparison

The current_device == Some(device_idx as u8) in evaluate_boolean is fine — it compiles to two comparisons and is branch-predictor-friendly. No concern here.

8. release_tracked_output_keys double-clears the map

pub fn release_tracked_output_keys(&mut self, reason: &str) {
    // ...
    for key in tracked_keys {
        self.write_key(key, KeyValue::Release)?;  // calls record_output_transition_after_write → removes from map
    }
    self.output_pressed_since.clear();  // redundant, map is already empty
}

write_key calls record_output_transition_after_write which removes each released key from output_pressed_since. The final .clear() is harmless but dead. This is minor but shows the two bookkeeping paths aren't obviously consistent. Consider either bypassing record_output_transition_after_write in the recovery path (skip tracking during recovery) or add a comment explaining why .clear() is there as a safety net.


🟢 Minor / Style

9. Hot-plug instability not documented in user-facing config docs

The PR summary mentions that device indices are not stable across hot-plug, but docs/config.adoc (which you've already updated) doesn't mention this. A one-line note near the (device N) example like "Device indices are assigned at startup in registration order and reset on hot-plug" would prevent user confusion.

10. parse_u8_with_range(&l[1], s, "device-index", 0, 255) — full range redundant

The range 0..=255 is the full u8 domain. This call could just be parse_u8(&l[1], s, "device-index") if that function exists, making the intent clearer.

11. Missing parser-level integration test for (switch ((device N) …))

The unit tests in switch.rs cover opcode encoding and evaluate_boolean well. But there's no test in the parser's test suite (e.g. parser/src/cfg/tests/) for a full config containing (switch ((device 0) x break)). A round-trip parse test would catch regressions in parse_switch_case_bool and serve as a usage example.


Summary

# Severity File Issue
1 🔴 docs/config.adoc No user-facing warning about stale device in tap-hold context
2 🔴 src/oskbd/linux.rs Silent index 255 sharing on overflow
3 🟡 src/oskbd/linux.rs TryFrom sets device_index=0 then caller overwrites it
4 🟡 Cargo.toml staticlib crate type affects all builds
5 🟡 src/oskbd/linux.rs Dead DeviceInfo struct with #[allow(dead_code)]
6 🟡 keyberon/src/action/switch.rs Too-many-arguments suppression; SwitchContext struct would help
8 🟢 src/oskbd/macos.rs Redundant .clear() in release_tracked_output_keys
9 🟢 docs/config.adoc Hot-plug instability not documented
11 🟢 parser tests No integration test for (switch ((device N) …))

The per-action require-prior-idle feature is clean — the count_trailing_options approach is well-designed and the parser tests are comprehensive. The device threading plumbing through Linux KbdIn is also solid. Main asks before merge: address items 2 (overflow behavior) and 4 (staticlib), and add the user-facing warning for item 1.

@claude

claude Bot commented Mar 13, 2026

Copy link
Copy Markdown

Code Review

Overall this is solid, well-structured work. The device-identity pipeline is cleanly threaded from KbdIn through to switch evaluation, backward compatibility is preserved, and the test coverage is good. A few things worth addressing before merge:


1. PR scope — two unrelated features bundled together

The diff contains three independent changes:

  • Device identity pipeline ((device N) switch condition) — the stated goal
  • Per-action require_prior_idle override for tap-hold — an entirely separate feature
  • macOS output-backend refactoring (removing hardcoded 10 s wait, adding wait_until_ready/release_tracked_output_keys)

Combining these makes review harder, complicates bisection, and obscures which change broke something if a regression shows up. Consider splitting into separate PRs.


2. crate-type = ["rlib", "staticlib"] in Cargo.toml (needs justification)

crate-type = ["rlib", "staticlib"]

Adding staticlib to the main kanata_state_machine crate has non-obvious consequences:

  • Compiling to a static lib duplicates the Rust standard library into the .a artifact (it's not shared via the rlib path).
  • cargo test interacts differently with staticlib crates.
  • CI probably passes because tests use the rlib artifact, but downstream consumers linking the .a will pull in duplicate runtime code.

What is this for? If it's for an FFI boundary (e.g., the macOS DriverKit side), that should be documented here and ideally isolated in a thin wrapper crate rather than making the whole state machine a static lib.


3. Device index becomes stale during tap-hold resolution (known but underspecified impact)

// src/kanata/mod.rs
pub fn handle_input_event(&mut self, event: &KeyEvent) -> Result<()> {
    self.layout.bm().current_device = Some(event.device_index);
    // ...
}

The code comment acknowledges this correctly. But the impact is worse than "stale": if a user has a tap-hold key on device 0 and types a key on device 1 before the tap-hold timer fires, the switch condition evaluates against device 1 — the resolving key's device, not the tap-hold key's device. The feature is effectively unusable in combination with tap-hold when two devices are present.

This is a known limitation, but I'd suggest:

  • Documenting it in docs/config.adoc alongside the (device N) examples ("Note: (device N) conditions on switch actions inside tap-hold may evaluate against the wrong device when multiple devices are in use")
  • Adding a test that asserts/demonstrates the current behavior so future changes don't silently change the semantics

4. Device index instability after hot-plug

// src/oskbd/linux.rs — register_device
let device_idx = self.next_device_index;
match self.next_device_index.checked_add(1) {
    Some(next) => self.next_device_index = next,
    None => log::warn!("device index overflow: ..."),
}

next_device_index is never reset. When a device disconnects and reconnects, rediscover_devices calls register_device again — assigning the device a new, higher index (e.g., index 2 for a device the user always thought of as index 0).

This makes (device N) conditions silently break after any hot-plug event. A user's config that worked at startup ((device 0) for their main keyboard) will stop working after unplugging and replugging it.

Possible mitigations (pick one):

  • Document this limitation clearly (simplest for this PR)
  • Track path → index in a separate map and reuse the same index on re-registration
  • Add a device-path or device-name syntax (the planned follow-up)

5. macOS recovery loop holds the mutex during polling

// src/kanata/macos.rs — recovery wait loop
loop {
    if kanata
        .lock()
        .kbd_out
        .wait_until_ready(Some(Duration::from_millis(500)))
    {

Arc<Mutex<Kanata>>::lock() acquires the mutex and holds it for up to 500 ms per iteration while wait_until_ready polls. During recovery, any other thread that needs the kanata lock (e.g., a signal handler or the TCP server) will be blocked.

The original code called is_sink_ready() (a lock-free global) after sleep(500ms). The new version should similarly release the lock between polls:

loop {
    std::thread::sleep(Duration::from_millis(50));
    if kanata.lock().kbd_out.output_ready() {
        break;
    }
}

6. OpCodeType::Device(u16) should be Device(u8) for clarity

enum OpCodeType {
    // ...
    Device(u16),
}

Device carries a device index that is always 0–255. Storing it as u16 is an artifact of the opcode encoding. The evaluation already truncates back to u8:

ret = current_device == Some(device_idx as u8);

Using Device(u8) in the enum would make the truncation explicit at the decode site rather than silently at use. Minor, but improves clarity.


7. DeviceInfo struct — keep or defer?

#[allow(dead_code)]
pub struct DeviceInfo { pub name: String, pub index: u8 }

This is harmless scaffolding, but suppressing dead_code on a public struct masks the fact that the feature isn't wired in yet. Two options:

  • Remove it entirely and reintroduce in the name-based matching follow-up PR
  • Keep it but add a #[doc(hidden)] with a comment pointing to the tracking issue, so the intent is clear

Minor nits

  • README.md list indentation change: The "Application-aware layer switching" sub-list was reformatted from 4-space to 3-space indentation. Unrelated to the feature; creates noise in the diff.
  • parse_u8_with_range(…, 0, 255): The full u8 range is the range, so the validation is tautological. A comment explaining the parser enforces u8 semantics (and can reject fractional/negative values) would clarify intent.
  • count_trailing_options name: The function only counts trailing options of known tap-hold keywords. count_trailing_tap_hold_options or a doc comment would avoid confusion.

What's done well

  • The opcode encoding fits cleanly into the existing u16 opcode scheme without changing the binary format.
  • with_device() builder pattern is a nice, zero-cost way to tag events without touching all the call sites.
  • Per-action require_prior_idle parsing is well-isolated; the count_trailing_options approach correctly preserves backward compatibility for all existing configs.
  • Test coverage for the new switch condition, encoding roundtrip, and require_prior_idle edge cases is thorough.
  • The documented limitation in handle_input_event is appreciated — better to be honest about the constraint than to pretend it doesn't exist.

Blocking before merge: issues 3 (documentation), 4 (hot-plug behavior needs at minimum a doc note), and 5 (mutex hold during polling).
Should fix: issue 2 (staticlib justification).
Nice to have: issues 1, 6, 7.

@claude

claude Bot commented Mar 13, 2026

Copy link
Copy Markdown

PR Review: feat: thread device identity for per-device switch conditions

This PR adds two distinct features: (1) a (device N) switch condition with Linux multi-device support, and (2) per-action require-prior-idle override for all tap-hold variants. Both are well-scoped and the test coverage is solid. Here are my findings organized by severity.


🔴 Issues Worth Addressing Before Merge

1. Unexplained staticlib in Cargo.toml

crate-type = ["rlib", "staticlib"]

Adding staticlib generates a .a file on every build. This meaningfully increases build times and artifact sizes for all users. It appears to be scaffolding for a potential FFI consumer, but there's no explanation in the PR description or a code comment. If this is intentional, please document why (e.g., a follow-up FFI PR). If it's experimental, it should be removed or gated behind a feature flag.

2. Stale current_device During Tap-Hold Resolution

The code comment in src/kanata/mod.rs already flags this:

"Note: current_device is set here per-event and may be stale when switch actions evaluate later (e.g. inside tap-hold's WaitingState resolution from tick()). This is a known limitation..."

This is more than a minor caveat — it means (device N) conditions inside a tap-hold's action (e.g., a switch nested in the hold action) will silently evaluate against the last key pressed, not the key that triggered the tap-hold. Users could set up a config that appears to work in testing but fails intermittently in fast typing.

The PR description mentions this in "Not in scope" but the user-facing docs (docs/config.adoc) don't warn about it. Consider adding a note in the device documentation section explaining this limitation.


🟡 Medium-Priority Issues

3. Dead Code in Production: DeviceInfo and device_info()

In src/oskbd/linux.rs:

#[allow(dead_code)]
pub struct DeviceInfo { ... }

#[allow(dead_code)]
pub fn device_info(&self) -> Vec<DeviceInfo> { ... }

Scaffolding for follow-up work is fine, but suppressing dead-code warnings in production code is a code smell. Options:

  • Remove and re-add in the follow-up PR that actually uses it
  • Gate behind a non-default feature flag
  • At minimum, add a comment like // Used by follow-up PR #6 (device name matching)

4. Device Index Overflow Silently Aliases Index 255

match self.next_device_index.checked_add(1) {
    Some(next) => self.next_device_index = next,
    None => log::warn!(
        "device index overflow: device {path} shares index 255 with another device"
    ),
}

Note that device_idx is assigned before the checked_add, so the 256th device gets 255, and from then on all subsequent devices also get 255. This means (device 255) would match any device beyond the 255th, which is probably not what a user would expect. Consider logging at error level and/or rejecting the device registration outright.

5. Simulation Test Syntax Limited to Devices 0–9

The new d0:key / u0:key syntax only supports single-digit device indices:

if dev_press.len() == 2 && dev_press.starts_with('d') && dev_press.as_bytes()[1].is_ascii_digit()

Since device_index is u8 (0–255), you can configure (device 10) in a config but can't write simulation tests that trigger from device 10+. This is fine as a first cut, but worth a TODO comment so it's not forgotten.

6. output_pressed_since: HashMap<OsCode, Instant>Instant Is Unused

In src/oskbd/macos.rs, the Instant value is recorded but never read — the map only tracks which keys are pressed, not when. A HashSet<OsCode> would be simpler and more efficient. If the Instant is intended for future use (e.g., stuck-key detection), add a comment explaining that.


🟢 Minor / Style

7. TryFrom<InputEvent> for KeyEvent Sets device_index: 0 Redundantly on Linux

On Linux, the TryFrom impl sets device_index: 0 and the caller immediately overwrites it with .with_device(device_idx). Not a bug, but worth a brief comment noting the field is always overwritten by the caller.

8. Magic Number in Test Opcode Construction

In parser/src/cfg/tests.rs, the And opcode uses magic number 4 as the end-index. A brief comment explaining the span (e.g., // And spans indices 0-3, end=4) would help future contributors understand the test.


✅ What's Done Well

  • Architecture: Storing current_device on Layout (not in the keyberon Event) correctly preserves the row 0/1 real/virtual key invariant.
  • Opcode encoding: DEVICE_VAL = 855 fits cleanly alongside LAYER_VAL and BASE_LAYER_VAL. The roundtrip tests validate encoding correctly.
  • count_trailing_options(): The trailing-options scanning approach for require-prior-idle is clean and correctly handles positional action lists.
  • Overflow detection: Using checked_add instead of wrapping is the right instinct.
  • Test coverage: The per-action require-prior-idle tests are excellent — they cover all 8 tap-hold variants, zero-value (disable), global-only, per-action-only, and mixed scenarios. The sim_switch_device tests validate end-to-end behavior.
  • Cross-platform: Default device_index: 0 and clean compile on Windows/macOS/Android is handled correctly throughout.
  • macOS refactor: Moving wait_until_ready() and output_ready() onto KbdOut is cleaner than the previous scattered is_sink_ready() calls. The release_tracked_output_keys during recovery is a good safety measure.

Summary

The two core features (device switch condition, per-action idle override) are well-implemented with solid tests. The main concerns are: (1) the unexplained staticlib addition that affects all users' build times, (2) the stale current_device limitation needs better user-facing documentation, and (3) some dead code scaffolding to clean up. Happy to discuss any of these points.

🤖 Generated with Claude Code

Thread device identity from OS input through the event pipeline to
keyberon's switch evaluation. Add device_index field to KeyEvent,
populated on Linux via mio Token-based device tracking. Add (device N)
switch condition matching against the originating device's index.

Store current_device on Layout (not in Event row) to preserve the
row 0/1 real/virtual key invariant.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@malpern malpern force-pushed the feat/device-identity-pipeline branch from 97bc518 to 01fdc29 Compare March 13, 2026 12:12
@claude

claude Bot commented Mar 13, 2026

Copy link
Copy Markdown

Code Review: feat: thread device identity for per-device switch conditions

This PR adds two independent features: (1) per-device switch conditions via (device N), and (2) per-action require-prior-idle override for tap-hold. Both are useful features, but the PR is large enough that separating them would make review and blame easier in the future.


Critical / Bug-Risk Issues

1. current_device staleness on deferred actions (documented but underweighted)

The PR correctly notes in a code comment that current_device is stale when WaitingState resolves. But this is a correctness issue that affects a primary use case: using (device N) inside a tap-hold's inner switch. The comment reads:

// Note: `current_device` is set here per-event and may be stale when switch actions
// evaluate later (e.g. inside tap-hold's WaitingState resolution from `tick()`).

Consider this scenario:

  • Device 0 presses a tap-hold key → enters WaitingState
  • Device 1 sends a key → current_device is now Some(1)
  • Timeout fires → tap-hold resolves; (device 0) condition evaluates against Some(1)wrong

This should be called out in the config docs, not just in a code comment. Users trying to combine (device N) with tap-hold will get silent, hard-to-debug misbehavior.

2. Device index overflow silently creates aliased devices

// src/oskbd/linux.rs
None => log::warn!(
    "device index overflow: device {path} shares index 255 with another device"
),

After the overflow, next_device_index stays at 255. Every device registered after the 256th will silently get index 255 and share that slot with the 256th device. The (device 255) switch condition will then match events from multiple physical devices. At minimum, next_device_index should saturate and new devices beyond the limit should perhaps be rejected or logged more loudly. This is an edge case, but since index assignment is monotonically increasing (even across hotplug), 256+ device churn over a long session could trigger this unexpectedly.

3. Mutex held during wait_until_ready (macOS)

In src/kanata/macos.rs:

{
    let kanata = kanata.lock();   // <-- lock acquired
    if !kanata.kbd_out.wait_until_ready(Some(Duration::from_secs(10))) {
        // wait_until_ready spins for up to 10 seconds
    }
}

wait_until_ready sleeps 100ms per iteration for up to 10 seconds, holding the Kanata mutex the entire time. Any other thread trying to acquire this lock (e.g., a TCP server command, a layer change request) will be blocked for up to 10 seconds at startup. The lock should be dropped before the waiting loop, or the waiting logic should be refactored to not require the lock.


Design / API Issues

4. crate-type = ["rlib", "staticlib"] — needs justification

# Cargo.toml
crate-type = ["rlib", "staticlib"]

Adding staticlib produces a .a archive at every build. This can noticeably increase build times and output size, and it affects how downstream crates can link the library. If this is for a specific FFI use case (e.g., the macOS DriverKit integration or an upcoming C FFI), that should be documented here. If it's not needed for any current functionality, it should be removed.

5. Sim test device syntax supports only single-digit indices

// src/tests/sim_tests/mod.rs
dev_press
    if dev_press.len() == 2
        && dev_press.starts_with('d')
        && dev_press.as_bytes()[1].is_ascii_digit() =>

The dN:key / uN:key syntax is limited to devices 0–9 (single ASCII digit). Since device_index is a u8 supporting 0–255, and the protocol uses numeric indices, a sim test trying to exercise device 12 would silently fall through to an unrecognized-command error. Consider a format like d12:key or at minimum document the 0–9 limitation clearly.

6. device_index: 0 default in TryFrom<InputEvent> for KeyEvent (linux.rs)

// src/oskbd/linux.rs
impl TryFrom<InputEvent> for KeyEvent {
    fn try_from(item: InputEvent) -> Result<Self, Self::Err> {
        // ...
        Ok(Self {
            code: ...,
            value: ...,
            device_index: 0,  // always 0; caller must call .with_device()
        })
    }
}

The TryFrom impl always sets device_index: 0 and relies on the caller using .with_device(). This is invisible to callers and easy to omit — a new platform port or test helper that forgets .with_device() will silently default to device 0. One alternative: remove device_index from TryFrom and require it to be set explicitly at the call site (.with_device() only, no silent default). Or at minimum, add a #[must_use] to with_device.


Code Quality

7. Dead code DeviceInfo struct

/// Currently unused — scaffolding for device name-based matching (see follow-up issue #6).
#[derive(Debug, Clone)]
#[allow(dead_code)]
pub struct DeviceInfo { ... }

Shipping #[allow(dead_code)] scaffolding in a PR introduces noise and lint suppression. Either open the follow-up issue and remove this, or land it in the follow-up PR where it will be used. The device_info() method has the same issue.

8. device_info() returns unordered results

pub fn device_info(&self) -> Vec<DeviceInfo> {
    self.devices.values().map(...).collect()
}

HashMap::values() iterates in arbitrary order. If any future caller expects devices to be sorted by index (the natural expectation), this will silently produce non-deterministic output. Consider returning a BTreeMap keyed by index, or sorting before returning.

9. release_tracked_output_keys only tracks write_key, not write_raw

In src/oskbd/macos.rs, record_output_transition_after_write is only called from write_key. write_raw (which writes by raw u16 code) also presses and releases keys but doesn't update output_pressed_since. After a recovery, keys pressed via write_raw won't be released, potentially leaving stuck keys.


Minor / Style

10. OpCodeType::Device(u16) stores a u16, but validates as u8

The encoding stores device_idx as u16 but then casts back with device_idx as u8 in evaluate_boolean. This is correct since new_device takes a u8, but the OpCodeType::Device(u16) type is misleading. Consider OpCodeType::Device(u8) for clarity, since the value can never exceed 255.

11. parse_tap_hold_options vs direct parsing in defhands.rs

The new parse_tap_hold_options helper is used consistently in tap_hold.rs, but defhands.rs calls parse_require_prior_idle_option directly and manages its own seen_options set. This inconsistency means any future tap-hold options added to parse_tap_hold_options must also be added manually to defhands.rs. Consider refactoring defhands.rs to delegate to parse_tap_hold_options for the trailing-options section.


What's Well Done

  • The current_device: Option<u8> approach (instead of adding device to keyberon's Event row) correctly preserves the row 0/1 real/virtual key invariant — good architectural decision.
  • The OpCode::new_device encoding is consistent with the existing new_layer / new_base_layer pattern.
  • Test coverage for the OpCode encoding roundtrip, evaluate_boolean with And/Not composition, parser tests, and sim tests is solid.
  • The count_trailing_options approach for backward-scanning optional parameters is clean and doesn't break existing positional-parameter parsing.
  • Cross-platform stub (device_index: 0) strategy is the right call for an incremental rollout.
  • The macOS DriverKit recovery refactor (output_ready / wait_until_ready abstraction) is a worthwhile cleanup, though the mutex concern above should be addressed.

🤖 Generated with Claude Code

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant