Skip to content

Releases: LocalKinAI/kinclaw

v1.17.0 — Linux Phase 2-5 + Windows Phase 6 (cross-platform port)

12 May 13:55

Choose a tag to compare

First cross-platform release. Linux Phase 2-5 + Windows Phase 6 ports both landed 2026-05-12. macOS remains the primary, tested daily-driver target; Linux and Windows are code-complete but unverified at runtime — see the help-wanted callout below.

🙏 Help wanted — Linux + Windows testers

The agent wrote 2,042 lines of Linux + Windows port code from API docs without hardware available. Builds are green for all 6 OS/arch combos in this release; runtime behaviour needs community validation.

  • 3-command smoke test (≤10 min): TESTING.md
  • Linux reports#1
  • Windows reports#2

Both ✅ and ❌ reports are useful.

What landed (this release)

Layer macOS Linux Windows Backend (non-mac)
screen claw ScreenCaptureKit Linux: grim / scrot / xrandr · Windows: PowerShell + System.Drawing
input claw CGEvent Linux: xdotool / ydotool · Windows: user32.dll P/Invoke + SendKeys
ui claw — window-level AX Linux: xdotool / wmctrl · Windows: user32.dll GetForegroundWindow
ui claw — a11y tree AX Linux: AT-SPI 2 via godbus · Windows: UI Automation 2.0
record claw ScreenCaptureKit Linux: ffmpeg (x11grab / pipewire) · Windows: ffmpeg (gdigrab)
location skill corelocationcli Linux: gdbus + geoclue2 · Windows: Windows.Devices.Geolocation (ipapi.co fallback)
cerebellum/*.sh library 16 cats (478 actions) 4 cats (linux-*) 4 cats (windows-*) shell + POSIX tools / PowerShell
pilot soul 24 skills 23 skills 23 skills

Daily-driver souls:

Binaries

Six static binaries (zero CGO, no runtime deps beyond ffmpeg for the record claw).

Platform Architecture File Size
macOS Apple Silicon (M-series) kinclaw-darwin-arm64 18 MB
macOS Intel kinclaw-darwin-amd64 18 MB
Linux x86_64 kinclaw-linux-amd64 17 MB
Linux ARM64 (Raspberry Pi 4/5, Apple/AWS aarch64) kinclaw-linux-arm64 16 MB
Windows x86_64 kinclaw-windows-amd64.exe 17 MB
Windows ARM64 (Surface Pro X, Snapdragon X) kinclaw-windows-arm64.exe 16 MB

Checksums in SHA256SUMS.

How to verify ✅ a download

# Linux / macOS
shasum -a 256 -c SHA256SUMS --ignore-missing

# Windows PowerShell
Get-FileHash kinclaw-windows-amd64.exe -Algorithm SHA256
# compare to SHA256SUMS line 5

Caveats — known degradations

These the maintainer already knows are imperfect (don't be surprised):

  • Linux app_open_clean skill — not enabled in pilot_linux.soul.md. macOS-specific welcome-modal dismissal pattern.
  • Windows key_down / key_up — degraded. SendKeys auto-releases. True modifier-hold needs raw SendInput; future work.
  • Bothkinclaw probe (AX-tree controllability scorer) is darwin-only on non-darwin it exits with a clear "not supported" message. AT-SPI 2 / UIA equivalents would be a future PR.

Diff since v1.16.0

v1.16.0...v1.17.0

8 commits, +2,200 lines, including:

  • 70bbd18 — Linux Phase 2-4: 4 claws + 4 cerebellum cats + Linux pilot
  • bfe5edf — pilot_linux parity (+5 cross-platform skills)
  • 9fc3c3c — location cross-platform (geoclue2 + Nominatim + IP fallback)
  • 65b07a0 — ui_linux AT-SPI 2 via godbus (Phase 5)
  • d7cee60 — Windows Phase 6: 4 claws + 4 cerebellum cats + Windows pilot + CI
  • 7d16d13 — TESTING.md + Help-wanted banner

Detailed file-by-file CHANGELOG: see CHANGELOG.md entries dated 2026-05-12.

v1.16.0 — paper #11: kinthink grep router (0 LLM tokens on hit path)

12 May 03:55

Choose a tag to compare

Companion release to paper #11Grep-Routed Agents: Bypassing the LLM Tax on Computer-Use Tasks.

End-to-end macbench v0.2 result with this stack: 182/379 (48.0%) in 76 min vs 30.4% / 107 min for the LLM-only baseline. Zero LLM tokens on the Layer-0 hit path (244 of 379 tasks). Web subcategory: 8/10 PASS at 750 ms avg — direct counter to OpenAI Codex Chrome Extension.

What's new

skills/kinthink/ (NEW) — 4-layer NL → cerebellum router (~175 LOC Bash)

Layer Job Median latency
0 Extract Fast path: cerebellum '…' hint from prompt ~6 ms
1 Tokenize + strip path/quote/file-extension literals ~3 ms
2 TF-IDF awk pass over 239-row (NL, cerebellum-call) index ~15 ms
3 Slot substitution (QUOTED / PATH / FILE classes) ~5 ms
4 Dispatch matched action to cerebellum 30 ms – 3 s

Hit-path total: 24 ms router + 30–500 ms cerebellum exec = 50–550 ms end-to-end, 0 LLM tokens.

skills/cerebellum/categories/web.sh (NEW) — 8 actions

Wraps the existing 5 web skills behind one cerebellum 'web …' namespace so the grep router can target them: fetch / fetch_js / screenshot / js / search / scrape / download / session_run.

Soul flags (NEW)

  • cerebellum.exit_on_ok: true — terminate chatLoop when a tool returns ok:. Saves 5–10 s per task.
  • cerebellum.grep_route: true — invoke kinthink BEFORE the chatLoop. On hit, execute the matched action and return without ever calling the LLM.

Both enabled in souls/macbench.soul.md.

Calendar cerebellum hardening (calendar 22% → 40%)

  • 4 new actions: confirm, wait_sync, switch_view, find_event_ymd
  • 3× retry loop on find_event_hhmm, find_event_ymd, find_events_with_summary to dodge iCloud cold-start race
  • iCloud sync sleeps bumped across 10 mutating actions

Wi-Fi safety guard

cerebellum settings toggle_wifi now refuses OFF requests. Defense in depth after a v0.1 run inadvertently disabled Wi-Fi mid-bench.

Try it

go install github.com/LocalKinAI/kinclaw/cmd/kinclaw@v1.16.0

kinclaw -soul souls/macbench.soul.md -exec "rename foo.txt to bar.txt"
# → 53 ms total, 0 LLM tokens

Read the paper

v1.15.0 — macbench integration: 67.3% on first benchmark

09 May 06:30

Choose a tag to compare

First public benchmark — kinclaw on macbench v0.1: 67.3% IMPLEMENTED.

This release wires kinclaw to macbench, the macOS-native computer-use benchmark released alongside this version.

Headline numbers

```
kinclaw v1.15.0 + Kimi-K2.5(cloud) on macbench v0.1
IMPLEMENTED: 101 / 150 = 67.3%
STRICT: 101 / 369 = 27.4%
```

For context, Anthropic Computer Use scores ~38% on OSWorld (Linux desktop). macbench measures a different surface, so the numbers aren't directly comparable, but the methodology + scoring discipline are the same.

Added

  • `souls/macbench.soul.md` — single-task benchmark soul. No memory, no spawn, no network, 8 skills (5 claws + file_read/write/edit + app_open_clean), temperature 0.1. `make bench` now uses this soul by default (was incorrectly using `pilot.soul.md` which caused cross-task memory pollution).
  • `scripts/warmup.sh` — six-probe pre-flight check: build + sign, codesign identifier, Accessibility TCC, Screen Recording TCC, brain reachability, sibling-kit availability.
  • `make warmup` target. `make bench` auto-runs warmup; `SKIP_WARMUP=1` bypasses for fast dev iteration.
  • `benchmarks/` catalog with feasibility matrix + designs for WebArena and OSWorld vision-only-mode adapters (designed only, not implemented).

Changed

`make bench` default soul changed from `pilot.soul.md` to `macbench.soul.md`. Behavioral change for direct `make bench` users: agent now operates in single-task mode without cross-task memory or spawn. KinClawMac.app daily Cowork pilot (`pilot.soul.md`) is unchanged.

Bugs caught + fixed during the multi-hour debugging session that produced 67.3%

Real (fixed in this release):

  1. Cross-task memory pollution when `make bench` used `pilot.soul.md`
  2. macbench runner v0 only ran eval if exec exited cleanly (fixed in macbench)
  3. Mid-run AppleScript app degradation — Notes/Calendar/Reminders hang after ~5-10 invocations even on warm-started apps (fixed by macbench runner per-task PID-snapshot isolation)

False positives (caught + dismissed):

  • "kinclaw is bad at Notes" — was actually `${VAR,,}` bash-4-only syntax in eval scripts (macOS still ships bash 3.2). Fixed in macbench by switching to `tr`.
  • "Safari TCC denied" — was a transient state, recoverable by `make warmup`. Documented in macbench README.

Known limitations carried into v1.16

  • sudo-requiring tasks (firewall, login window) cause an interactive Password: prompt that hangs the bench. macbench soul should add `sudo: false`.
  • Cross-brain comparison (Claude Sonnet 4.5, GPT-4o, etc.) requires a brain switcher in macbench soul that doesn't yet exist.

See `CHANGELOG.md` for the full release notes.