Releases: LocalKinAI/kinbrowser
v0.2.1 — bumped timeouts + anti-bot wall detection
Three production failures from a real KinClaw session, three small fixes.
Bugs
| Site | Failure | Fix |
|---|---|---|
| CNN | L3 chromedp 20s timeout too short | Bumped to 30s |
| weather.com | L1 5s timeout too aggressive | Bumped to 10s |
| Google search | Bot wall returned as "content" | Detect + escalate + clean-fail |
Worst-case full escalation now ~50s (10+10+30), still under CLI 90s ceiling.
Anti-bot wall detection
Added 11 patterns to `jsRequiredStubs` so kinbrowser can recognize challenge pages (HTTP 200 + prose) and escalate / fail rather than feeding the LLM the challenge text as content.
Patterns include Google's "unusual traffic", Cloudflare's "checking your browser" / "verifying you are human", generic "access denied" / "rate limit exceeded" / "too many requests" / "captcha" / "sorry, you have been blocked".
Final L3 acceptance check now also rejects bot walls after chromedp — Google fingerprints headless Chrome, so even L3 can return the wall. Better to fail clean with hint than feed LLM a challenge page.
Verified
```bash
$ kinbrowser open "https://www.google.com/search?q=test\"
kinbrowser: all 3 backends returned stub/anti-bot wall for ...
(URL likely blocks automation; try a different source, or use
web_search for search queries)
```
Tests: 18/18 pass.
Lesson
Anti-bot defense is asymmetric. Google specifically fingerprints chromedp. No amount of UA spoofing fully fools them. Best response: detect + surface, not feed.
🤖 Generated with Claude Code
v0.2.0 — six audit fixes (PDF / Chrome profile / canon / timeouts / daemon CLI / diff)
Six audit issues addressed in one tag. User asked "what else needs improvement?" after v0.1.2 — turned into a P0/P1 sweep, all in one tag for narrative.
🔴 P0 — Production blockers
1. PDF support
`arxiv.org/pdf/...` is the swarm's #1 reading source; was useless before. Two backends:
- `pdftotext` (poppler) when on PATH — proper word spacing, multi-column
- `github.com/ledongthuc/pdf` pure-Go fallback (with word-boundary regex)
```
$ kinbrowser open https://arxiv.org/pdf/1706.03762
[kinbrowser] L1 http | 1706.03762 | 61878 chars | 1.05s
PDF source — 19 pages from ... (extracted via pdftotext)
Attention Is All You Need ...
```
Install poppler for best quality: `brew install poppler`
2. Chrome profile persistence (L3)
`chromedp.UserDataDir(~/.kinbrowser/chrome-profile)` — cookies + login state survive across kinbrowser invocations. X authenticated views / Substack paid / LinkedIn etc. work on revisit. Per-agent isolation via `WithChromeProfile(dir)` or `$KINBROWSER_CHROME_PROFILE` env.
3. URL canonicalization
Strip 17 known tracking params (`utm_*`, `fbclid`, `gclid`, `mc_eid`, `_ga`, `ref`, `igshid`, `si`, `spm`, `yclid`, ...) + drop fragments + normalize ports/case/trailing-slash. Cache hit rate improves on agents reading the same article from multiple referrer-tagged links.
🟡 P1 — Quality of life
4. Per-layer timeout
`cdpTimeout` split into:
- `httpTimeout` 5s (fast fail)
- `lightpandaTimeout` 10s
- `chromedpTimeout` 20s
Worst-case full escalation now ~35s instead of unbounded.
5. `kinbrowser daemon` subcommands
```
kinbrowser daemon status # pid, uptime, CDP URL
kinbrowser daemon start # explicit start
kinbrowser daemon stop # kill spawned lightpanda
```
6. Diff-on-revisit
Closes a v0.1.0 design promise. When archive hits, fetch fresh + reconcile:
- ≥85% line similarity → `[UNCHANGED since archive]` note (LLM skips re-read)
- Drift → `[CHANGED]` + unified diff block + new content
- Rewrite → `[CHANGED]` warning + fresh content
Stats
- +700 LoC since v0.1.2 (3 new files + 4 modified)
- 16/16 tests pass (was 8/8)
- Total package: ~1,939 LoC + tests
Install
```bash
go install github.com/LocalKinAI/kinbrowser/cmd/kinbrowser@v0.2.0
Recommended deps:
brew install lightpanda-io/browser/lightpanda # L2 SPA speedup
brew install poppler # PDF quality
```
🤖 Generated with Claude Code
v0.1.2 — auto-detect + auto-spawn Lightpanda
v0.1.1 required `--lightpanda ws://...` flag AND manually running `lightpanda serve`. v0.1.2 makes Layer 2 zero-ceremony.
Behavior
When Open() needs to escalate from L1, kinbrowser now:
- Checks `$KINBROWSER_LIGHTPANDA_URL` env (caller-managed daemon)
- Probes `127.0.0.1:9222` for existing daemon
- If `lightpanda` binary on PATH, auto-spawns `lightpanda serve --host 127.0.0.1 --port 9222`
- Otherwise falls through to L3 (no error, just slower)
Result: `kinbrowser open https://x.com\` with no flag = L2 fires automatically.
Setup (one-time)
```bash
brew install lightpanda-io/browser/lightpanda
```
If you skip this, kinbrowser still works via L1 → L3 chromedp (slower for SPA reads, otherwise identical).
Performance
| Layer | x.com | arxiv |
|---|---|---|
| L3 chromedp | ~6.7s | ~2.5s |
| L2 lightpanda | ~1.6s | ~1.6s |
~4× speedup on the L2-eligible portion of agent traffic.
Install
```bash
go install github.com/LocalKinAI/kinbrowser/cmd/kinbrowser@v0.1.2
brew install lightpanda-io/browser/lightpanda # optional, for L2 speedup
kinbrowser open https://x.com # L2 fires automatically
```
🤖 Generated with Claude Code
v0.1.1 — L1 stub detection + --force-layer
Fixes a real bug found within hours of v0.1.0 shipping: L1 was accepting JS-required stub pages as "good enough" content, never escalating to L3.
Bug
`https://x.com\` returns a 222-char "Something went wrong, but don't fret" stub at L1. Default `minMarkdown` threshold was 200, so L1 wrongly accepted it and never escalated. Same shape would fail on instagram.com, Cloudflare-challenged sites, create-react-app placeholders.
Fix
1. Stub pattern detection in `acceptable()` — known CSR-stub phrases force L1 rejection regardless of length:
```go
var jsRequiredStubs = []string{
"enable javascript",
"javascript is required",
"javascript is disabled",
"please enable cookies",
"something went wrong, but don", // x.com / twitter
"you need to enable javascript to run", // create-react-app default
...
}
```
2. `--force-layer N` flag for diagnostic testing. Bypasses caches + escalation; runs exactly one backend. Lets you prove each layer works in isolation.
Verification
```
$ kinbrowser open https://x.com
[kinbrowser] L3 chromedp | X. It's what's happening | 369 chars | 6.7s ✓
```
L1 stub correctly rejected → L3 chromedp fires → real content extracted.
```
$ kinbrowser open --force-layer 3 https://arxiv.org/abs/1706.03762
[kinbrowser] L3 chromedp | Attention Is All You Need | 1963 chars | 2.5s ✓
```
L3 proven standalone on a known-good URL.
Tests
8/8 pass (added 2: stub detection + force-layer pinning).
Install
```bash
go install github.com/LocalKinAI/kinbrowser/cmd/kinbrowser@v0.1.1
```
🤖 Generated with Claude Code
v0.1.0 — initial release
Markdown-native browser for AI agents. Chrome renders pixels you don't read; kinbrowser extracts markdown you do.
Three escalating backends, one markdown output
```
L1 http.Get + go-readability + html-to-markdown ~100ms ~80% of sites
L2 Lightpanda over CDP (JS exec, no rendering) ~500ms +15% (opt-in)
L3 chromedp / full Chrome via CDP ~2s +5%
```
Same markdown shape regardless of which layer ran — agents see one contract, backends are transparent escalation.
Verified on real URLs
| URL | Layer | Time | Output |
|---|---|---|---|
| arxiv 1706.03762 (Attention paper) | L1 | 183 ms | 1.9 KB clean abstract |
| github.com/lightpanda-io/browser README | L1 | 965 ms | 13.2 KB |
| Hacker News front page | L1 | 267 ms | 15.8 KB / 30 posts |
Install
```bash
go install github.com/LocalKinAI/kinbrowser/cmd/kinbrowser@v0.1.0
kinbrowser open https://arxiv.org/abs/1706.03762
```
Memory
- Session LRU for same-process revisit avoidance
- Optional KinBrain archive — `Archive(url)` writes to `~/.kinbrain/notes/` for persistent agent memory; `Open(url)` checks previously-archived URLs
KinBrain integration is opt-in — `Open()` never auto-writes to KinBrain (keeps your curated knowledge base free of transient agent reads).
🤖 Generated with Claude Code