Black-box autonomous web app testing, driven by Claude Code.
Point Recon at a URL. It opens the site in Playwright, discovers core user flows (sign-up, login, search, forms, navigation, CRUD), executes them, and produces a self-contained HTML report plus a Notion-friendly Markdown summary — with screenshots, traces, console/network errors, and AI-written failure analysis.
Recon is one-shot per URL. It is not a CI runner, not a cron job, not a monitoring tool. Run it when you want a fresh black-box pass on a site.
Recon is packaged as a Claude Code skill. You invoke /recon <url> inside Claude Code and the model:
- Starts a local Playwright HTTP server (scripts/browse-server.mjs).
- Snapshots the landing page (DOM accessibility tree + screenshot) and plans 3–7 candidate flows.
- Executes each flow step-by-step, driving the browser via HTTP (/snapshot, /act, /goto).
- Writes structured JSON results to ~/recon-runs/<timestamp>-<host>/.
- Renders an HTML report (scripts/report.mjs) and a Markdown summary.
No separate LLM API key is required: the planning and acting loop runs inside Claude Code, so it's covered by your Claude subscription.
- macOS or Linux
- Node.js 20+
- Claude Code installed and signed in
- Playwright browsers (installed on first run, or run npx playwright install chromium)
git clone https://github.com/Gary956/recon.git ~/.claude/skills/recon
cd ~/.claude/skills/recon
npm install
npx playwright install chromium

The skill must live at ~/.claude/skills/recon/ so Claude Code can discover it.
In any Claude Code session:
/recon https://example.com
Optional arguments:
| Flag | Default | Purpose |
|---|---|---|
| --auth user@example.com:password | none | Credentials for login-gated flows |
| --headed | headless | Show the browser window |
| --max-flows N | 5 | Cap number of flows |
| --max-steps-per-flow N | 20 | Cap steps inside a single flow |
| --allow-destructive | off | Allow clicks on delete / checkout / pay buttons |
| --state path/to/storageState.json | none | Reuse a saved Playwright session |
Example with auth:
/recon https://app.example.com --auth me@example.com:hunter2 --max-flows 3
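
To make the flags and their defaults concrete, here is one way the argument string could be turned into an options object. It is a sketch under stated assumptions: the flag names and defaults come from the table above, but `parseReconArgs`, the option key names, and the error handling are hypothetical. In Recon itself the arguments are interpreted by the model following SKILL.md.

```javascript
// Defaults mirror the flag table; everything else here is illustrative.
const DEFAULTS = {
  auth: null,
  headed: false,
  maxFlows: 5,
  maxStepsPerFlow: 20,
  allowDestructive: false,
  state: null,
};

function parseReconArgs(argv) {
  const opts = { url: null, ...DEFAULTS };
  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    switch (arg) {
      case "--headed": opts.headed = true; break;
      case "--allow-destructive": opts.allowDestructive = true; break;
      case "--max-flows": opts.maxFlows = Number(argv[++i]); break;
      case "--max-steps-per-flow": opts.maxStepsPerFlow = Number(argv[++i]); break;
      case "--state": opts.state = argv[++i]; break;
      case "--auth": {
        // Split on the first colon only, so the password may contain ":".
        const raw = argv[++i] ?? "";
        const idx = raw.indexOf(":");
        if (idx < 1) throw new Error("--auth expects user:password");
        opts.auth = { user: raw.slice(0, idx), password: raw.slice(idx + 1) };
        break;
      }
      default:
        if (arg.startsWith("--")) throw new Error(`Unknown flag: ${arg}`);
        opts.url = arg; // the only positional argument is the target URL
    }
  }
  if (!opts.url) throw new Error("A target URL is required");
  return opts;
}

console.log(parseReconArgs([
  "https://app.example.com", "--auth", "me@example.com:hunter2", "--max-flows", "3",
]));
```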
Each run creates ~/recon-runs/<YYYY-MM-DD-HHMMSS>-<hostname>/ containing:
- report.html: self-contained, open in any browser
- REPORT.md: paste-into-Notion summary
- report-data.json: structured run data
- screenshots/00NN.png: one per step
- trace.zip: Playwright trace (open with npx playwright show-trace)
- video.webm: full session video
- plan.json, steps.json, initial-snapshot.json: raw artifacts
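
The naming scheme for the run directory and screenshots can be illustrated with two small helpers. These are a sketch only: `runDirName` and `screenshotName` are hypothetical names, and the use of local time (rather than UTC) and of a four-digit zero-padded step number are assumptions read off the patterns above.

```javascript
// Builds "<YYYY-MM-DD-HHMMSS>-<hostname>" for a target URL.
// Assumption: the timestamp uses local time.
function runDirName(targetUrl, now = new Date()) {
  const pad = (n) => String(n).padStart(2, "0");
  const date = [now.getFullYear(), pad(now.getMonth() + 1), pad(now.getDate())].join("-");
  const time = pad(now.getHours()) + pad(now.getMinutes()) + pad(now.getSeconds());
  return `${date}-${time}-${new URL(targetUrl).hostname}`;
}

// Builds a screenshot file name matching the 00NN.png pattern.
function screenshotName(stepNumber) {
  return String(stepNumber).padStart(4, "0") + ".png";
}

console.log(runDirName("https://app.example.com/login"));
console.log(screenshotName(7));
```

Using only the hostname keeps paths, ports, and query strings out of the directory name.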
Recon refuses to click anything matching the destructive-action blocklist (delete, remove, deactivate, pay, checkout, etc.) unless you pass --allow-destructive. It will not solve CAPTCHAs, will not paste real credentials anywhere besides login fill calls, and will not extend past the step / time budget.
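
A blocklist check of this kind might look like the sketch below. The five terms are the ones named above; the real list is longer (note the "etc."), and the whole-word matching rule, along with the names `isDestructive` and `mayClick`, is an assumption rather than Recon's actual logic.

```javascript
// Terms taken from the paragraph above; the real blocklist has more entries.
const DESTRUCTIVE_TERMS = ["delete", "remove", "deactivate", "pay", "checkout"];

// True when any whole word in the label is on the blocklist.
function isDestructive(label) {
  const words = label.toLowerCase().split(/[^a-z]+/).filter(Boolean);
  return words.some((word) => DESTRUCTIVE_TERMS.includes(word));
}

// Mirrors the --allow-destructive switch: blocked unless explicitly allowed.
function mayClick(label, { allowDestructive = false } = {}) {
  return allowDestructive || !isDestructive(label);
}

console.log(mayClick("Delete account"));                             // false
console.log(mayClick("Delete account", { allowDestructive: true })); // true
console.log(mayClick("Save changes"));                               // true
```

Whole-word matching avoids flagging unrelated labels that merely contain a blocked substring, at the cost of missing variants such as "Payment"; a stricter matcher would add those variants to the list.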
- Requires Claude Code; there is no standalone CLI. The planning loop lives in SKILL.md and is executed by the model.
- Single-tab Playwright session. Multi-tab flows (OAuth popups, payment iframes from third-party origins) are not fully supported.
- Generates obviously-test data (recon+<random>@example.com, Recon Test, etc.). Don't run it against production accounts you can't clean up.
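
Because the generated data follows a recognizable pattern, it can be spotted during cleanup. The sketch below shows the idea; `makeTestIdentity` and `isReconTestEmail` are hypothetical helpers, and the format of the random suffix (lowercase letters and digits) is an assumption, since only the recon+<random>@example.com shape is stated above.

```javascript
// Produces data in the shape described above. The suffix format is assumed.
function makeTestIdentity(suffix = Math.random().toString(36).slice(2, 10)) {
  return { email: `recon+${suffix}@example.com`, name: "Recon Test" };
}

// Recognizes addresses that follow the recon+<random>@example.com pattern,
// for example when listing accounts to delete after a run.
function isReconTestEmail(email) {
  return /^recon\+[0-9a-z]+@example\.com$/i.test(email);
}

const identity = makeTestIdentity();
console.log(identity.email, isReconTestEmail(identity.email));
console.log(isReconTestEmail("me@example.com"));
```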
MIT