Skip to content

bacoco/ShipGuard

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

110 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

ShipGuard

ShipGuard β€” Agentic AI Audit & Visual Regression

Ship with confidence. ShipGuard finds bugs before your users do.

Three AI-powered modules. Use one, two, or all three. No test files to write.

πŸ“Έ Visual E2E Debugger

Auto-discover routes, generate tests, mark bugs on screenshots β€” AI traces to source code and fixes automatically.

/sg-visual-run

🎬 Macro Recorder

Record your browser interactions and turn them into replayable tests. Like Excel's macro recorder, but for visual testing.

/sg-record http://localhost:3000

πŸ” Code Audit

Parallel AI agents scan your entire codebase, find bugs, and fix them automatically. Race conditions, auth gaps, silent exceptions, resource leaks.

/sg-code-audit

Install

claude plugin add bacoco/shipguard
npm install -g agent-browser && agent-browser install --with-deps

Smart Annotations Code Audit Dashboard

⚠️ Token Usage β€” Code audits are token-intensive. standard (10 agents) β‰ˆ 2M tokens. deep (15 agents, 2 rounds) β‰ˆ 5M+. paranoid (20 agents, 3 rounds) can exceed 10M.


Visual E2E Debugger

Mark bugs directly on screenshots. The AI traces each annotation to source code and fixes it.

Visual Tests β€” Screenshot Grid

/sg-visual-run I changed the sidebar

Commands

Command What it does
/sg-visual-discover Scan codebase, generate YAML test manifests per route
/sg-record <url> Record browser interactions as replayable test manifests
/sg-visual-run [what] Execute manifests β€” natural language or flags
/sg-visual-review Launch interactive screenshot review dashboard
/sg-visual-fix Auto-fix bugs annotated in the review dashboard
/sg-visual-review-stop Stop the review server

Smart Annotations (Gemini-style)

The review dashboard uses draggable annotation cards to mark visual bugs on screenshots. Click anywhere on a screenshot to place a pin, then describe the problem.

How it works:

  1. Open a screenshot in the lightbox
  2. Double-click anywhere on the image β€” a pin appears instantly (or click + Add Note first)
  3. Click = point pin. Drag = rectangle zone selection (highlights the problem area)
  4. Choose severity + type your note β†’ a card appears connected to the pin
  5. Drag the pin to reposition β€” zone, card, and leader line all move together
  6. Drag the card separately to reposition just the label
  7. Double-click a card to edit text/severity, click X to delete
  8. Click Validate & Generate Report when done β†’ produces fix-manifest.json with zone coordinates
  9. Run /sg-visual-fix β†’ AI reads your annotations + zone coords, traces to source code, fixes automatically

Severity colors:

Color Level When to use
πŸ”΄ Red Critical Broken layout, missing content, crash
🟠 Orange High Wrong alignment, color mismatch, bad spacing
πŸ”΅ Blue Medium Minor visual inconsistency, polish needed
βšͺ Gray Info Suggestion, not a bug

Smart Annotations β€” Draggable Cards

sg-visual-run options

/sg-visual-run                                  # Interactive β€” choose scope
/sg-visual-run I changed the sidebar, check it  # Natural language
/sg-visual-run --from-audit                     # Test audit-impacted routes
/sg-visual-run --regressions                    # Re-run previously failed tests
/sg-visual-run --all                            # Full suite

--from-audit reads impacted_routes from audit-results.json β€” a natural bridge between the two features.

Discover options

/sg-visual-discover                    # Current project
/sg-visual-discover --all              # Full discovery
/sg-visual-discover --refresh-existing # Regenerate existing manifests

Supports Next.js (App Router & Pages Router), React Router, Vue, Angular.


Macro Recorder

Record what you do in the browser and turn it into a replayable test. Like Excel's macro recorder, but for visual testing.

Recorded Tests β€” Test Library

/sg-record http://localhost:3000/dashboard --name my-test

How it works

  1. Launch β€” Opens a Playwright browser with a floating toolbar
  2. Navigate β€” Browse your app normally. Clicks, inputs, uploads are captured automatically
  3. Check β€” Click the Check button, then click an element to mark it as an assertion
  4. Undo / Delete / Pause β€” Fix mistakes without restarting
  5. Stop β€” Saves a YAML manifest ready for /sg-visual-run

Test Library

Recorded tests appear as cards in the review dashboard under the Recorded Tests tab.

Recorded Tests β€” Selected for Run

Select the tests you want to run, click Run β€” the command is ready to copy.

Recorded Tests β€” Run Command

Two ways to create tests

/sg-visual-discover /sg-record
Source AI scans your code Human records interactions
When After code changes After manual QA, bug reproduction, new feature walkthrough
Output Same YAML format Same YAML format

Both feed into the same pipeline: sg-visual-run executes them, sg-visual-review shows results, sg-visual-fix fixes failures.

Options

/sg-record http://localhost:3000                              # Interactive β€” asks for name on stop
/sg-record http://localhost:3000 --name login-flow            # Preset name
/sg-record http://localhost:3000 --storage auth.json          # Skip login (reuse saved auth)
/sg-record http://localhost:3000 --save-storage auth.json     # Save auth for future recordings

Code Audit

Dispatch parallel AI agents to audit your entire codebase. Each agent reviews a non-overlapping zone, finds bugs, fixes them, and produces structured JSON. Watch progress in real-time on the Mission Control dashboard.

/sg-code-audit deep

Modes

Mode Agents Rounds Coverage
quick 5 1 Surface scan
standard 10 1 Full codebase (default)
deep 15 2 Surface + runtime behavior
paranoid 20 3 Surface + runtime + edge cases & security

Multi-round depth

  • R1 β€” Null refs, missing guards, type mismatches
  • R2 β€” Race conditions, async pitfalls, state management
  • R3 β€” Edge cases, injection, auth bypass, data leaks

Smart Scope

By default, ShipGuard detects what changed and asks whether to limit the audit:

/sg-code-audit       # "12 files changed since main. Audit only what changed?"

Override with flags:

Flag Effect
--all Force full scope, skip the question
--diff=<ref> Use a specific base reference
--focus=path/ Restrict to a directory
--report-only Find bugs but do not fix them

Flags combine freely: /sg-code-audit deep --focus=src/ --report-only

Live Dashboard

At startup, the audit offers to open the Mission Control dashboard. The Code Audit tab shows real-time agent pods (running/done/pending), severity heatmap, bug table filterable by severity and free-text search. Polls every 3s during active audit.

Code Audit β€” Bugs filtered by Critical

Output

Results are written to audit-results.json:

  • summary β€” totals by severity and category
  • bugs[] β€” file, line, severity, description, fix status
  • impacted_routes[] β€” UI routes affected (consumed by /sg-visual-run --from-audit)

Supported languages

Python, TypeScript/React, Next.js, Infrastructure (Docker/YAML/CI), Go, Rust, JVM.


Compatibility

Built for Claude Code. Partial support for other AI CLIs:

Feature Claude Code Codex CLI / Gemini CLI
Code Audit (parallel) βœ… Full ❌ Requires Agent tool
Visual E2E Debugger βœ… Full βœ… agent-browser is CLI-independent
Macro Recorder βœ… Full βœ… Playwright is CLI-independent
Review Dashboard βœ… Full βœ… Pure Node.js
Visual Discover/Fix βœ… Full βœ… Bash + LLM prompts

The visual testing pipeline works with any AI CLI that can run shell commands and read/write files. Code audit parallelization requires Claude Code's Agent tool with worktree isolation.

Community adapters welcome.


Quick Start

# Install
claude plugin add bacoco/shipguard
npm install -g agent-browser && agent-browser install --with-deps

# Audit your code
/sg-code-audit

# Record a test manually
/sg-record http://localhost:3000

# Run all tests
/sg-visual-run

Configuration

Create visual-tests/_config.yaml:

base_url: "http://localhost:3000"
credentials:
  username: "testuser"
  password: "testpass"
build_command: "docker compose up -d --build frontend"  # optional

License

MIT

About

Claude Code skills for automated E2E testing with agent-browser (Playwright CLI). Discover routes, generate YAML test manifests, execute with hybrid LLM assertions, track regressions.

Topics

Resources

Stars

Watchers

Forks

Packages

 
 
 

Contributors