Shipyard

Autonomous code factory. Issues go in, PRs come out.

How It Works

Picks the next task file from tasks/
Routes it to the repo (local, GitHub, or a newly created one)
The agent (Claude by default) codes it, runs tests, commits, and opens a PR
Moves the task file to tasks/done/

Setup

Shipyard expects your repos to live in the parent directory:

projects/
  shipyard/        # this repo
  my-app/          # a repo shipyard can work on
  another-app/     # another repo

If a repo isn't local, Shipyard searches your GitHub account and clones it automatically.

To keep your repos somewhere other than the parent directory, set SHIPYARD_PROJECTS:

export SHIPYARD_PROJECTS="$HOME/code"
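
The lookup described above can be pictured with a small sketch. This is not taken from factory.sh; the function name resolve_projects_dir is hypothetical, and the only behavior assumed is what this section states: use SHIPYARD_PROJECTS when it is set, otherwise use the parent of the Shipyard checkout.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from factory.sh): print the directory that holds
# your repos. Prefers SHIPYARD_PROJECTS when it is set and non-empty;
# otherwise falls back to the parent of the given Shipyard checkout.
resolve_projects_dir() {
  local shipyard_dir="$1"
  if [ -n "${SHIPYARD_PROJECTS:-}" ]; then
    printf '%s\n' "$SHIPYARD_PROJECTS"
  else
    # Run cd in a subshell so the caller's working directory is untouched.
    (cd "$shipyard_dir/.." && pwd)
  fi
}
```

Calling resolve_projects_dir with the path to your Shipyard checkout prints the directory that would be searched for local repos under these assumptions.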

Usage

./factory.sh                              # run one task (Claude by default)
./factory.sh --dry-run                    # preview what it would pick
./factory.sh --parallel 3                 # run 3 tasks in parallel
./factory.sh --issues owner/repo          # pull GitHub issues into tasks/
./factory.sh --verify owner/repo          # re-verify all open PRs
./factory.sh --verify owner/repo 42       # re-verify a specific PR

SHIPYARD_AGENT=dotbot ./factory.sh        # use dotbot instead of Claude
SHIPYARD_AGENT=dotbot SHIPYARD_PROVIDER=anthropic ./factory.sh  # dotbot + Anthropic

Run in its own terminal — not inside another tool. Monitor progress in a second terminal:

tail -f logs/*.log

Cancel anytime with Ctrl+C — Shipyard cleans up the branch and returns to the default branch.

Task Format

Each task is a markdown file in tasks/. The filename becomes the task name. The file body is the full prompt sent to Claude — write as much or as little as you need.

Existing repo — add repo: in frontmatter:

---
repo: my-app
---

Add a dark mode toggle to the settings page. Should respect system
preference by default. Use the existing ThemeProvider context.

Screenshot verification — if agent-browser is installed and the project has a dev/start/preview script, Shipyard starts the dev server after shipping, reads the git diff to figure out which pages were affected, and uses Claude + agent-browser to take targeted screenshots of the changes. Screenshots are committed to the branch and commented on the PR.

New repo — omit repo: and Shipyard creates one (named from the filename):

Build a weather dashboard that shows 5-day forecast.
Use OpenWeatherMap API. Include a search bar for city lookup.
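
One way to tell the two cases apart (a repo: key in frontmatter versus no frontmatter at all) is sketched below. The function name task_repo is hypothetical and the parsing is deliberately minimal; how factory.sh actually reads frontmatter is not shown in this README.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from factory.sh): print the repo: value from a
# task file. Prints nothing when the file has no frontmatter or no repo: key,
# which corresponds to the "new repo" case.
task_repo() {
  awk '
    NR == 1 && $0 != "---" { exit }   # first line is not ---: no frontmatter
    NR == 1 { next }                  # skip the opening ---
    $0 == "---" { exit }              # closing ---: stop before the prompt body
    /^repo:/ {
      sub(/^repo:[[:space:]]*/, "")   # drop the key and any leading spaces
      sub(/[[:space:]]+$/, "")        # drop trailing whitespace
      print
      exit
    }
  ' "$1"
}
```

An empty result would mean the task targets a new repo; a non-empty result names the existing repo to route to. Only the frontmatter is scanned, so a line starting with repo: inside the prompt body is ignored.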

Tasks run in alphabetical order by filename. Prefix with numbers to control priority:

tasks/
  01-fix-auth.md           ← runs first
  02-add-dashboard.md
  03-refactor-api.md

Completed tasks move to tasks/done/.
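
The ordering rule can be illustrated with a short sketch. The function name next_task is hypothetical, and using find with sort is one possible approach rather than a description of factory.sh.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from factory.sh): print the path of the first
# *.md file directly inside a tasks directory, in byte-wise alphabetical order.
# -maxdepth 1 keeps files under tasks/done/ out of the queue.
next_task() {
  find "$1" -maxdepth 1 -type f -name '*.md' | LC_ALL=C sort | head -n 1
}
```

Because the sort is alphabetical rather than numeric, 10-foo.md sorts before 2-bar.md. Zero-padded prefixes such as 01, 02, and 10 keep the order predictable.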

GitHub Issues

Pull issues from any repo into your task queue:

./factory.sh --issues owner/repo

This fetches open issues labeled shipyard and creates task files from them. After the factory completes a task, it posts the PR link as a comment on the issue and closes the issue.
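
As a rough illustration of turning an issue into a task file, the sketch below writes a markdown file with repo: frontmatter from an issue's number, title, and body. Everything here is an assumption made for illustration: the function names slugify and issue_to_task, the issue-NUMBER-slug.md naming scheme, and the choice to put the title at the top of the prompt. The README does not specify how Shipyard names or lays out these files.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not from factory.sh). The file naming scheme and the
# file layout are assumptions chosen for illustration.

# Lowercase a title and collapse every run of non-alphanumerics to one hyphen.
slugify() {
  printf '%s' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -E 's/[^a-z0-9]+/-/g; s/^-+//; s/-+$//'
}

# Write a task file for one issue and print the path that was written.
# Arguments: tasks_dir, owner/repo, issue number, issue title, issue body.
issue_to_task() {
  local tasks_dir="$1" repo="$2" number="$3" title="$4" body="$5"
  local file
  file="$tasks_dir/issue-$number-$(slugify "$title").md"
  # ${repo##*/} strips the owner, leaving the bare repo name for frontmatter.
  printf -- '---\nrepo: %s\n---\n\n%s\n\n%s\n' "${repo##*/}" "$title" "$body" > "$file"
  printf '%s\n' "$file"
}
```

The issue data itself could come from the gh CLI, for example gh issue list --repo owner/repo --label shipyard --state open --json number,title,body, with each result passed to issue_to_task.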

Schedule

Shipyard is designed to run unattended. Point cron at it and your issues get solved while you sleep.

Every hour — process one task from the queue:

0 * * * * /path/to/shipyard/factory.sh >> /path/to/shipyard/shipyard.log 2>&1

Every hour — pull new GitHub issues, then process them:

0 * * * * /path/to/shipyard/factory.sh --issues owner/repo >> /path/to/shipyard/shipyard.log 2>&1

Nightly batch — run 5 tasks in parallel at 2am:

0 2 * * * /path/to/shipyard/factory.sh --parallel 5 >> /path/to/shipyard/shipyard.log 2>&1

Label a GitHub issue shipyard, go to bed, wake up to a PR with screenshots. That's the workflow.

Factory Features

Based on patterns from Ramp Inspect and Stripe Minions:

Task queue — tasks/ folder, one markdown file per task
Task routing — finds repo locally, clones from GitHub, or creates new
Branch isolation — agents work on feature branches, never the default branch
Autonomous coding — agent runs non-interactively (Claude or dotbot)
Test verification — run tests, fail fast if broken
PR creation — open a PR via gh CLI for every task
CI gate — auto-generates GitHub Actions workflow, watches CI, fixes failures
Task completion — move task file to tasks/done/
Visual verification — targeted screenshots of changes via agent-browser
Streaming output — real-time Claude session output via stream-json
Parallel execution — run multiple tasks concurrently with --parallel N
Logging — timestamped logs per run for debugging
Scheduling — cron or trigger to run without you

Configuration

A single file controls the factory: factory.md at the repo root. Think of it as a Dockerfile for code factories: all the standards an autonomous agent needs to ship code in a repo, in one file you can clone and run anywhere. See the factory.md spec for the full format.

factory.md has 8 reserved sections. Each section is a bullet list of rules.

| # | Section | Covers |
|---|---------|--------|
| 1 | `## style` | Formatting, naming, function size, imports, changelog hygiene |
| 2 | `## build` | Runtime, package manager, CI workflow, version bumping |
| 3 | `## testing` | Test framework, pass/fail gates, new-code test requirements |
| 4 | `## documentation` | Doc comments, README, AGENTS.md updates |
| 5 | `## environment` | Dev tools, branching rules, worktrees |
| 6 | `## quality` | File size, function size, TODO/FIXME, complexity |
| 7 | `## observability` | Logging, error reporting, tracing |
| 8 | `## security` | Hardcoded credentials, dangerous patterns, dependencies |

Every bullet is one rule. The framework reads each bullet and either:

Runs it as a gate if it recognizes the rule (e.g. "no secrets in diff", "tests pass")
Forwards it to the agent as an additional rule to honor if it doesn't recognize it

Prefix a bullet with ! to mark it strict — the framework must verify it deterministically or the pipeline fails. Use strict for security, correctness, and release-critical rules you refuse to trust a model on:

## security
- ! No hardcoded credentials
- ! No eval
- Dependency audit clean

Edit any section to match your preferences. factory.md is framework-agnostic — the same file can drive any autonomous agent pipeline, not just Shipyard.
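
To make the bullet format concrete, here is a minimal sketch of reading factory.md and classifying each rule as strict or normal. The function name list_rules and the tab-separated output are assumptions for illustration. The sketch only parses the file; it does not decide which rules the framework recognizes as gates.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from factory.sh): list every rule bullet in a
# factory.md-style file as "section<TAB>strict|normal<TAB>rule text".
list_rules() {
  awk '
    /^## / { section = substr($0, 4); next }   # remember the current section
    /^- / && section != "" {
      rule = substr($0, 3)                     # text after the "- " marker
      kind = "normal"
      if (rule ~ /^!/) {                       # leading ! marks a strict rule
        kind = "strict"
        sub(/^![[:space:]]*/, "", rule)
      }
      printf "%s\t%s\t%s\n", section, kind, rule
    }
  ' "$1"
}
```

Run against the security example above, this prints two strict rules and one normal rule, all tagged with the security section.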

Pipeline

Shipyard's pipeline is an implementation detail of factory.sh:

Pick the next task from tasks/
Route it to a repo (local, GitHub, or new)
Prepare a feature branch (worktree)
Scaffold a CI workflow if missing
Run the agent with every factory.md rule injected into the prompt
Dispatch every rule bullet through check_gate: recognized rules run as checks, and unrecognized rules are forwarded to the agent
Fix gate failures by re-engaging the agent (max 2 attempts)
Confirm the PR, watch CI, fix failures (max 2 attempts)
Screenshot affected pages via agent-browser
Move the task file to tasks/done/, close the issue, return to the default branch
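
The "fix failures (max 2 attempts)" steps follow a bounded check-and-fix loop. The sketch below shows that pattern in isolation. The function name check_with_fixes and its arguments are hypothetical, and the check and fix commands stand in for whatever gate and agent invocation factory.sh actually uses.

```shell
#!/usr/bin/env bash
# Hypothetical sketch (not from factory.sh): run a check command; each time
# it fails, run a fix command and check again. Gives up once max_fixes fix
# attempts have been made and the check still fails.
# Arguments: check command name, fix command name, max fix attempts (default 2).
check_with_fixes() {
  local check_cmd="$1" fix_cmd="$2" max_fixes="${3:-2}" attempt=0
  until "$check_cmd"; do
    if [ "$attempt" -ge "$max_fixes" ]; then
      return 1                    # out of attempts: report failure
    fi
    attempt=$((attempt + 1))
    "$fix_cmd" "$attempt"         # e.g. re-engage the agent with the failure
  done
  return 0                        # the check passed
}
```

With the default of 2, the check runs at most three times: once up front and once after each fix attempt.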

Requirements

Claude Code or dotbot
gh CLI (authenticated)
agent-browser (optional, for screenshot verification)

Why Shipyard

Shipyard does the same job as GitHub Copilot Coding Agent and Claude for GitHub: task in, PR out, automated. The difference is that it's a shell script you own.

What Shipyard has that they don't

Task queue with priority — file-based, numbered for order, not one-off prompts
Configurable standards and workflow — edit factory.md (a portable, framework-agnostic spec) to control exactly what the agent does
Screenshot verification — starts the dev server, reads the diff, screenshots the actual pages that changed
Runs locally — no data leaves your machine except API calls
Swappable agent — Claude Code or dotbot (any provider: xAI, Anthropic, OpenAI, Ollama)
GitHub issues integration — pull labeled issues into the queue, close them on completion
No vendor lock-in — swap Claude for another model, change the pipeline, fork it

What they have that Shipyard doesn't

Hosted infrastructure (no local machine needed)
Web UI
No setup

Who is Shipyard for

Developers who want to own their code factory. Same idea as self-hosting vs SaaS — you trade convenience for control.

Contributing

git clone https://github.com/stevederico/shipyard
cd shipyard

Edit factory.sh or factory.md. Open a PR.

License

MIT — see LICENSE.
