A local-first database migration tool for PostgreSQL (including managed Postgres such as AWS RDS). Connect a source and destination, compare their schemas, map datatypes, move the data, and verify parity. The connector and introspection layers are pluggable — Postgres is supported today, and additional engines (MySQL is next) slot in without touching the rest of the app.
It runs on your laptop, never sends your database credentials anywhere, and lets you:
- Connect to source + destination databases (Postgres today; more engines being added).
- Introspect each schema (tables, columns, datatypes, defaults, FKs, indexes, RLS policies, extensions).
- Compare source vs. destination side-by-side at the schema level — with per-schema scoping.
- Map datatypes from source to destination, with sensible defaults and per-column overrides.
- Migrate the data via streaming binary `COPY` with per-table conflict modes.
- Verify parity with row counts + hash sampling + sequence checks (also runs as a standalone tool).
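To make the parity step concrete, here is a minimal sketch of a row-count plus hash-sampling comparison. It is not DataMETL's implementation: the function names, the in-memory `dict` inputs (primary key to row tuple), and the `every_nth` sampling parameter are all assumptions made for illustration, and sequence checks are left out.

```python
import hashlib
from typing import Hashable, Mapping, Sequence

Row = Sequence[object]


def row_digest(row: Row) -> str:
    """Return a stable SHA-256 digest of one row."""
    # repr() keeps None, 1 and "1" distinct. A real tool would normalise
    # values per datatype (timestamps, numerics, floats) before hashing.
    payload = "\x1f".join(repr(value) for value in row)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def in_sample(key: Hashable, every_nth: int) -> bool:
    """Deterministically select roughly 1 in `every_nth` keys."""
    digest = hashlib.sha256(repr(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % every_nth == 0


def check_parity(
    source: Mapping[Hashable, Row],
    dest: Mapping[Hashable, Row],
    every_nth: int = 10,
) -> dict:
    """Compare row counts, then compare digests for a sample of source keys."""
    sampled = [key for key in source if in_sample(key, every_nth)]
    mismatched = [
        key
        for key in sampled
        if key not in dest or row_digest(source[key]) != row_digest(dest[key])
    ]
    return {
        "source_rows": len(source),
        "dest_rows": len(dest),
        "counts_match": len(source) == len(dest),
        "sampled": len(sampled),
        "mismatched_keys": mismatched,
    }
```

Sampling by hashing the key, rather than drawing random rows, means both sides pick the same rows on every run, so a reported mismatch can be reproduced. Passing `every_nth=1` compares every row.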
If you just want to run DataMETL, not develop it, this is the path:

    curl -fsSL https://github.com/sbcsp/datametl/releases/latest/download/install.sh | bash

The installer:

- Verifies Docker is installed
- Creates a `datametl/` directory in your CWD
- Downloads the latest release's `docker-compose.yml` + `env.example`
- Generates a fresh Fernet encryption key into `.env`
- Drops a small `Makefile` so day-to-day commands match the dev workflow
- Pulls the multi-arch images from `ghcr.io/sbcsp/datametl-{backend,frontend}`
- Brings the stack up
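For readers curious about the key-generation step, a Fernet key is simply 32 random bytes encoded as URL-safe base64, so it can be produced with the Python standard library alone. The sketch below is an approximation under that assumption; how `install.sh` and `make key` actually generate and store the key is not shown in this README, and `set_env_var` is a hypothetical helper. The variable name `ENCRYPTION_KEY` is the one the dev workflow uses in `.env`.

```python
import base64
import os
from pathlib import Path


def generate_fernet_key() -> str:
    """Return a new Fernet-compatible key (32 random bytes, URL-safe base64)."""
    return base64.urlsafe_b64encode(os.urandom(32)).decode("ascii")


def set_env_var(env_path: Path, name: str, value: str) -> None:
    """Set NAME=value in a dotenv-style file, replacing an existing entry."""
    # Hypothetical helper: keeps every other line of the file untouched.
    lines = env_path.read_text().splitlines() if env_path.exists() else []
    prefix = f"{name}="
    replaced = False
    for index, line in enumerate(lines):
        if line.startswith(prefix):
            lines[index] = prefix + value
            replaced = True
    if not replaced:
        lines.append(prefix + value)
    env_path.write_text("\n".join(lines) + "\n")
```

Calling `set_env_var(Path(".env"), "ENCRYPTION_KEY", generate_fernet_key())` would write a fresh key. Replacing the key after connections have been saved would presumably make previously encrypted values unreadable, so treat regeneration as a one-time setup step.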
Open http://localhost:3000. From the install directory:

    make down      # stop the stack (data persists in named volumes)
    make update    # pull latest images + restart
    make logs      # tail logs
    make help      # full target list

To install a specific version:

    DATAMETL_VERSION=v0.2.0 INSTALL_DIR=~/datametl bash <(curl -fsSL https://github.com/sbcsp/datametl/releases/download/v0.2.0/install.sh)

The deploy compose exposes only one host port (`FRONTEND_PORT`, default 3000). The frontend container proxies API calls to the backend container internally, so there is no CORS configuration and no port juggling.
- Backend: Python 3.12 + FastAPI, managed with uv
- Frontend: Next.js (App Router) + TypeScript + shadcn/ui + Tailwind + TanStack Query
- App metadata DB: Postgres 16
- Job queue: arq on Redis
- Everything runs in docker-compose; no host-level Python or Node is required.

To develop DataMETL, start the stack from source:

    git clone https://github.com/sbcsp/datametl.git && cd datametl
    cp .env.example .env
    make key           # generate a Fernet key, paste into ENCRYPTION_KEY in .env
    make up-samples    # app + sample source & destination Postgres databases
    make migrate       # run alembic

Then open:
- Frontend: http://localhost:3000
- Backend OpenAPI: http://localhost:8000/docs
Sample DB credentials are in .env.example (SAMPLE_SOURCE_PASSWORD, SAMPLE_DEST_PASSWORD). Connect to them from the UI as your "source" and "destination" connections.
Run make help for the full list. Highlights:
| Task | Command |
|---|---|
| Generate a Fernet key | make key |
| Start dev stack (always rebuilds) | make up |
| With sample DBs | make up-samples |
| Apply migrations | make migrate |
| Tail logs | make logs |
| Run backend tests | make test |
| Cut a release (publish to GHCR) | make release v=v0.2.1 |
DataMETL ships as multi-arch container images on GHCR plus a downloadable `docker-compose.yml` + `install.sh` attached to each GitHub release. To cut a release:

    make release v=v0.2.1

That tags v0.2.1, pushes the tag, and triggers the `.github/workflows/release.yml` workflow, which:

- Builds `linux/amd64` + `linux/arm64` images for backend + frontend
- Publishes them to `ghcr.io/sbcsp/datametl-{backend,frontend}:{v0.2.1,latest}`
- Stages the deploy compose with version pinned and attaches it to the GitHub release alongside `install.sh`
End users then run the one-liner above and pull the freshly-published images.
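The brace shorthand used for image names in this README stands for one reference per component and tag. As a small illustration, the sketch below expands a version into the full set of references; the function name and its default arguments are assumptions for this example and are not part of the release workflow.

```python
from itertools import product


def image_refs(
    version: str,
    registry: str = "ghcr.io/sbcsp",
    components: tuple = ("backend", "frontend"),
    extra_tags: tuple = ("latest",),
) -> list:
    """Expand a release version into every image reference it publishes."""
    tags = (version, *extra_tags)
    # One reference per (component, tag) pair, components first.
    return [
        f"{registry}/datametl-{component}:{tag}"
        for component, tag in product(components, tags)
    ]
```

For `v0.2.1` this yields four references: the versioned and `latest` tags for the backend image, followed by the same two tags for the frontend image.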
See CLAUDE.md for the architectural overview that future Claude Code sessions use.
