S3 compatible object storage in a single binary. Stuff it in, pull it out later.
Hamster is a self hosted, S3 compatible object store built around one idea: object storage should be simple to run and safe with your data, without forcing you into a heavyweight distributed system or a restrictive license.
Status: early development (v0). Not production ready. The design is settled and the core is being built in the open. Please do not trust real data to Hamster yet. Star or watch the repo to follow progress toward v1.
The self hosted S3 landscape shifted in 2026 when MinIO archived its community edition and steered users toward a commercial product. The open source stores that remain are good software, but they cluster at two ends: feature-rich systems that bring real operational weight, and admirably simple ones that leave out the features regulated data can't live without.
Hamster aims for the missing middle:
- The simplicity of a single binary you can run anywhere.
- The durability of erasure coding, so storage stays cheap without giving up safety.
- The compliance features that simpler stores skip: versioning, object lock, and WORM retention — the controls that retention and audit regimes (think HIPAA or SEC 17a-4) actually ask for.
- A permissive Apache 2.0 license, so you can build on it without legal friction.
- Single binary, no external dependencies. Run it on a laptop, a VPS, or a cluster. No ZooKeeper, no etcd, no separate database to operate.
- S3 compatible. Works with existing S3 SDKs, CLIs, and tools.
- Durable by default. Reed Solomon erasure coding spreads each object across independent failure domains, so you can lose drives or whole nodes without losing data.
- Grows smoothly. Partitioned placement rebalances as you add capacity. Add a node, and data redistributes without reshaping the cluster.
- Safe to upgrade. Versioned on disk and on wire formats with a backwards compatible upgrade path and zero downtime rolling upgrades, validated by end to end upgrade tests.
- Trustworthy. Durability and consistency are exercised under a deterministic simulation harness that injects partitions, disk failures, and reordering, so correctness is tested rather than hoped for.
High level and honest: a check mark means shipped and tested, not promised. Rows without one are the roadmap's plan and may shift as the code pushes back.
| Version | Features | Status |
|---|---|---|
| v0.1 | Single-node store: core S3 API, SigV4 authentication, durable single-node storage | ✅ |
| v0.2 | Clustering — Raft-replicated metadata, mTLS between nodes, token-based join | ✅ |
| v0.3 | Erasure-coded durability with self-healing repair, the S3 endpoint served from the cluster | ✅ |
| v0.4 | Partitioned placement and online rebalancing | 🚧 in progress |
| v0.5 | Object versioning | planned |
| v0.6 | Object lock and WORM retention (GOVERNANCE and COMPLIANCE modes) | planned |
| v0.7 | Encryption at rest (SSE-S3) and key/CA rotation | planned |
| v0.8 | Upgrade machinery: feature gates, health interlock, the upgrade test suite | planned |
| v0.9 | Zero-downtime rolling upgrades | planned |
| v0.10 | Observability/Telemetry | planned |
| v0.11 | Web console | planned |
| TBD | TBD prior to v1 | planning |
| v1.0 | Software updates and migrations supported from v1 | planned |
Grab a binary from the releases page (or go build ./cmd/hamster with Go installed — no cgo, no build tricks), then start the server. The HAMSTER_* variables define the credentials it will accept:
export HAMSTER_ACCESS_KEY_ID=hamster
export HAMSTER_SECRET_ACCESS_KEY=keep-this-one-secret
hamster serve -data-dir ./data
That is a standard S3 endpoint on 127.0.0.1:9000, so any S3 client works as is. The client sends its own credentials — the standard AWS_* variables, not the HAMSTER_* ones — set to the same values:
export AWS_ACCESS_KEY_ID=hamster
export AWS_SECRET_ACCESS_KEY=keep-this-one-secret
aws --endpoint-url http://127.0.0.1:9000 s3 mb s3://stash
aws --endpoint-url http://127.0.0.1:9000 s3 cp video.mp4 s3://stash/
aws s3, rclone, restic, and s3cmd work too — a compatibility suite runs all four against every change. hamster serve is a single durable node, the simplest way in; the cluster below now serves the same S3 API with objects erasure-coded across its nodes (v0.3). Still a dev preview either way — don't trust real data to it yet.
The cluster is Raft-replicated metadata (v0.2) plus an erasure-coded data path (v0.3): mutual TLS between nodes with zero TLS configuration, single-use join tokens, and — with -s3 — the full S3 API on every node, objects spread k+m across the cluster and reconstructed from any k. Writes commit on the Raft leader for now (a non-leader answers 503 and clients retry); multipart and server-side copy join this path in a later release.
Three terminals, all sharing the credentials the S3 API will accept:
# in every terminal — the keys each node's S3 endpoint accepts (-s3 requires them)
export HAMSTER_ACCESS_KEY_ID=hamster HAMSTER_SECRET_ACCESS_KEY=keep-this-one-secret
# terminal 1 — found the cluster, serve S3 on :9000
hamster cluster init -data-dir ./n1 -node n1 -listen 127.0.0.1:7946
hamster cluster run -data-dir ./n1 -s3 127.0.0.1:9000
# terminal 2 — mint a single-use token and join in one command, serve S3 on :9001
TOKEN=$(hamster cluster token -data-dir ./n1)
hamster cluster run -data-dir ./n2 -node n2 \
-listen 127.0.0.1:7956 -token "$TOKEN" -s3 127.0.0.1:9001
# terminal 3 — same again, serve S3 on :9002
TOKEN=$(hamster cluster token -data-dir ./n1)
hamster cluster run -data-dir ./n3 -node n3 \
-listen 127.0.0.1:7966 -token "$TOKEN" -s3 127.0.0.1:9002
Watch it: hamster cluster status -data-dir ./n1 shows every member and who leads. Nodes join as learners and are promoted to voters automatically (capped at five voters no matter how large the cluster grows). If a majority of voters is ever permanently lost, hamster cluster recover rebuilds a cluster from a survivor — read its warning first.
Now it stores objects, not just metadata. Point any S3 client at a node and the data is erasure-coded across all three:
export AWS_ACCESS_KEY_ID=hamster AWS_SECRET_ACCESS_KEY=keep-this-one-secret
aws --endpoint-url http://127.0.0.1:9000 s3 mb s3://vault
aws --endpoint-url http://127.0.0.1:9000 s3 cp video.mp4 s3://vault/
Kill a node and the object still reads — reconstructed from the survivors. Writes commit on the leader's node in v0.3, so if a node answers 503 SlowDown, retry against another.
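The retry advice above can be scripted. This is a minimal sketch, assuming the three preview nodes from this section. The helper names `try_endpoints` and `put_video` are hypothetical and not part of Hamster. The loop moves on after any failed attempt, which is coarser than matching 503 SlowDown specifically.

```shell
# try_endpoints CMD ENDPOINT...
# Runs CMD once per endpoint, in order, and stops at the first success.
# Prints the endpoint that accepted the request; returns 1 if none did.
try_endpoints() {
  cmd=$1
  shift
  for ep in "$@"; do
    # Send the client's own output to stderr so stdout carries only the endpoint.
    if "$cmd" "$ep" 1>&2; then
      printf '%s\n' "$ep"
      return 0
    fi
  done
  return 1
}

# Hypothetical wrapper around the upload shown in this section.
put_video() {
  aws --endpoint-url "$1" s3 cp video.mp4 s3://vault/
}

# Usage (not executed here):
#   try_endpoints put_video http://127.0.0.1:9000 http://127.0.0.1:9001 http://127.0.0.1:9002
```

Passing the command as a function name keeps the loop independent of the client, so the same helper would work with rclone or s3cmd wrappers.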
There are two ways to run Hamster today, and they are separate paths. Pick by what you are doing: to try Hamster or run a workload that fits on one machine, a single serve node is the fastest start; for anything you need to keep durable across machines — and certainly for compliance workloads that use versioning or object lock — start with a cluster, where mutual TLS is configured for you from the first node. A single serve node can be migrated into a cluster later (below), but with caveats around versioning and object lock that make starting clustered the better choice the moment those features matter.
A single node — the simplest thing. hamster serve is a standalone S3 endpoint backed by one node's disk (the Quick start above). No cluster machinery, no Raft, no inter-node TLS, no certificate authority — nothing to configure. It is the right choice for a laptop, a homelab box, or any workload that fits on one machine and does not need to scale out. Note it is genuinely not a cluster of one — for a single node on the path that can grow, use cluster init instead. The trade-off: its durability is one disk's durability, and it cannot become a cluster in place (see below).
hamster serve -data-dir ./data # one node, S3 on :9000
A cluster — durable across machines, and able to grow. hamster cluster init founds a cluster and mints the cluster CA once, automatically; every node you add reuses it (the Cluster preview above shows three). Objects are erasure-coded k+m across the nodes and reconstructed from any k, so you can lose drives or whole machines without losing data. A one-node cluster is a valid starting point — it runs Raft and elects itself leader (a quorum of one), serving S3 just like serve but on the path that scales and can admit peers later:
hamster cluster init -data-dir ./n1 -node n1 -listen 127.0.0.1:7946
hamster cluster run -data-dir ./n1 -s3 127.0.0.1:9000 # one-node cluster, S3 on :9000
# later: mint a token on n1, then `cluster run -token …` on n2, n3, …
Growing a cluster (one, two, or more nodes). Any deployment that is already a cluster grows the same way, with no data migration: mint a join token, run the new node with -token, and it joins as a learner and is promoted to voter automatically (up to a five-voter cap). You are adding a member, not moving data, so going from two nodes to three is just another join. (Two nodes is itself a cluster, and an awkward one — Raft needs both of two voters for a quorum, so it tolerates no failures; three is the first size that survives losing one. That is a reason to reach three, not a different way of getting there.) Moving existing objects from a single-node profile up to a wider k+m as the cluster gains capacity is part of the v0.4 placement work (in progress).
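The two-versus-three remark follows from majority-quorum arithmetic: a quorum is more than half the voters. A small sketch of that calculation, where `quorum` and `tolerated` are illustrative helper names rather than Hamster commands:

```shell
# quorum N: smallest majority of N voters.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# tolerated N: voters that can fail while a majority is still reachable.
tolerated() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 1 2 3 5; do
  echo "voters=$n quorum=$(quorum "$n") tolerated=$(tolerated "$n")"
done
# voters=1 quorum=1 tolerated=0
# voters=2 quorum=2 tolerated=0
# voters=3 quorum=2 tolerated=1
# voters=5 quorum=3 tolerated=2
```

At the five-voter cap the cluster keeps a quorum through two voter failures.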
Growing a single serve node into a cluster (the homelab path). This is the one growth with no in-place path, and it is meant for the person who started on a single serve node to try Hamster and now wants real durability across machines — not for compliance workloads, which should start clustered (above). serve stores single-node blobs while a cluster stores erasure-coded shards, so there is nothing to promote — instead you migrate the data. Stand up the new cluster alongside the old node, then move objects from one S3 endpoint to the other, deleting each from the source once it is durably on the destination: you never need room for two full copies, and an interrupted run resumes where it left off. Both ends speak S3, so for ordinary objects this works today with rclone move (or any S3 sync-then-delete) — no Hamster-specific tool required:
# old serve node and new cluster configured as two rclone remotes
rclone move s3old:bucket s3new:bucket # copies, then deletes each object from the source as it lands
Know plainly what a generic S3 copy does and does not carry:
- Current object data — yes. Every object's latest version moves intact.
- Version history — no (lands v0.5). Only the current version of each object is copied; older versions and delete markers are left behind on the source.
- Object-lock / WORM state — no (lands v0.6). Retention settings and legal holds do not transfer. And a COMPLIANCE-locked object cannot be deleted from the source until its retention expires — that lock has no override by design — so such data can only be copied and left in place, never moved.
The takeaway, stated plainly: if you keep versioned or locked data, do not plan to migrate — start with a cluster. This path is for evaluation and homelab growth, where the live set of objects is what matters and history is not. A native, lock- and version-aware migration tool is a possible future convenience, not a reason to defer starting clustered when those features matter.
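For the homelab migration, the two rclone remotes can be defined without a config file, because rclone reads remote settings from `RCLONE_CONFIG_<REMOTE>_<OPTION>` environment variables. This is a sketch under assumptions: the endpoints are placeholders (the old serve node on 127.0.0.1:9000, the new cluster on 127.0.0.1:9001), the credentials are the Quick start values, and `migrate_bucket` is a hypothetical helper.

```shell
# Old serve node, exposed to rclone as remote "s3old" (endpoint is an assumption).
export RCLONE_CONFIG_S3OLD_TYPE=s3
export RCLONE_CONFIG_S3OLD_PROVIDER=Other
export RCLONE_CONFIG_S3OLD_ENDPOINT=http://127.0.0.1:9000
export RCLONE_CONFIG_S3OLD_ACCESS_KEY_ID=hamster
export RCLONE_CONFIG_S3OLD_SECRET_ACCESS_KEY=keep-this-one-secret

# New cluster node, exposed as remote "s3new" (endpoint is an assumption).
export RCLONE_CONFIG_S3NEW_TYPE=s3
export RCLONE_CONFIG_S3NEW_PROVIDER=Other
export RCLONE_CONFIG_S3NEW_ENDPOINT=http://127.0.0.1:9001
export RCLONE_CONFIG_S3NEW_ACCESS_KEY_ID=hamster
export RCLONE_CONFIG_S3NEW_SECRET_ACCESS_KEY=keep-this-one-secret

# migrate_bucket BUCKET: preview the move first, then run it for real.
# Only current object versions travel; see the caveats listed above.
migrate_bucket() {
  rclone move --dry-run "s3old:$1" "s3new:$1" &&
    rclone move "s3old:$1" "s3new:$1"
}

# Usage (not executed here):
#   migrate_bucket bucket
```

The dry run lists what would move without deleting anything from the source, which is a cheap check before the real move starts removing objects.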
- v0.x — core PUT and GET, erasure coding with repair, partitioned placement, versioning, object lock, the simulation harness, and the upgrade test suite. On disk and on wire formats may change between v0 releases.
- v1.0 — stable formats with a compatibility promise, zero downtime rolling upgrades, and the web console.
- Glossary — the vocabulary (object, version, shard, stripe, partition, node, cluster, layout, …), grouped by layer. Start here if a term is unfamiliar.
- Architecture — the system design narrative: request paths, metadata/data separation, erasure coding, placement, upgrades.
- Architecture Decision Records — one decision per file, with the reasoning and the rejected alternatives.
- Roadmap — the v0.x and v1.0 milestones.
Early, but contributions are welcome. Hamster is Apache 2.0 licensed, and contributions are accepted under a Developer Certificate of Origin (DCO). Sign your commits with git commit -s.
Apache License 2.0. See LICENSE.
High level only — details live in each release. On disk and on wire formats may change between v0 releases.
- v0.3 (June 2026) — Erasure coding and self-healing repair. Objects are erasure-coded into k+m self-describing shards spread across distinct nodes and reconstructed from any k; only the small metadata commit ever touches the Raft log. hamster cluster run -s3 serves the full S3 API from the cluster, with the write-ack rule enforced mechanically (all k+m durable on the healthy path, a hard floor of k+1, SlowDown below it). A repair sweep scrubs every shard against its replicated checksum and rebuilds missing or bit-rotted shards from any k verified survivors, without anyone reading the object first. The whole data path runs under the deterministic simulation harness, and a six-node e2e kills nodes mid-workload over real sockets. Leader-only writes; multipart and server-side copy land on this path later.
- v0.2 (June 2026) — Clustering foundations. The Raft-replicated metadata plane as a runnable preview: hamster cluster (init, token, join, run, status, recover), mutual TLS between nodes with no plaintext mode and zero TLS configuration, single-use CA-pinned join tokens, automatic learner-to-voter promotion under a five-voter cap, crash-safe log compaction with streamed snapshot catch-up, and disaster recovery from a surviving node. Deterministic election timing makes the whole consensus layer simulation-testable, and an e2e suite drives the real binary through the full lifecycle. S3 serving stays single-node until the data path replicates (v0.3).
- v0.1 (June 2026) — The single-node store. Core S3 API: objects, listings, multipart uploads, server-side copies, batch deletes, presigned URLs; full SigV4 authentication including aws-chunked streaming; path-style and virtual-hosted addressing; MD5 ETags, exactly like S3. Uploads stream through the write buffer (a 1 GiB PUT needs ~12 MB of server memory). Durable single-node storage: BadgerDB metadata, versioned protobuf formats with golden-pinned encodings. Verified by a third-party client compatibility suite (aws CLI, rclone, restic, s3cmd) and a deterministic simulation harness that crash-tests the store against a reference model. Dev preview — single node, not production ready.
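The v0.3 write-ack rule can be read as a simple threshold. The sketch below is one interpretation, not Hamster's code: it assumes a write is acknowledged once at least k+1 of the k+m shards are durable and answers SlowDown below that floor. The names `write_ack` and `overhead_percent` are hypothetical, and k=4, m=2 are illustrative values rather than documented defaults.

```shell
# write_ack K M DURABLE: outcome of a PUT given how many shards are durable.
# Assumed reading of the rule: acknowledge at K+1 or more, SlowDown below it.
# M is accepted for symmetry; the floor itself depends only on K.
write_ack() {
  k=$1
  durable=$3
  if [ "$durable" -ge $(( k + 1 )) ]; then
    echo ack
  else
    echo SlowDown
  fi
}

# overhead_percent K M: raw bytes stored per 100 bytes of object data.
overhead_percent() {
  echo $(( 100 * ($1 + $2) / $1 ))
}

write_ack 4 2 6        # healthy path, all six shards durable: ack
write_ack 4 2 5        # exactly at the k+1 floor: ack
write_ack 4 2 4        # below the floor: SlowDown
overhead_percent 4 2   # 150
```

With these example values an object survives the loss of any two shards while costing one and a half times its size in raw storage.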