From 0f1a71c959ceb2e734f6e0dfbd7c3b10d95b1050 Mon Sep 17 00:00:00 2001
From: Albert Bausili
Date: Mon, 11 May 2026 00:14:41 +0200
Subject: [PATCH] =?UTF-8?q?docs:=20archive=20notice=20=E2=80=94=20work=20m?=
 =?UTF-8?q?oved=20to=20goceleris/probatorium?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Wave 10 of the celeris v1.4.3 cascade: this repo is superseded by
goceleris/probatorium which adds the validation tier, cluster
orchestration, per-arch matrix, and the publish-to-docs cascade on top
of everything benchmarks/ used to do.

The README now points users at probatorium and lists where each
component moved. The v4.0 result JSONs published from this repo remain
readable by probatorium's v5.0 parser (additive bump).

Followup: the repo itself gets archived on the GitHub side after this
lands.
---
 README.md | 193 +++++++++----------------------------------------------
 1 file changed, 31 insertions(+), 162 deletions(-)

diff --git a/README.md b/README.md
index 1040f61..a583eb9 100644
--- a/README.md
+++ b/README.md
@@ -1,174 +1,43 @@
-# Celeris Benchmarks
+# Celeris Benchmarks — archived
 
-Reproducible HTTP server benchmarks on dedicated bare-metal hardware with 10GbE point-to-point networking. Compares production Go frameworks against theoretical maximum performance using raw Linux syscalls.
+This repository is **archived as of celeris v1.4.3**. All benchmark + validation work has moved to:
 
-## Why This Exists
+**[github.com/goceleris/probatorium](https://github.com/goceleris/probatorium)**
 
-Most HTTP benchmarks run on shared VMs with noisy neighbors, variable network hops, and throttled I/O — making results unreliable and non-reproducible. This suite runs on dedicated bare-metal machines with direct 10GbE links, automated kernel tuning, and CPU pinning, so every release gets consistent, comparable numbers.
+## Why the move
 
-We measure three categories of servers:
-- **Baseline**: Production Go frameworks (Gin, Fiber, Echo, Chi, Iris, Hertz, FastHTTP, stdlib)
-- **Celeris**: The [Celeris](https://github.com/goceleris/celeris) HTTP engine with io_uring, epoll, and adaptive backends
-- **Theoretical**: Raw epoll/io_uring implementations showing the syscall performance ceiling
+`goceleris/benchmarks` predated:
 
-## Hardware
+- `goceleris/loadgen` — the dedicated, HdrHistogram-aware load generator. The bench
+  loop here forked its own minimal client and could not measure latency under coordinated
+  omission correction.
+- `celeris/test/perfmatrix` — the in-tree scenario / interleave / report scaffolding that
+  the celeris team uses for release-gate matrix runs.
+- The 3-host LACP cluster + ansible orchestration. This repo was structurally
+  single-host: it ran the loadgen and the server in the same process and could not split
+  them across machines.
+- The validation tier (TigerBeetle-VOPR-inspired property soak + RESTler-style fuzzing
+  deterministic-seed fault injection). Probatorium folds bench and validation into one
+  pipeline so every celeris release is gated on **both** "no regression vs baseline" AND
+  "no invariant violation under 10-day soak."
 
-Three dedicated Minisforum mini PCs connected via 10GbE point-to-point links:
+Probatorium subsumes everything this repo did and adds the validation tier, the cluster
+fabric, the per-arch matrix, and the publish-to-docs cascade.
 
-| Machine | Role | CPU | Cores/Threads | RAM | Network |
-|---------|------|-----|---------------|-----|---------|
-| MS-A2 | Client (self-hosted runner) | AMD Ryzen 9 9955HX (Zen 5) | 16C/32T | 32 GB DDR5 | 10GbE SFP+ |
-| MS-A2 | x86 Server | AMD Ryzen 7 7745HX (Zen 4) | 8C/16T | 32 GB DDR5 | 10GbE SFP+ |
-| MS-R1 | ARM64 Server | CIX CP8180 | 12C/12T | 64 GB LPDDR5 | Dual 10GbE RJ45 (RTL8127) |
+## Where each piece went
 
-All machines run Debian 13 (Trixie) with kernel 6.12+ for full io_uring support. The client machine is the GitHub Actions self-hosted runner that orchestrates everything via SSH.
+| In this repo                               | In probatorium                                  |
+| ------------------------------------------ | ----------------------------------------------- |
+| `servers/baseline/{gin,echo,chi,...}/`     | `servers//` (one go.mod per adapter)            |
+| `cmd/bench/`, `internal/runner/`           | `cmd/runner/` + scenario/interleave packages    |
+| `internal/dashboard/`                      | `report/` (v5.0 schema, HdrHistogram-aware)     |
+| `magefile.go` cluster targets              | `mage_cluster.go` + `ansible/` (cluster-driven) |
+| Result schema v4.0                         | Result schema v5.0 (additive over v4)           |
+| (no validation tier)                       | `validation/`, `cmd/validator{,-checker,-replay}/` |
 
-## Benchmark Types
-
-### Standard Level (7 types, ~66 min per architecture)
-
-| Type | Endpoint | What It Tests |
-|------|----------|---------------|
-| `simple` | `GET /` | Plain text — pure framework overhead |
-| `json` | `GET /json` | JSON serialization |
-| `path` | `GET /users/:id` | Path parameter extraction + routing |
-| `body` | `POST /upload` | 2 KB request body read |
-| `headers` | `GET /users/:id` | Realistic API headers (~850 bytes: JWT, cookies, tracing) |
-| `json-64k` | `GET /json-64k` | 64 KB JSON response — I/O throughput, efficiency metric |
-| `churn` | `GET /` | New TCP connection per request — tests `accept()`, `SO_REUSEPORT` |
-
-### Full Level (15 types, ~142 min per architecture)
-
-Adds a **concurrency sweep** that scales connections from 1 to 10,000 on the `simple` endpoint:
-
-```
-simple@1 simple@10 simple@50 simple@100 simple@500 simple@1000 simple@5000 simple@10000
-```
-
-This produces scaling curves that show where goroutine-based frameworks plateau and where event-loop servers keep climbing.
-
-## Servers Tested
-
-### Production Frameworks (Baseline)
-
-| Server | Protocols | Framework |
-|--------|-----------|-----------|
-| stdhttp | H1, H2C, Hybrid | Go stdlib `net/http` |
-| gin | H1, H2C, Hybrid | [Gin](https://github.com/gin-gonic/gin) |
-| echo | H1, H2C, Hybrid | [Echo](https://github.com/labstack/echo) |
-| chi | H1, H2C, Hybrid | [Chi](https://github.com/go-chi/chi) |
-| iris | H1, H2C, Hybrid | [Iris](https://github.com/kataras/iris) |
-| hertz | H1, H2C, Hybrid | [Hertz](https://github.com/cloudwego/hertz) |
-| fiber | H1 | [Fiber](https://github.com/gofiber/fiber) (fasthttp-based) |
-| fasthttp | H1 | [FastHTTP](https://github.com/valyala/fasthttp) |
-
-### Celeris
-
-| Server | Protocols | Engine |
-|--------|-----------|--------|
-| celeris-iouring | H1, H2C, Hybrid | io_uring (Linux 5.10+) |
-| celeris-epoll | H1, H2C, Hybrid | epoll (Linux 2.6+) |
-| celeris-adaptive | H1, H2C, Hybrid | Runtime engine selection |
-
-Each engine runs with three resource profiles: `latency`, `throughput`, and `balanced`.
-
-### Theoretical Maximum
-
-| Server | Protocols | Implementation |
-|--------|-----------|----------------|
-| epoll | H1, H2C, Hybrid | Raw epoll with SO_REUSEPORT, SIMD header parsing, zero-alloc response path |
-| iouring | H1, H2C, Hybrid | io_uring with SQPOLL, multishot accept, linked SQEs |
-
-## Dashboard & Results
-
-Results are published to [goceleris/docs](https://github.com/goceleris/docs) as dashboard-format JSON (schema v4.0), keyed by Celeris version:
-
-- `results/latest/{arch}.json` — most recent run
-- `results/{version}/{arch}.json` — per-version archive
-
-Dashboard data includes:
-- **RPS and latency percentiles** (P50, P75, P90, P99, P999, P9999) per server per benchmark type
-- **Concurrency scaling curves** — RPS at each concurrency level (full level only)
-- **Efficiency metric** — RPS / Server CPU% per server, normalizing across core counts
-- **System metrics** — server CPU, memory RSS, GC pauses (Go servers only)
-- **Timeseries** — per-second RPS and P99 latency snapshots
-
-## Running Benchmarks
-
-Benchmarks are designed to run through GitHub Actions workflows. The self-hosted runner on the client machine handles everything: SSH into servers, deploy binaries, tune kernels, run benchmarks, collect results.
-
-### Via GitHub Actions (Primary Method)
-
-- **Release benchmarks**: Trigger automatically on every release, or manually via the `benchmark.yml` workflow dispatch. Releases run at `full` level (includes concurrency sweep).
-- **PR benchmarks**: Add the `benchmark` label to a pull request. Runs at `standard` level.
-
-### Local Development
-
-For local development and testing (not full benchmarks):
-
-```bash
-# Build server and bench binaries
-mage build
-
-# Run a quick local smoke test (5s per server, localhost)
-mage benchmarkQuick
-```
-
-## CI/CD
-
-| Workflow | Trigger | Level | Timeout |
-|----------|---------|-------|---------|
-| `benchmark.yml` | Release (auto) or manual dispatch | `full` on release, configurable on manual | 480 min |
-| `benchmark-pr.yml` | PR with `benchmark` label | `standard` | 240 min |
-
-Both workflows SSH to the bare-metal servers, deploy the server binary, run benchmarks, and collect results. Release runs also publish to the docs repository and trigger a site rebuild.
-
-## Project Structure
-
-```
-cmd/bench/         Benchmark runner CLI (specs, runner, checkpoint)
-cmd/server/        Server binary (all implementations + control daemon)
-servers/
-  baseline/        Production frameworks (gin, echo, chi, iris, etc.)
-  celeris/         Celeris HTTP engine
-  theoretical/     Raw epoll/iouring implementations
-  common/          Shared types, payload generators, SIMD helpers
-internal/
-  dashboard/       Dashboard JSON format (schema v4.0)
-  metrics/         Prometheus metrics definitions
-  version/         Version info
-config/
-  hosts.json       Machine addresses and hardware metadata
-```
-
-## Contributing
-
-### Requirements
-
-- **Go 1.24+**: [Download](https://go.dev/dl/)
-- **Mage**: `go install github.com/magefile/mage@latest`
-
-### Development
-
-```bash
-mage check   # deps + lint + vet + build
-mage test    # run tests
-mage fmt     # format code
-```
-
-### Adding a Server
-
-1. Create a package under `servers/baseline/` (or `servers/theoretical/`)
-2. Implement all benchmark endpoints: `GET /`, `GET /json`, `GET /json-1k`, `GET /json-64k`, `GET /users/:id`, `POST /upload`
-3. Register the server type in `cmd/server/main.go`
-4. Add to the server list in `cmd/bench/main.go`
-
-### Adding a Benchmark Type
-
-1. Add the endpoint to all server implementations
-2. Add a `BenchmarkSpec` entry in `cmd/bench/main.go`
-3. Update dashboard format if new fields are needed (`internal/dashboard/format.go`)
+The v4.0 result JSONs published from this repo remain readable by probatorium's v5.0
+parser — the schema bump was additive.
 
 ## License
 
-Apache 2.0
+[Apache 2.0](LICENSE), unchanged. Use these snapshots for historical reference.