Under construction 🚧
Built for performance. Designed for graphs.
Otter is a lightweight, purpose-driven proxy and gateway for
Dgraph. Dgraph balances reads at the predicate
level; Otter adds the missing half — balancing writes across
alphas. It routes traffic per request purpose (query / mutation
/ upsert), proxies the HTTP admin surface, and exposes a WebSocket
gateway that speaks the same shapes as the HTTP endpoints.
Longer-term, Otter is intended to become the foundation for additional
graph query languages, introspection tooling, and an opinionated
graph-modelling convention on top of Dgraph. Those pieces are
research directions, not committed work — see the
"Later / research" horizon in the roadmap below and the notes in
why.md and internal/loadbalancer/idea.md.
Not a goal: supporting GraphQL as a first-class query language.
Dgraph already ships its own GraphQL layer; Otter only routes
/graphql pass-through and reads Dgraph's GraphQL schema for
introspection tooling (e.g. /ui/keywords). Transpiling or
reimplementing GraphQL is explicitly out of scope.
Read why.md for the long version. The short version: Dgraph
distributes reads across alphas automatically (tablets owned by Raft
groups), but mutations land on a single alpha and are the practical
bottleneck for write-heavy workloads. Otter closes that gap by
letting the operator route each request purpose to a chosen group
of alphas, and — as cluster-state inspection lands
(see docs/loadbalancer_audit.md) — by making that routing
leader- and health-aware instead of operator-declared.
Current design overview:
- Round-robin and purposeful load balancing across Dgraph alphas
- HTTP proxy for Dgraph `/query`, `/mutate`, `/alter`, `/graphql`, `/health`, `/state`, `/admin/schema`, `/ui/keywords`
- WebSocket gateway with `auth`, `ping`, `query`, `mutation`, `upsert`
- Static `/validate/dql` and `/validate/schema` endpoints (no round-trip)
- Configurable via YAML and environment variables, with dev-vs-production safety defaults
- Working today: round-robin and purposeful balancing, HTTP proxy for `/query`, `/mutate`, `/alter`, `/health`, `/state`, `/admin/schema`, `/graphql`, a WebSocket gateway, and YAML + environment configuration.
- Dev-mode safety defaults: when `dev_mode: true` (the shipped default), the WebSocket handler auto-generates an ephemeral auth token at startup and accepts any `Origin`; both are logged as warnings. Set `dev_mode: false` with explicit `ws_token` and `ws_allowed_origins` for fail-closed behaviour. See `docs/security.md`.
- Experimental / under audit: the `purposeful` balancer, the GraphQL pass-through, the body-size caps (1 MiB default for HTTP bodies and WS messages), and the HTTP server timeouts all ship with conservative defaults that have been smoke-tested end-to-end but not stress-tested.
- Not implemented despite appearing in notes: health-aware balancing, leader-aware routing, Cypher transpilation, and the UID-reservation / named-graph convention from `why.md`, `internal/loadbalancer/idea.md`, and `docs/design/uid_reservation.md` are research directions, not committed work. They are kept in the repo because they shape the long-term design.
- Dgraph version posture: the Go module depends on `hypermodeinc/dgraph/v24` for GraphQL schema handling, while `examples/cluster/docker-compose.yml` pins `dgraph/dgraph:v25.0.0-preview1`. gRPC traffic is compatible in practice but schema behaviour across majors is not guaranteed; smoke-test before relying on it.
Full step-by-step commands live in docs/runbook.md. Short version:
- `make test` / `make test-unit`: fast local tests, no Docker.
- `make e2e-up` + `make e2e-wait`: boot the Docker stack and wait until Otter answers on `/health` and `/query`.
- `make test-e2e`: run the Docker-backed suite (build tag `e2e`).
- `make e2e`: one-shot: up, wait, seed, test, tear down.
- `make e2e-down`: stop the stack and drop its named volumes.

go test ./... only runs unit tests; the E2E suite is gated by the
e2e build tag so a clean checkout stays green without Docker.
Otter intentionally splits tests into two tiers so contributors can move fast without Docker, while keeping a deterministic end-to-end path for CI:
| Tier | Command | Docker | Build tag | What it covers |
|---|---|---|---|---|
| Unit / internal | `make test-unit` | no | none | handlers, config, balancers, helpers, parsers, websocket |
| Default `go test` | `go test ./...` | no | none | same as unit; E2E packages compile but have no tests exposed |
| Docker-backed E2E | `make test-e2e` | yes | `e2e` | live HTTP + WS round-trips against a real Dgraph cluster |
| One-shot E2E | `make e2e` | yes | `e2e` | up, wait for readiness, seed, run E2E, tear down |

The e2e build tag is the single source of truth. A test that needs a
running Otter on localhost:8084 / localhost:8089 must carry the
//go:build e2e header so default runs stay hermetic.
Full details, ports, and env vars live in docs/runbook.md.
Requirements: Docker + Docker Compose (OrbStack works). make is
optional but assumed by the commands below.
```shell
make rund     # foreground, logs streamed
# or
make e2e-up   # background; pair with make e2e-wait
```

If you don't have `make`:

```shell
cd examples/cluster
docker compose up --build
```

To run Otter from source:

```shell
git clone https://github.com/OpenDgraph/Otter.git
cd Otter
export CONFIG_FILE=./manifest/config.yaml
go run ./cmd/proxy
```

`manifest/config.yaml` points at `localhost:9080` / `localhost:9088`
and uses the `defined` balancer. Edit `balancer_type` to `round-robin`
if you want a single flat list:

```yaml
balancer_type: round-robin
```

Otter loads config from the YAML file pointed to by `CONFIG_FILE`, with
environment variables taking precedence over file values for every knob
that has an env override. The Docker compose example uses
manifest/config_docker.yaml; local runs usually point at
manifest/config.yaml.
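
The precedence rule above (environment over file) can be pictured with a small sketch. This is not the code in `internal/config/config.go`; the `resolve` helper and the `lookupFunc` type are hypothetical names, and treating an empty environment variable as "not set" is an assumption.

```go
package main

import (
	"fmt"
	"os"
)

// lookupFunc matches the signature of os.LookupEnv so it can be swapped in tests.
type lookupFunc func(key string) (string, bool)

// resolve returns the environment value for envKey when one is set,
// otherwise the value that came from the YAML file.
// Assumption: an empty env var counts as "not set".
func resolve(lookup lookupFunc, envKey, fileValue string) string {
	if v, ok := lookup(envKey); ok && v != "" {
		return v
	}
	return fileValue
}

func main() {
	fileValue := "8080" // pretend the YAML file said proxy_port: 8080

	// Prints the env override if PROXY_PORT is exported, else the file value.
	fmt.Println("proxy_port =", resolve(os.LookupEnv, "PROXY_PORT", fileValue))
}
```

Running this with `PROXY_PORT=8084` exported should print the override; without it, the file value is used.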
Main knobs (defined in internal/config/config.go):
| YAML key | Env var | Default | Purpose |
|---|---|---|---|
| `balancer_type` | `BALANCER_TYPE` | `round-robin` | `round-robin`, `defined`, or `purposeful` |
| `dgraph_endpoints` | `DGRAPH_ENDPOINTS` | — | Comma-separated alpha endpoints used by `round-robin` |
| `groups` | — | — | Per-purpose endpoint map (`query`, `mutation`, `upsert`) |
| `proxy_port` | `PROXY_PORT` | `8080` | HTTP port; Docker example uses `8084` |
| `websocket_port` | `WEBSOCKET_PORT` | `8089` | WebSocket port |
| `enable_http` | `ENABLE_HTTP` | `true` | Toggles the HTTP proxy server |
| `enable_websocket` | `ENABLE_WEBSOCKET` | `true` | Toggles the WebSocket server |
| `graphql` | `GRAPHQL` | `true` | Enables the `/graphql` pass-through |
| `dgraph_user` | `DGRAPH_USER` | empty | Dgraph ACL user; optional |
| `dgraph_password` | `DGRAPH_PASSWORD` | empty | Dgraph ACL password; never logged |
| `dev_mode` | `DEV_MODE` | `true` | Enables dev-only safety defaults; set `false` for production |
| `ws_token` | `WS_TOKEN` | auto in dev | WebSocket auth token; required when `dev_mode: false` |
| `ws_allowed_origins` | `WS_ALLOWED_ORIGINS` | empty | Origin allow-list; required when `dev_mode: false` |
| `max_body_bytes` | `MAX_BODY_BYTES` | `1048576` | HTTP request body cap (1 MiB) |
| `ws_max_message_bytes` | `WS_MAX_MESSAGE_BYTES` | `1048576` | WebSocket message cap (1 MiB) |
| `cors_allowed_origins` | `CORS_ALLOWED_ORIGINS` | empty | Origin allow-list for CORS; required to enable browser credentials |
| `rate_limit_rps` | `RATE_LIMIT_RPS` | `0` | Per-IP token-bucket refill rate; `0` disables the limiter |
| `rate_limit_burst` | `RATE_LIMIT_BURST` | `0` | Per-IP burst size; defaults to `rate_limit_rps` when unset |
| `trusted_proxy_cidrs` | `TRUSTED_PROXY_CIDRS` | empty | CIDRs whose `X-Forwarded-For` header is trusted for rate limiting |
| `dgraph_http_endpoints` | `DGRAPH_HTTP_ENDPOINTS` | empty | gRPC→HTTP endpoint map; falls back to `grpcPort - 1000` when empty |
| `ratel` | `RATEL` | empty | Ratel UI host for the `/ratel` redirect |
| `ratel_graphql` | `RATEL_GRAPHQL` | `true` | Whether the Ratel redirect carries GraphQL support |

dgraph_password and ws_token are redacted from the startup log dump.
See docs/security.md for the full security contract.
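
The `dgraph_http_endpoints` fallback in the table (`grpcPort - 1000`) is easiest to see in code. The following is a minimal sketch of that rule, not Otter's implementation; `httpEndpointFor` is a hypothetical helper, and the shape of the explicit map (gRPC endpoint as key, HTTP endpoint as value) is an assumption.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// httpEndpointFor maps a gRPC alpha endpoint to its HTTP endpoint.
// An entry in explicit wins; otherwise the port is derived as grpcPort - 1000.
func httpEndpointFor(grpcEndpoint string, explicit map[string]string) (string, error) {
	if h, ok := explicit[grpcEndpoint]; ok && h != "" {
		return h, nil
	}
	host, portStr, err := net.SplitHostPort(grpcEndpoint)
	if err != nil {
		return "", fmt.Errorf("parse %q: %w", grpcEndpoint, err)
	}
	port, err := strconv.Atoi(portStr)
	if err != nil {
		return "", fmt.Errorf("port in %q: %w", grpcEndpoint, err)
	}
	if port <= 1000 {
		return "", fmt.Errorf("cannot derive HTTP port from gRPC port %d", port)
	}
	return net.JoinHostPort(host, strconv.Itoa(port-1000)), nil
}

func main() {
	// No explicit map: the legacy formula applies (9080 becomes 8080).
	derived, _ := httpEndpointFor("localhost:9080", nil)
	fmt.Println(derived)

	// Explicit map: the configured value is used as-is.
	explicit := map[string]string{"localhost:9080": "alpha1:8080"}
	mapped, _ := httpEndpointFor("localhost:9080", explicit)
	fmt.Println(mapped)
}
```

The explicit map matters whenever a deployment remaps ports, since the subtraction only holds for Dgraph's default port layout.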
| Endpoint | Method | Description |
|---|---|---|
| `/query` | POST | Executes a DQL query via the configured balancer |
| `/mutate` | POST | Executes a DQL mutation (including upsert blocks) |
| `/alter` | POST | Schema alter; proxies to the selected alpha |
| `/graphql` | POST | Pass-through to Dgraph's `/graphql` (when `graphql: true`) |
| `/health` | GET | Aggregated Otter + backend health |
| `/state` | GET | Proxies Dgraph's `/state` for cluster introspection |
| `/admin/schema` | POST | GraphQL admin schema update |
| `/ui/keywords` | GET | Keyword list used by Ratel / the Otter UI |
| `/validate/dql` | POST | Static validation of a DQL body without execution |
| `/validate/schema` | POST | Static validation of a DQL schema without execution |

Supported request Content-Types for DQL endpoints:
- `application/json`
- `application/dql`

Example request (default proxy port is 8084 in the shipped manifests):
```shell
curl -X POST http://localhost:8084/query \
  -H "Content-Type: application/json" \
  -d '{"query": "{ data(func: has(email)) { uid name email } }"}'
```

Request bodies are capped by `max_body_bytes` (default 1 MiB). Bodies
larger than the cap are rejected with HTTP 413 Payload Too Large.
WebSocket messages are capped separately by ws_max_message_bytes
(default 1 MiB); oversized frames are rejected by the WS read path.
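
For readers who want to see what a body cap with a 413 response looks like in Go, here is a minimal sketch built on the standard library's `http.MaxBytesReader` (the `*http.MaxBytesError` type needs Go 1.19 or later). It is an illustration, not Otter's handler; `capBody` and `statusFor` are hypothetical names.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// capBody wraps next so that request bodies above maxBytes yield HTTP 413.
func capBody(maxBytes int64, next http.HandlerFunc) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		r.Body = http.MaxBytesReader(w, r.Body, maxBytes)
		body, err := io.ReadAll(r.Body)
		if err != nil {
			var tooLarge *http.MaxBytesError
			if errors.As(err, &tooLarge) {
				http.Error(w, "request body too large", http.StatusRequestEntityTooLarge)
				return
			}
			http.Error(w, "cannot read body", http.StatusBadRequest)
			return
		}
		// Hand the already-read body on to the wrapped handler.
		r.Body = io.NopCloser(bytes.NewReader(body))
		next(w, r)
	}
}

// statusFor sends a body of bodySize bytes through capBody and reports the status code.
func statusFor(maxBytes int64, bodySize int) int {
	h := capBody(maxBytes, func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	req := httptest.NewRequest(http.MethodPost, "/query", strings.NewReader(strings.Repeat("x", bodySize)))
	rec := httptest.NewRecorder()
	h(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(statusFor(1024, 16))   // body under the cap
	fmt.Println(statusFor(1024, 2048)) // body over the cap
}
```

Reading the whole body up front keeps the sketch short; a streaming proxy would more likely detect the error while copying the body upstream.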
URL: ws://localhost:8089/ws
Otter ships with `dev_mode: true` by default, which means the WebSocket handler auto-generates a random auth token on startup (logged once) and accepts every `Origin`. The Docker example pins `ws_token: "banana"` for reproducibility; change it before exposing port 8089.

Set `dev_mode: false` and provide `ws_token` and `ws_allowed_origins` (or the `DEV_MODE`, `WS_TOKEN`, `WS_ALLOWED_ORIGINS` env vars) to run with the fail-closed behaviour. See `docs/security.md` for details.

Supported message types:
- `auth`: send first to authenticate the connection
- `ping`: keep the connection alive
- `query` / `mutation` / `upsert`: require a prior successful `auth`

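
The auth-first rule for these message types can be sketched as a small per-connection state machine. This is an illustration only: the `message` fields are taken from the example message in this README, while the `session` type, the reply strings, and the choice to allow `ping` before `auth` are assumptions. No WebSocket library is involved; the sketch only handles raw JSON messages.

```go
package main

import (
	"crypto/subtle"
	"encoding/json"
	"errors"
	"fmt"
)

// message carries the fields this sketch needs; the real envelope may have more.
type message struct {
	Type  string `json:"type"`
	Query string `json:"query,omitempty"`
	Token string `json:"token,omitempty"`
}

// session tracks whether one connection has authenticated yet.
type session struct {
	token  string
	authed bool
}

var errUnauthenticated = errors.New("auth required before query/mutation/upsert")

// handle applies the auth-first rule to a single incoming message.
func (s *session) handle(raw []byte) (string, error) {
	var m message
	if err := json.Unmarshal(raw, &m); err != nil {
		return "", fmt.Errorf("bad message: %w", err)
	}
	switch m.Type {
	case "auth":
		// Constant-time comparison avoids leaking the token through timing.
		if subtle.ConstantTimeCompare([]byte(m.Token), []byte(s.token)) != 1 {
			return "", errors.New("invalid token")
		}
		s.authed = true
		return "authenticated", nil
	case "ping":
		return "pong", nil // assumption: ping is allowed before auth
	case "query", "mutation", "upsert":
		if !s.authed {
			return "", errUnauthenticated
		}
		return "would route " + m.Type, nil
	default:
		return "", fmt.Errorf("unknown message type %q", m.Type)
	}
}

func main() {
	s := &session{token: "banana"} // token value borrowed from the Docker example
	for _, raw := range []string{
		`{"type":"query","query":"{ data(func: has(email)) { uid } }"}`,
		`{"type":"auth","token":"banana"}`,
		`{"type":"query","query":"{ data(func: has(email)) { uid } }"}`,
	} {
		reply, err := s.handle([]byte(raw))
		fmt.Println(reply, err)
	}
}
```

The first `query` is refused, the `auth` succeeds, and the second `query` is accepted.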
Example (after auth; replace the token with the value configured in
ws_token or the one printed at Otter startup in dev mode):
```json
{
  "type": "query",
  "query": "{ data(func: has(email)) { uid name email } }",
  "token": "<your ws_token>",
  "verbose": true
}
```

Available types:

- `round-robin` (default)
- `defined` or `purposeful` (per-purpose: query/mutation/upsert)

To use defined, provide a YAML like this:
```yaml
balancer_type: defined
groups:
  query:
    - localhost:9080
  mutation:
    - localhost:9081
  upsert:
    - localhost:9082
```

Otter now validates `groups` at startup and fails fast when the map is
empty or any purpose has no usable endpoint.
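
To make the per-purpose routing and the startup validation concrete, here is a minimal sketch. It is not the code in `internal/loadbalancer`; `balancer`, `newBalancer`, and `pick` are hypothetical names, "usable" is assumed to mean "non-blank", and round-robin within each group is an assumption. The second `query` endpoint in `main` is made up for the demo.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"sync/atomic"
)

// balancer keeps one endpoint list and one round-robin counter per purpose.
type balancer struct {
	groups map[string][]string
	next   map[string]*atomic.Uint64
}

// newBalancer validates groups up front: an empty map, or a declared purpose
// without at least one non-blank endpoint, is an error.
func newBalancer(groups map[string][]string) (*balancer, error) {
	if len(groups) == 0 {
		return nil, errors.New("groups is empty")
	}
	b := &balancer{
		groups: make(map[string][]string, len(groups)),
		next:   make(map[string]*atomic.Uint64, len(groups)),
	}
	for purpose, endpoints := range groups {
		var usable []string
		for _, e := range endpoints {
			if strings.TrimSpace(e) != "" {
				usable = append(usable, e)
			}
		}
		if len(usable) == 0 {
			return nil, fmt.Errorf("purpose %q has no usable endpoint", purpose)
		}
		b.groups[purpose] = usable
		b.next[purpose] = new(atomic.Uint64)
	}
	return b, nil
}

// pick returns the next endpoint for purpose, cycling through its group.
func (b *balancer) pick(purpose string) (string, error) {
	endpoints, ok := b.groups[purpose]
	if !ok {
		return "", fmt.Errorf("no group declared for purpose %q", purpose)
	}
	n := b.next[purpose].Add(1) - 1
	return endpoints[n%uint64(len(endpoints))], nil
}

func main() {
	b, err := newBalancer(map[string][]string{
		"query":    {"localhost:9080", "localhost:9083"}, // second endpoint is hypothetical
		"mutation": {"localhost:9081"},
		"upsert":   {"localhost:9082"},
	})
	if err != nil {
		panic(err)
	}
	for i := 0; i < 3; i++ {
		ep, _ := b.pick("query")
		fmt.Println("query ->", ep)
	}
}
```

Failing in the constructor rather than on the first request is what makes the validation "fail fast": a misconfigured group stops the process before it accepts traffic.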
Organised by horizon so it matches the backlog that contributors can
actually pull from. See docs/phase1_backlog.md for the ranked version
with effort and dependency notes.
Implemented today
- `round-robin` balancer
- `defined` / `purposeful` balancer with per-purpose groups and startup validation
- Configurable WebSocket auth token, origin allow-list, and dev-vs-production mode (fail-closed)
- HTTP server with explicit read/write/idle timeouts, body-size cap, and SIGINT/SIGTERM graceful shutdown
- Build-tag-gated E2E suite with Docker-backed `make e2e` one-shot
- Redacted config log dump (password and WS token)

Now (nearest-term follow-ups)
- Structured logging (`log/slog`), request IDs, basic metrics
- Translate oversize-body rejections to HTTP 413 explicitly
- Cap incoming WebSocket messages (`ws_max_message_bytes`)
- Basic per-IP rate limiting on `/query`, `/mutate`, `/graphql`, `/ws` (`rate_limit_rps` / `rate_limit_burst`)
- CORS allow-list with credentials gated behind matched origins (`cors_allowed_origins`)
- Replace `port - 1000` heuristic with explicit `dgraph_http_endpoints` map (legacy formula remains a logged fallback)

Next
- Health-aware round-robin (`round-robin-healthy`)
- Read/write split balancer (`round-robin-on-RW`)
- Cluster state inspection that feeds balancer decisions
- Container health-checks baked into `docker-compose.yml` (removing the host-side `scripts/wait-for-otter.sh` stopgap)

Later / research
Tracked in why.md, internal/loadbalancer/idea.md, and the
discussion docs under docs/design/. These are vision documents,
not committed scope:
- Leader-aware routing (`round-robin-avoid-leaders`, `round-robin-leaders-only`)
- State-based balancing using `/state` and resource introspection
- UID-reservation / named-graph convention (see `docs/design/uid_reservation.md`)
- Predicate-prefix sharding on top of named graphs
- Cypher / other transpilers (see `internal/astneo`)
- Otter-as-framework
