feat(openaudio): auto-tune Postgres memory and WAL defaults at container start by RolfAris · Pull Request #220 · OpenAudio/go-openaudio

RolfAris · 2026-05-02T11:34:23Z

Summary

The audiusd container ships stock Debian Postgres 15 defaults (shared_buffers = 128MB, work_mem = 4MB, effective_cache_size = 4GB), sized for a tiny dev VM rather than a validator host. This adds an entrypoint shim that picks a memory and WAL tier from detected host RAM and writes one drop-in conf file at container start.

Suggesting, not requiring. Happy to scope down or close. We're running an equivalent tuning on our 20-node fleet and it's helped meaningfully, so wanted to put it in front of the team.

Tier table

| Host RAM | shared_buffers | work_mem | maintenance_work_mem | effective_cache_size | wal_buffers | max_wal_size | min_wal_size |
|---|---|---|---|---|---|---|---|
| < 2 GB | (skip, stock defaults) | | | | | | |
| 2 to 4 GB | 256 MB | 4 MB | 128 MB | 1 GB | 8 MB | 1 GB | 256 MB |
| 4 to 8 GB | 1 GB | 8 MB | 256 MB | 3 GB | 16 MB | 2 GB | 512 MB |
| 8 to 16 GB | 2 GB | 16 MB | 512 MB | 6 GB | 16 MB | 2 GB | 1 GB |
| 16 to 32 GB | 4 GB | 32 MB | 1 GB | 12 GB | 16 MB | 4 GB | 1 GB |
| 32 to 64 GB | 8 GB | 32 MB | 2 GB | 24 GB | 16 MB | 8 GB | 2 GB |
| 64 GB and up | 16 GB | 32 MB | 2 GB | 48 GB | 16 MB | 16 GB | 2 GB |

Sizing rules: shared_buffers is at most 25% of RAM (25% at the bottom of each tier from 4 GB up, less toward the top), which leaves headroom for the audiusd Go process, observed at roughly 7 GB RSS on a busy validator. effective_cache_size is near 50% of RAM at the tier midpoint (more conservative than pgtune's 75%, since Postgres is not the only tenant in this container). wal_buffers is capped at 16 MB per the Postgres docs. work_mem is modest because audiusd's observed concurrency is roughly 8 connections, not 100.
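A minimal sketch of the tier lookup, for reviewers who want to see the table as code. `pick_tier` is a hypothetical name, not the shim's actual function. The thresholds assume RAM is measured in MB and that each tier's lower bound is inclusive, which matches the boundary values listed in the commit message (2048, 4095, 4096, ...).

```shell
# Hypothetical helper, not the shim's actual code: map RAM in MB to tier values.
# Returns 1 and prints nothing below the 2 GB floor, so stock defaults stay in place.
pick_tier() {
  local ram_mb=$1
  # Column order: shared_buffers work_mem maintenance_work_mem
  #               effective_cache_size wal_buffers max_wal_size min_wal_size
  if   [ "$ram_mb" -lt 2048 ];  then return 1
  elif [ "$ram_mb" -lt 4096 ];  then set -- 256MB 4MB  128MB 1GB  8MB  1GB  256MB
  elif [ "$ram_mb" -lt 8192 ];  then set -- 1GB   8MB  256MB 3GB  16MB 2GB  512MB
  elif [ "$ram_mb" -lt 16384 ]; then set -- 2GB   16MB 512MB 6GB  16MB 2GB  1GB
  elif [ "$ram_mb" -lt 32768 ]; then set -- 4GB   32MB 1GB   12GB 16MB 4GB  1GB
  elif [ "$ram_mb" -lt 65536 ]; then set -- 8GB   32MB 2GB   24GB 16MB 8GB  2GB
  else                               set -- 16GB  32MB 2GB   48GB 16MB 16GB 2GB
  fi
  # printf reuses the format for each name/value pair, one conf line per pair.
  printf '%s = %s\n' \
    shared_buffers "$1" work_mem "$2" maintenance_work_mem "$3" \
    effective_cache_size "$4" wal_buffers "$5" max_wal_size "$6" min_wal_size "$7"
}
```

For example, `pick_tier 24576` (a 24 GB host) prints the 16 to 32 GB row, starting with `shared_buffers = 4GB`.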

Conservative-by-default behavior

The shim skips with stock defaults whenever it cannot prove safety:

Operator already tuned postgresql.conf. If shared_buffers, work_mem, maintenance_work_mem, effective_cache_size, wal_buffers, max_wal_size, or min_wal_size is set uncommented in postgresql.conf, the shim skips with a log line. Operator wins.
Operator already has an include_dir directive. Any include_dir line other than the shim's own include_dir = 'conf.d' (active, commented out, pointing at a different directory, or using different quoting) makes the shim skip rather than risk last-occurrence-wins overriding the operator's directory.
Non-root execution context. If the shim runs as a uid that is not root and not the postgres user, it skips. The chown step would otherwise silently fail and leave a conf file postgres cannot read.
Postgres rejects the rendered conf. A postgres -C shared_buffers --config-file=... preflight runs after rendering. If Postgres refuses to parse the conf, the shim removes the rendered file and exits.
Any I/O failure. Every error path exits 0 and leaves stock defaults in place.
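The first three checks could look roughly like the sketch below. All function names are hypothetical and the matching rules are assumptions drawn from the descriptions above and the test plan, not a copy of the shim: parameter names are matched case-insensitively on uncommented lines, and only an exact `include_dir = 'conf.d'` line counts as the shim's own.

```shell
# Hypothetical list of the parameters the shim manages.
TUNED_PARAMS='shared_buffers|work_mem|maintenance_work_mem|effective_cache_size|wal_buffers|max_wal_size|min_wal_size'

# Exit 0 if any managed parameter is set on an uncommented line of file $1.
conf_is_operator_tuned() {
  grep -Eqi "^[[:space:]]*(${TUNED_PARAMS})([[:space:]]|=)" "$1"
}

# Exit 0 if file $1 mentions include_dir anywhere (even in a comment)
# on a line that is not exactly the shim's own directive.
conf_has_foreign_include_dir() {
  awk -v own="include_dir = 'conf.d'" '
    /include_dir/ && $0 != own { found = 1 }
    END { exit found ? 0 : 1 }
  ' "$1"
}

# Exit 0 only for root or the postgres user; $1 defaults to the current uid.
uid_can_manage_conf() {
  local uid=${1:-$(id -u)}
  [ "$uid" -eq 0 ] && return 0
  [ "$uid" = "$(id -u postgres 2>/dev/null)" ]
}
```

A caller would skip tuning as soon as either conf check succeeds or the uid check fails, then log the reason and exit 0.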

Override knobs

# Disable entirely
docker run -e AUDIUSD_DISABLE_AUTO_TUNE=1 ...

# Per-setting override (later wins by include order within conf.d)
echo "shared_buffers = 8GB" > $DATA/conf.d/99-operator.conf

# Or via SQL after connect (postgresql.auto.conf is processed last)
ALTER SYSTEM SET shared_buffers = '8GB';

Precedence (later wins):

postgresql.conf top-of-file
conf.d/00-audiusd-defaults.conf (this shim, written conditional on the conservative checks above)
conf.d/99-*.conf (operator override slot)
postgresql.auto.conf (ALTER SYSTEM, processed last by Postgres regardless of position)
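To make the later-wins rule concrete, here is a small offline sketch that scans config files in precedence order and reports the last uncommented value of one setting. `effective_setting` is a hypothetical helper, it is case-sensitive, and it treats everything after `#` as a comment, so it only approximates how Postgres parses these files.

```shell
# Hypothetical helper: print the value Postgres would end up with for setting $1
# when the remaining arguments are config files listed in processing order.
effective_setting() {
  local name=$1
  shift
  awk -v name="$name" '
    {
      line = $0
      sub(/#.*/, "", line)   # drop comments (also drops a "#" inside quotes)
      # Match "name = value" or "name value" at the start of the line.
      if (match(line, "^[ \t]*" name "[ \t]*[= \t][ \t]*")) {
        value = substr(line, RLENGTH + 1)
        gsub(/[ \t]+$/, "", value)
        found = 1            # a later match overwrites an earlier one
      }
    }
    END { if (found) print value }
  ' "$@"
}
```

Usage, following the order in the list above: `effective_setting shared_buffers "$DATA/postgresql.conf" "$DATA"/conf.d/*.conf "$DATA/postgresql.auto.conf"`. On a running server, the `sourcefile` column of `pg_settings` answers the same question authoritatively.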

Evidence: controlled before/after on one of our nodes

Same node (24 GB host, 144 GB DB, 38 GB indexes), 20-min steady-state windows on each side, reset stats between. Only shared_buffers, wal_buffers, max_wal_size, and min_wal_size were changed (the restart-required group). The other tier values were already in place via ALTER SYSTEM on that node.

| Metric | Before: stock 128 MB `shared_buffers` | After: 4 GB `shared_buffers` (16-32 GB tier) |
|---|---|---|
| `pg_stat_bgwriter.buffers_alloc` rate | 23,068 / sec | 65 / sec (-99.7%) |
| `pg_stat_bgwriter.buffers_backend` (window) | 1,247,356 | 11,591 (-99.1%) |
| `pg_stat_bgwriter.buffers_checkpoint` (window) | 91 | 35,169 (planned writes replace emergency backend writes) |
| Buffer hit ratio | 10.97% (depressed by cold EXPLAINs in the window) | 83.05% |

The shape of buffer accounting changed. Before: backends doing emergency dirty-page writes (1.25M of them) because shared_buffers was exhausted. After: the checkpointer does planned batched writes on schedule (35k). The 99% drop in buffers_backend is the strongest signal that shared_buffers was undersized.

Representative heavy query, SELECT count(*) FROM ops WHERE "table" = 'uploads' (full table scan over a 50 GB table):

| Metric | Before | After |
|---|---|---|
| Execution time | 52,481 ms | 41,034 ms (-22%) |
| Buffers dirtied during the query | 1,180,495 | 5,623 (-99.5%) |
| Buffers written during the query | 1,180,370 | 5,474 (-99.5%) |
| Buffers read from disk | 6,300,680 | 6,304,652 (unchanged; table doesn't fit in 4 GB either) |
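The percentages in the tables above can be rechecked with two small helpers. Both names are hypothetical, and the size conversion assumes the default 8 kB Postgres block size.

```shell
# Percent change from $1 (before) to $2 (after), one decimal place.
pct_change() {
  awk -v before="$1" -v after="$2" \
    'BEGIN { printf "%.1f\n", 100 * (after - before) / before }'
}

# Convert a buffer count to GiB; block size $2 defaults to 8192 bytes (assumed).
blocks_to_gib() {
  awk -v blocks="$1" -v bs="${2:-8192}" \
    'BEGIN { printf "%.1f\n", blocks * bs / 1073741824 }'
}

pct_change 23068 65        # prints -99.7 (buffers_alloc rate)
pct_change 1247356 11591   # prints -99.1 (buffers_backend)
pct_change 52481 41034     # prints -21.8 (execution time)
blocks_to_gib 6300680      # prints 48.1
```

The last figure puts the disk reads for the scan at about 48 GiB, in line with the table being roughly 50 GB and far larger than a 4 GB buffer pool.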

Light queries (e.g. SELECT * FROM core_blocks ORDER BY created_at DESC LIMIT 100) went from 9 to 14 disk reads down to 0. Index pages stay resident in the bigger pool. Sub-ms either way, but the disk-read count delta is the durable signal.

Restart cost: roughly 3 seconds Postgres unavailability. The audiusd Go process kept running and reconnected; no application errors observed.

Reproduce on any node:

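SELECT pg_stat_reset_shared('bgwriter'); SELECT pg_stat_reset();
-- wait 20 min steady-state, then capture:
-- NULLIF avoids a division-by-zero error if the database saw no block traffic
SELECT round(100.0 * blks_hit / NULLIF(blks_hit + blks_read, 0), 2) AS hit_ratio_pct
  FROM pg_stat_database WHERE datname = 'openaudio';
-- buffers_checkpoint is the counter reported in the evidence table
SELECT buffers_alloc, buffers_backend, buffers_checkpoint, checkpoints_req
  FROM pg_stat_bgwriter;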

Determinism and consensus safety

A tuning that changes plan choice (work_mem, effective_cache_size) could in principle affect consensus state if any state-applied query relied on default plan ordering. We audited the currently visible ORDER-sensitive paths:

All :many ... LIMIT queries in pkg/core/db/sql/reads.sql have explicit ORDER BY.
:one ... LIMIT 1 queries filter (WHERE) on a unique column, so at most one row can match.
Unordered queries (GetAllRegisteredNodes, GetAllEthAddressesOfRegisteredNodes, GetActiveStorageNodeEndpoints) feed only common.GetAttestorRendezvous, which sorts internally by hash. The output is order-independent.
The CRDT ops sweep applies Order("ulid asc") server-side (pkg/mediorum/server/serve_crud.go:33).

We did not find a path where a plan flip could change consensus state. This is an audit, not an executable guard. Adding a deterministic-order assertion test would harden this further; happy to do that if it would help review.

Caveats for operators

shared_buffers is restart-required. On image upgrade, docker compose up -d recreates the container and Postgres starts with the new value. On hosts with very tight free RAM at upgrade time, the larger allocation may cause Postgres start to fail. Workaround: set AUDIUSD_DISABLE_AUTO_TUNE=1 before upgrading, or override via conf.d/99-*.conf.
/dev/shm size on big tiers. Postgres 15's parallel workers use /dev/shm for dynamic shared memory, and Docker defaults /dev/shm to 64 MB. On the 64 GB and up tier (16 GB shared_buffers), parallel queries with many workers can fail with "could not resize shared memory segment". Operators on big hosts should pass --shm-size=2g or larger.
Non-root container runtimes. k8s securityContext.runAsUser, rootless docker, or podman with --user make the in-container chown postgres:postgres a no-op. The shim detects this and skips. Operators in those modes will see stock defaults, which is the existing behavior.
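Operators who want to check the /dev/shm caveat from inside a running container could use something like the sketch below. The helper names and the warning text are hypothetical, and this check is not part of the shim described in this PR.

```shell
# Size in MB of the filesystem holding $1 (default /dev/shm), via POSIX df output.
shm_size_mb() {
  df -Pk "${1:-/dev/shm}" | awk 'NR == 2 { print int($2 / 1024) }'
}

# Warn and return 1 if the filesystem at $2 (default /dev/shm) is smaller than $1 MB.
warn_if_shm_small() {
  local min_mb=$1 path=${2:-/dev/shm} size_mb
  size_mb=$(shm_size_mb "$path")
  if [ -n "$size_mb" ] && [ "$size_mb" -lt "$min_mb" ]; then
    echo "warning: $path is ${size_mb} MB, below ${min_mb} MB; consider --shm-size" >&2
    return 1
  fi
  return 0
}
```

For example, `warn_if_shm_small 2048` returns 1 on a container started with the 64 MB Docker default and 0 once the container runs with `--shm-size=2g`.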

Out of scope

random_page_cost, effective_io_concurrency (assume SSD), synchronous_commit, wal_compression, checkpoint_*, max_connections. Memory and WAL sizing only.

Test plan

bash cmd/openaudio/postgres-auto-tune_test.sh, 161 assertions covering: every tier at midpoint and at both boundary edges (one inside the tier, one just below it); sub-floor; idempotency across re-runs; AUDIUSD_DISABLE_AUTO_TUNE=1 short-circuit; =true and =0 correctly NOT honored (canonical form is =1); operator-tuned postgresql.conf skip; commented-tuning does NOT trigger skip; foreign include_dir skip (alternate dir, double-quoted, commented); existing include_dir = 'conf.d' recognized as ours; well-formed postgresql.conf after atomic append; orphan tmp file cleanup; tier log line. Lint-clean (shellcheck).
CI builds the image
Fresh-init container, Postgres starts with shim-applied defaults
Existing-data-dir, include_dir = 'conf.d' appended once via atomic temp+rename, drop-in renders, restart picks up new shared_buffers
AUDIUSD_DISABLE_AUTO_TUNE=1, no conf.d directory, stock defaults
docker run -m 1G (cgroup-limited container), shim detects via cgroup, sub-2GB skip, stock defaults
Pre-existing postgresql.conf with hand-tuned shared_buffers, shim detects and skips with log line
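The cgroup-limited case in the test plan depends on memory detection that prefers the container limit over host RAM. A rough sketch of that lookup, assuming the usual cgroup v2 and v1 file locations and falling back to /proc/meminfo. `detect_mem_mb` and its `root` prefix argument are hypothetical; the prefix exists only so the function can be pointed at a fake directory tree.

```shell
# Print usable memory in MB: the cgroup limit if one is set and lower than
# physical RAM, otherwise MemTotal from /proc/meminfo.
detect_mem_mb() {
  local root=${1:-} limit="" f total_kb total_mb limit_mb
  # cgroup v2 first, then cgroup v1.
  for f in "$root/sys/fs/cgroup/memory.max" \
           "$root/sys/fs/cgroup/memory/memory.limit_in_bytes"; do
    if [ -r "$f" ]; then
      limit=$(cat "$f")
      break
    fi
  done
  total_kb=$(awk '/^MemTotal:/ { print $2 }' "$root/proc/meminfo")
  total_mb=$(( total_kb / 1024 ))
  case $limit in
    ''|*[!0-9]*)
      # No cgroup file, or the v2 literal "max": no limit applies.
      echo "$total_mb" ;;
    *)
      # v1 reports "unlimited" as a huge number, so take the smaller value.
      limit_mb=$(( limit / 1048576 ))
      if [ "$limit_mb" -lt "$total_mb" ]; then echo "$limit_mb"; else echo "$total_mb"; fi ;;
  esac
}
```

With `docker run -m 1G` on a cgroup v2 host, memory.max would hold 1073741824, so this prints 1024 and the sub-2GB skip applies.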

…ner start The audiusd container ships stock Debian Postgres 15 defaults (shared_buffers=128MB, work_mem=4MB, effective_cache_size=4GB) which are sized for a tiny dev VM rather than a validator host. Adds an entrypoint shim that picks a memory and WAL tier from detected host RAM and writes a single drop-in conf at $POSTGRES_DATA_DIR/conf.d/. Conservative-by-default: skips with stock defaults when postgresql.conf already has any of the tuned parameters set, when any include_dir directive is already present (active, commented, or pointing at a different dir), when running as a non-root non-postgres uid, when postgres -C preflight rejects the rendered conf, or on any I/O failure. Disable with AUDIUSD_DISABLE_AUTO_TUNE=1 or override via conf.d/99-*.conf or ALTER SYSTEM. Atomic writes (mktemp + rename) on both the tune file and postgresql.conf. Cgroup-aware memory detection (v2 then v1 then /proc/meminfo). Tested: 161 assertions covering every tier midpoint, every boundary value (2048, 4095, 4096, 8191, 8192, 16383, 16384, 32767, 32768, 65535, 65536), sub-floor cases, disable variants, operator-tuning detection, foreign include_dir detection, atomic append well-formedness, orphan tmp cleanup, and the tier log line. shellcheck clean.
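The mktemp + rename pattern mentioned in the commit message can be sketched as follows. Both function names are hypothetical and the shim's real implementation may differ. In particular, the replaced file takes the temp file's mode and owner, so a real shim would still need its chown step afterwards.

```shell
# Write stdin to $1 atomically: temp file in the same directory, then rename.
atomic_write() {
  local dest=$1 tmp
  tmp=$(mktemp "${dest}.tmp.XXXXXX") || return 1
  if cat > "$tmp" && mv -f "$tmp" "$dest"; then
    return 0
  fi
  rm -f "$tmp"   # do not leave an orphan temp file behind on failure
  return 1
}

# Append line $2 to file $1 unless an identical line already exists (idempotent).
append_line_once() {
  local file=$1 line=$2
  grep -Fxq -- "$line" "$file" && return 0
  {
    cat "$file"
    # Add a newline first if the file does not already end with one.
    [ -z "$(tail -c 1 "$file")" ] || printf '\n'
    printf '%s\n' "$line"
  } | atomic_write "$file"
}
```

Calling `append_line_once "$DATA/postgresql.conf" "include_dir = 'conf.d'"` twice leaves exactly one such line. Because the rename happens within one directory, a reader sees either the old file or the complete new one, never a partial write.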

RolfAris added 2 commits May 2, 2026 20:34

Merge remote-tracking branch 'origin/main' into feat/audiusd-postgres…

c0ee6d0

…-auto-tune-defaults

RolfAris wants to merge 2 commits into OpenAudio:main from RolfAris:feat/audiusd-postgres-auto-tune-defaults
