fix: tighten state-sync defaults (keep=2, blockInterval=100000) #255
Open
raymondjacobson wants to merge 2 commits into main
Each state-sync snapshot is currently ~30-45GB. With Keep=6 a snapshot-serving node accumulates ~180GB+ of snapshots on top of chain data and Postgres, which exhausted disk on creatornode2.audius.co (1TB PVC, 100% full) and put Postgres into a checkpoint PANIC / recovery loop. Two snapshots is enough to serve incoming state-syncers (a fresh one plus a fallback) while leaving headroom for chain growth.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
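The disk figures in this commit message can be checked with a few lines of arithmetic. The following is a minimal sketch, not code from the repository: the function name `snapshotFootprintGB` is hypothetical, and the only inputs are the ~30-45 GB per-snapshot range and the two `Keep` values quoted above.

```go
package main

import "fmt"

// snapshotFootprintGB returns a low and high estimate, in GB, of the disk
// used by retained snapshots, given how many snapshots are kept and the
// per-snapshot size range. (Hypothetical helper for illustration only.)
func snapshotFootprintGB(keep, minGB, maxGB int) (low, high int) {
	return keep * minGB, keep * maxGB
}

func main() {
	// The ~30-45 GB per-snapshot range comes from the commit message.
	for _, keep := range []int{6, 2} {
		low, high := snapshotFootprintGB(keep, 30, 45)
		fmt.Printf("keep=%d: %d-%d GB\n", keep, low, high)
	}
}
```

With six retained snapshots the low end of the range matches the "~180GB+" figure quoted above; with two, the estimate drops to roughly a third of that.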
A 100-block interval would create a new snapshot roughly every minute on mainnet, which is far too aggressive, both for disk and CPU. Production already overrides this to 100000 (the values seen in /data/bolt/snapshots_*/height_0024200000 etc.); align the default with the value validators are actually using.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
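To make the cadence difference concrete, here is a rough sketch of the interval-to-time conversion. It assumes an average block time of about 600 ms, which is back-derived from the "100 blocks is roughly a minute" statement above and is not a measured value. The helper name `snapshotCadence` is hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// snapshotCadence estimates the wall-clock time between snapshots for a
// given block interval and average block time.
// (Hypothetical helper for illustration only.)
func snapshotCadence(blockInterval int64, blockTime time.Duration) time.Duration {
	return time.Duration(blockInterval) * blockTime
}

func main() {
	// Assumption: ~600ms per block, inferred from the commit message's
	// "roughly every minute" for a 100-block interval.
	blockTime := 600 * time.Millisecond
	for _, interval := range []int64{100, 100000} {
		fmt.Printf("interval=%d -> %s\n", interval, snapshotCadence(interval, blockTime))
	}
}
```

Under that assumed block time, a 100000-block interval works out to a snapshot somewhat less often than once every 16 hours, rather than once a minute.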
Summary
Two related default changes in `pkg/core/config/config.go` to bring the state-sync defaults in line with what production validators actually use:

- `stateSyncKeep`: `6` → `2`. Each snapshot is currently ~30–45 GB. With `Keep=6` a snapshot-serving node accumulates ~180 GB+ of snapshots on top of chain data and Postgres. This exhausted the 1 TB disk on `creatornode2.audius.co` today (the only prod node with `stateSyncServeSnapshots=true`), which put Postgres into a checkpoint PANIC / recovery loop until snapshots were manually deleted. Two snapshots is enough to serve an incoming state-syncer (newest) plus one fallback (in case the newest is mid-creation).
- `stateSyncBlockInterval`: `100` → `100000`. A 100-block interval would create a new snapshot roughly every minute on mainnet, which is far too aggressive both on disk and CPU. Prod already overrides this to `100000` (the height boundaries seen in `/data/bolt/snapshots_*/height_002420****` etc.); align the default with what validators actually use.

Operators who want different values can still override via the `stateSyncKeep` / `stateSyncBlockInterval` env vars.

Test plan

- … `creatornode2.audius.co`, confirm only the most recent 2 snapshot directories exist under `/data/bolt/snapshots_<chainID>/` after the next snapshot creation cycle.
- … `_00000`).

🤖 Generated with Claude Code
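The checks in the test plan could be scripted along the following lines. This is a hedged sketch, not part of the PR: `checkSnapshotDirs` is a hypothetical helper, and it assumes snapshot directories are named `height_<zero-padded height>`, as in the `height_0024200000` example from the commit message. In practice the names would come from listing `/data/bolt/snapshots_<chainID>/` (for example with `os.ReadDir`); here they are passed in directly so the logic stays self-contained.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// checkSnapshotDirs validates directory names of the assumed form
// "height_<digits>": every height must fall on a blockInterval boundary,
// and at most keep snapshots may be present.
// (Hypothetical helper for illustration only.)
func checkSnapshotDirs(names []string, keep int, blockInterval int64) error {
	count := 0
	for _, name := range names {
		if !strings.HasPrefix(name, "height_") {
			continue // ignore entries that are not snapshot directories
		}
		digits := strings.TrimPrefix(name, "height_")
		height, err := strconv.ParseInt(digits, 10, 64)
		if err != nil {
			return fmt.Errorf("unparseable height in %q: %w", name, err)
		}
		if height%blockInterval != 0 {
			return fmt.Errorf("%s is not on a %d-block boundary", name, blockInterval)
		}
		count++
	}
	if count > keep {
		return fmt.Errorf("found %d snapshots, expected at most %d", count, keep)
	}
	return nil
}

func main() {
	// Hypothetical listing used only to exercise the check.
	names := []string{"height_0024100000", "height_0024200000"}
	if err := checkSnapshotDirs(names, 2, 100000); err != nil {
		fmt.Println("check failed:", err)
		return
	}
	fmt.Println("snapshot directories look consistent with keep=2, interval=100000")
}
```

Returning an error on the first violation keeps the check easy to wire into a post-deploy script; a fuller version might collect every violation before reporting.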