A practical bump allocator for high-throughput Rust workloads
arena-b is a compact, battle-tested bump allocator for workloads that allocate many short-lived objects and prefer bulk reclamation. Allocate objects quickly into contiguous chunks and free them all at once using checkpoints, scopes, or a full reset — eliminating per-object deallocation overhead and fragmentation.
arena-b v1.1.0 focuses on being useful in production, not just looking fast on paper.
The allocator hot paths were cleaned up, checkpoint bookkeeping was fixed, concurrency semantics were tightened, and benchmarks were updated to measure allocator work instead of setup noise.
Arena allocation follows a simple principle: allocate objects sequentially into a contiguous buffer, then free everything simultaneously when the arena is reset or dropped. This approach eliminates per-object deallocation overhead and avoids memory fragmentation entirely.
```rust
use arena_b::Arena;

fn main() {
    let arena = Arena::new();

    // Allocate objects into the arena
    let numbers: Vec<&u32> = (0..1000)
        .map(|i| arena.alloc(i))
        .collect();

    // All allocations are freed when the arena is dropped.
    // Alternatively, call arena.reset() to free manually.
}
```

- Fast bump allocation: Extremely low-overhead allocations by bumping a pointer inside chunks.
- Thread-local caches: Per-thread hand-offs for the smallest allocations to reduce contention in multithreaded workloads.
- Lock-free fast-paths: Optional lock-free buffer for very small allocations to reduce synchronization overhead.
- Checkpoint & scopes: Save/restore allocation state (`checkpoint`/`rewind_to_checkpoint`) and `scope()` for panic-safe temporary allocations.
- Virtual memory backing (optional): Reserve large address spaces and commit pages on demand to keep the committed footprint small.
- Slab allocator (optional): Size-class based caching for frequent small object sizes.
- Debug tooling (optional): Guard-based use-after-rewind detection, leak reports, and richer diagnostics when the `debug` feature is enabled.
- Fine-grained feature flags: Only enable what you need — `virtual_memory`, `thread_local`, `lockfree`, `slab`, `debug`, `stats`.
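To make the bump-allocation and checkpoint ideas concrete, here is a minimal, illustrative sketch over a fixed byte buffer — this is *not* arena-b's implementation (which manages chunk lists, typed allocation, and safety), just the core pointer-bump and rewind mechanics:

```rust
// Illustrative only: a tiny fixed-buffer bump allocator with checkpoints.
// A real arena additionally handles chunk growth, typed values, and drops.
struct Bump {
    buf: Vec<u8>,
    offset: usize,
}

impl Bump {
    fn with_capacity(cap: usize) -> Self {
        Bump { buf: vec![0u8; cap], offset: 0 }
    }

    /// Reserve `size` bytes aligned to `align` (a power of two);
    /// returns the start offset of the reserved range.
    fn alloc(&mut self, size: usize, align: usize) -> Option<usize> {
        let aligned = (self.offset + align - 1) & !(align - 1);
        if aligned + size > self.buf.len() {
            return None; // out of space; a real arena would grab a new chunk
        }
        self.offset = aligned + size;
        Some(aligned)
    }

    /// A checkpoint is just the current bump offset.
    fn checkpoint(&self) -> usize {
        self.offset
    }

    /// Rewinding frees everything allocated after the checkpoint at once.
    fn rewind_to(&mut self, checkpoint: usize) {
        self.offset = checkpoint;
    }
}
```

Allocation is a single add-and-compare, and bulk deallocation is a single store — that is where the nanosecond-scale numbers in the benchmarks below come from.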
This release is a performance and reliability refresh.
- Removed hidden release-path overhead from allocator internals (unconditional logging/backtrace capture in hot paths).
- Fixed checkpoint bookkeeping so repeated `checkpoint` + `rewind_to_checkpoint` loops stay stable over time.
- Tightened allocation failure handling so unrecoverable allocation failures are explicit.
- Corrected thread-safety semantics: `Arena` is `Send` but intentionally not `Sync`; use `SyncArena` for shared cross-thread access.
- Reworked benchmark methodology to reuse arenas and rewind inside benchmark loops.
| Benchmark | arena-b | baseline | Relative |
|---|---|---|---|
| `alloc_u64/arena_alloc` vs `box_new` | ~4.7 ns | ~42 ns | ~8.9x faster |
| `many_allocs_u64/arena_many` vs `box_many` | ~5.6 µs | ~27 µs | ~4.8x faster |
| `reused_arena_many_u64/arena_reused_scope` | ~5.3 µs | prior run ~33 µs | large improvement |
Results vary by CPU, OS, compiler, and feature flags; run `cargo bench --all -- --quick` in your environment for a direct comparison.
- Stabilized public API and feature-gated modules for predictable builds and smaller compile cost when optional features are disabled.
- `ArenaBuilder` to configure arenas declaratively (chunk size, reserve size, thread safety, diagnostics sink).
- Graceful virtual memory handling: `Arena::with_virtual_memory` logs and falls back in restricted environments — check logs if you require strict failure behavior.
- Improved runtime diagnostics: `Arena::chunk_usage()`, `virtual_memory_committed_bytes()`, and `LockFreeStats::cache_hit_rate()`.
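As a hedge, `ArenaBuilder`'s exact surface is not reproduced here; the sketch below only illustrates the general builder pattern such a type typically follows. Every field and method name (`chunk_size`, `reserve_size`, `thread_safe`) is hypothetical — consult the crate docs for the real API:

```rust
// Hypothetical sketch of a declarative builder; names are illustrative,
// not arena-b's actual API.
#[derive(Debug, Clone)]
struct ArenaConfig {
    chunk_size: usize,
    reserve_size: usize,
    thread_safe: bool,
}

struct ArenaConfigBuilder {
    config: ArenaConfig,
}

impl ArenaConfigBuilder {
    fn new() -> Self {
        // Defaults chosen for the sketch only
        ArenaConfigBuilder {
            config: ArenaConfig { chunk_size: 64 * 1024, reserve_size: 0, thread_safe: false },
        }
    }

    fn chunk_size(mut self, bytes: usize) -> Self {
        self.config.chunk_size = bytes;
        self
    }

    fn reserve_size(mut self, bytes: usize) -> Self {
        self.config.reserve_size = bytes;
        self
    }

    fn thread_safe(mut self, yes: bool) -> Self {
        self.config.thread_safe = yes;
        self
    }

    fn build(self) -> ArenaConfig {
        self.config
    }
}
```

The value of the pattern is that each option is named at the call site and unset options keep documented defaults, rather than threading a long positional argument list through a constructor.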
See CHANGELOG.md for the complete release notes and migration tips.
- Generic Lock-Free Pool: `LockFreePool<T>` provides thread-safe object pooling with atomic CAS operations for game engines, parsers, and high-frequency allocation patterns.
- Lock-Free Allocator Control: `LockFreeAllocator` with runtime `enable()`/`disable()` switching and `cache_hit_rate()` monitoring.
- Thread-Local Slabs: `ThreadSlab` with generation-based invalidation for per-thread fast-path allocations.
- Enhanced Statistics: `LockFreeStats` now supports `cache_hit_rate()`, `record_deallocation()`, and is `Clone`able for snapshots.
- Debug Improvements: `DebugStats` includes a `leak_reports` counter for tracking leak detection calls.
- Proactive Reservation: Call `Arena::reserve_additional(bytes)` to pre-grow the underlying chunk before a known burst of allocations, reducing contention in hot paths.
- Memory Trimming: `Arena::shrink_to_fit` and `Arena::reset_and_shrink` reclaim any extra chunks after a spike, keeping long-running services lean.
- Docs & Tooling: README and changelog now cover the adaptive APIs, and CI stays green with the explicit Clippy allowance on `alloc_str_uninit`.
- New Allocation APIs: `alloc_slice_fast` accelerates small slice copies and `alloc_str_uninit` creates mutable UTF-8 buffers without extra allocations.
- Virtual Memory Telemetry: `virtual_memory_committed_bytes()` reports the current committed footprint, while rewinds/resets now guarantee proper decommit on every platform.
- Lock-Free Overhaul: Per-thread slab caches reduce contention and eliminate previously observed race conditions in the lock-free buffer.
- Panic-Safe Scopes: `Arena::scope` now rolls back automatically even if the scoped closure unwinds, ensuring arenas remain consistent under failure.
- Enhanced Debugging: Runtime validation hooks, leak reports, and optional `debug_backtrace` capture provide deep diagnostics when the `debug` feature is enabled.
- Checkpoints: Mark allocation points and rewind instantly for bulk deallocation
- Debug Mode: Detect use-after-free bugs with guard patterns and pointer validation
- Virtual Memory: Reserve large address spaces without committing physical memory upfront
- Thread-Local Caching: Reduce lock contention in multi-threaded workloads
- Lock-Free Operations: Minimize synchronization overhead in high-contention scenarios
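The lock-free idea behind the last point can be illustrated without arena-b at all: a shared bump offset advanced with a single atomic `fetch_add` lets threads claim disjoint ranges without ever blocking each other. This is a concept sketch, not the crate's internals:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

// Illustrative lock-free bump cursor: threads claim disjoint byte ranges
// from a shared buffer by atomically advancing one offset. No locks and
// no per-object free — reclamation happens by resetting the offset.
struct AtomicBump {
    capacity: usize,
    offset: AtomicUsize,
}

impl AtomicBump {
    fn new(capacity: usize) -> Self {
        AtomicBump { capacity, offset: AtomicUsize::new(0) }
    }

    /// Claim `size` bytes; returns the start offset of the claimed range.
    /// (On failure the cursor has overshot — fine for a sketch, a real
    /// allocator would fall back to grabbing a fresh chunk.)
    fn alloc(&self, size: usize) -> Option<usize> {
        let start = self.offset.fetch_add(size, Ordering::Relaxed);
        if start + size <= self.capacity { Some(start) } else { None }
    }
}

/// Spawn `threads` workers, each claiming `per_thread` 8-byte ranges;
/// returns how many claims succeeded in total.
fn concurrent_claims(threads: usize, per_thread: usize) -> usize {
    let bump = Arc::new(AtomicBump::new(threads * per_thread * 8));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let bump = Arc::clone(&bump);
            std::thread::spawn(move || {
                (0..per_thread).filter(|_| bump.alloc(8).is_some()).count()
            })
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).sum()
}
```

Because `fetch_add` hands every caller a unique start offset, no two threads ever receive overlapping ranges, which is why a buffer sized for all claims serves every one of them.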
Add arena-b to your Cargo.toml:
```toml
[dependencies]
arena-b = "1.1.0"
```

```shell
# Add to your project
cargo add arena-b

# Run examples
cargo run --example parser_expr
cargo run --example game_loop
```

arena-b uses feature flags to minimize compilation overhead:
```toml
# Basic bump allocator
arena-b = "1.1.0"

# Development with safety checks
arena-b = { version = "1.1.0", features = ["debug"] }

# Maximum performance for production
arena-b = { version = "1.1.0", features = ["virtual_memory", "thread_local", "lockfree", "slab"] }
```

| Feature | Description | Performance Impact | When to Use |
|---|---|---|---|
| `debug` | Memory safety validation and use-after-free detection | ~5% overhead | Development & testing |
| `virtual_memory` | Efficient handling of large allocations via reserve/commit | Memory efficient | Large arena allocations |
| `thread_local` | Per-thread allocation buffers to reduce contention | 20-40% faster | Multi-threaded workloads |
| `lockfree` | Lock-free operations for concurrent workloads | 15-25% faster | High-contention scenarios |
| `stats` | Allocation statistics tracking | Minimal overhead | Performance monitoring |
| `slab` | Size-class cache for small allocations | 10-20% faster | Mixed allocation sizes |
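The `slab` feature's size-class idea can also be sketched independently of the crate: round each request up to one of a few size classes and recycle freed blocks through a per-class free list, so repeated small allocations of similar sizes become a cheap pop. Class boundaries and the `alloc`/`free` signatures below are illustrative, not arena-b's API:

```rust
// Illustrative size-class cache: requests round up to a power-of-two
// class, and freed block ids are recycled per class.
struct SizeClassCache {
    // free[i] holds recycled block ids for class i (8, 16, 32, 64 bytes)
    free: Vec<Vec<usize>>,
    next_id: usize,
}

impl SizeClassCache {
    fn new() -> Self {
        SizeClassCache { free: vec![Vec::new(); 4], next_id: 0 }
    }

    fn class_of(size: usize) -> Option<usize> {
        match size {
            1..=8 => Some(0),
            9..=16 => Some(1),
            17..=32 => Some(2),
            33..=64 => Some(3),
            _ => None, // large requests bypass the slab entirely
        }
    }

    /// Returns (block id, true if served from the per-class cache).
    fn alloc(&mut self, size: usize) -> Option<(usize, bool)> {
        let class = Self::class_of(size)?;
        if let Some(id) = self.free[class].pop() {
            return Some((id, true)); // cache hit: reuse a freed block
        }
        let id = self.next_id; // cache miss: hand out a fresh block
        self.next_id += 1;
        Some((id, false))
    }

    fn free(&mut self, id: usize, size: usize) {
        if let Some(class) = Self::class_of(size) {
            self.free[class].push(id);
        }
    }
}
```

Rounding up wastes a little space per block but means a freed 10-byte block can later serve a 12-byte request — the classic size/speed trade behind the table's "10-20% faster" row for mixed small sizes.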
Suitable for game loops or per-request processing:
```rust
use arena_b::Arena;

// (allocate_entities / allocate_particles stand in for app-defined helpers)
fn game_loop() {
    let arena = Arena::new();
    loop {
        let checkpoint = arena.checkpoint();

        // Allocate frame-specific data
        let entities = allocate_entities(&arena);
        let particles = allocate_particles(&arena);

        // Process the frame...

        // Deallocate all frame data at once
        unsafe { arena.rewind_to_checkpoint(checkpoint); }
    }
}
```

Build complex data structures without manual memory management:
```rust
use arena_b::Arena;

struct AstNode<'a> {
    value: String,
    children: Vec<&'a AstNode<'a>>,
}

fn parse_expression<'a>(input: &str, arena: &'a Arena) -> &'a AstNode<'a> {
    let node = arena.alloc(AstNode {
        value: input.to_string(),
        children: Vec::new(),
    });
    // Child nodes are allocated in the same arena;
    // all memory is freed when the arena is dropped.
    node
}
```

Use `SyncArena` for concurrent access:
```rust
use std::sync::Arc;
use arena_b::SyncArena;

fn main() {
    let arena = Arc::new(SyncArena::new());
    let handles: Vec<_> = (0..4)
        .map(|_| {
            let arena = Arc::clone(&arena);
            std::thread::spawn(move || {
                arena.scope(|scope| {
                    // Allocations stay inside the scope; they are rewound
                    // when it ends, so no reference escapes the closure.
                    scope.alloc("thread-local data");
                })
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
}
```

arena-b performs best when allocations are short-lived and reclaimed in bulk.
If your workload naturally has frame/request/phase boundaries, arenas can remove allocator churn and reduce tail latency.
Representative quick benchmark snapshot from this repository (release profile):
| Workload | arena-b | baseline | Notes |
|---|---|---|---|
| single alloc+rewind (`alloc_u64/arena_alloc`) | ~4.7 ns | `box_new` ~42 ns | very strong fit for transient values |
| 1024 allocs+rewind (`many_allocs_u64/arena_many`) | ~5.6 µs | `box_many` ~27 µs | bulk lifetime wins are clear |
| scope-based reuse (`arena_reused_scope`) | ~5.3 µs | prior run ~33 µs | large gain from stable checkpoint bookkeeping |
For your own decisions, always benchmark with your real object sizes and lifetime shapes.
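For a first, rough sanity check before reaching for a proper harness, a std-only timing sketch like the one below can compare a per-object allocation shape against a bulk-reuse shape. It has no warmup and no statistics, so treat it only as a smoke test and use `cargo bench` for real numbers:

```rust
use std::time::{Duration, Instant};

// Rough comparison sketch: one heap allocation per object vs. reusing a
// single buffer in bulk. Not a rigorous benchmark — no warmup, no stats.
fn time_boxed(n: usize) -> Duration {
    let start = Instant::now();
    let mut sum = 0u64;
    for i in 0..n as u64 {
        let b = Box::new(i); // one heap allocation per object
        sum = sum.wrapping_add(*b);
    }
    std::hint::black_box(sum); // keep the work from being optimized away
    start.elapsed()
}

fn time_reused(n: usize) -> Duration {
    let start = Instant::now();
    let mut buf: Vec<u64> = Vec::with_capacity(n); // allocated once, reused
    let mut sum = 0u64;
    for _ in 0..10 {
        buf.clear(); // "rewind": bulk-free without touching the allocator
        for i in 0..n as u64 {
            buf.push(i);
        }
        sum = sum.wrapping_add(buf.iter().sum::<u64>());
    }
    std::hint::black_box(sum);
    start.elapsed()
}
```

The `clear`-and-refill loop mirrors the arena lifetime shape (allocate a batch, process, bulk-free), which is why it is the fairer stand-in here than freeing each object individually.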
Run the comprehensive benchmark suite:

```shell
cargo bench --all
```

View detailed performance reports in the `benches/` directory.
Recommended use cases:
- Parsers and compilers (ASTs and intermediate representations)
- Game engines and simulations (per-frame or transient data)
- Web servers and request-oriented services (per-request temporary data)
Not suitable for:
- Long-lived objects with mixed lifetimes where individual free is required
- Programs that need fine-grained control of each allocation lifetime
```rust
use arena_b::Arena;

let arena = Arena::new();

// Basic allocation
let number = arena.alloc(42u32);
let string = arena.alloc_str("hello world");
let small = arena.alloc_slice_fast(&[1u8, 2, 3]);
let buf = arena.alloc_str_uninit(256); // mutable UTF-8 buffer

// Scoped allocation with automatic cleanup (panic-safe)
arena.scope(|scope| {
    let temp = scope.alloc("temporary data");
    // Automatically rewound even if this closure panics
    assert_eq!(temp, &"temporary data");
});

// Checkpoint-based bulk deallocation
let checkpoint = arena.checkpoint();
// ... perform allocations ...
unsafe { arena.rewind_to_checkpoint(checkpoint); }

// Virtual memory telemetry (feature = "virtual_memory")
#[cfg(feature = "virtual_memory")]
if let Some(bytes) = arena.virtual_memory_committed_bytes() {
    println!("Currently committed: {} bytes", bytes);
}

// Statistics
println!("Allocated: {} bytes", arena.bytes_allocated());
println!("Stats: {:?}", arena.stats());
```

Additional documentation is available in the `docs/` directory:
- `docs/guide.md` — Comprehensive usage guide
- `docs/strategies.md` — Allocation strategy selection
- `docs/advanced.md` — Advanced configuration options
- `docs/architecture.md` — Internal design and implementation details
Working examples are provided in the examples/ directory:
- `parser_expr.rs` — Expression parser with arena-allocated AST
- `game_loop.rs` — Game loop with frame-based allocation
- `graph_pool.rs` — Graph traversal with object pooling
- `string_intern.rs` — String interning implementation
- `v0.5_features.rs` — Demonstration of v0.5 features
Contributions are welcome. Please consider the following:
- Bug reports and feature requests via GitHub Issues
- Performance improvements with benchmark data
- Documentation corrections and improvements
- For significant changes, please open an issue for discussion first
Licensed under the MIT License. See LICENSE for details.
See CHANGELOG.md for complete version history.
- Slab allocator size-class cache (`slab` feature)
- `Arena::chunk_usage()` telemetry for per-chunk capacity/used
- Debug tracking consistency across fast allocation paths