HDF5-backed memory store for on-device AI agents.
EdgeHDF5 is a standalone Rust library that gives any AI agent fast, portable, single-file memory. It stores conversations, embeddings, knowledge graphs, and session history in a single HDF5 file — then searches them at microsecond latency using hardware-adaptive backends (Apple AMX, BLAS, SIMD, GPU).
EdgeHDF5 is framework-agnostic. It works with any agent system that produces embeddings and needs persistent memory — LangChain, custom agent loops, or standalone tools.
`rustystack/edgehdf5` · MIT License · Rust workspace · 2 crates · ~6.7K LOC
| Concern | SQLite + pgvector / Qdrant | EdgeHDF5 (HDF5) |
|---|---|---|
| Deployment | Requires a running database process or client library | Single .h5 file, no daemon, no network |
| Vector search | Bolted-on extension; query planner overhead | Native flat arrays with SIMD/BLAS/GPU dispatch |
| Portability | Tied to OS-level SQLite or container images | One file — copy, snapshot, ship to another device |
| Memory mapping | Page-level I/O with SQLite WAL | Direct mmap of contiguous float arrays |
| Typed storage | Blobs or JSON columns for embeddings | First-class N×D float32/float16 datasets |
| Compression | Row-level, limited | Deflate on datasets; PQ for 32× vector compression |
| On-device AI | Heavy dependency tree | Zero-network, single-file, deterministic I/O |
EdgeHDF5 targets use cases where the agent runs on the user's machine — laptops, edge devices, CI runners — and needs memory that is fast, self-contained, and inspectable.
```
┌──────────────────────────────────────────────────────────────────┐
│                      HDF5Memory (lib.rs)                         │
│         AgentMemory trait: save · search · sessions · kg         │
├──────────┬──────────┬─────────────┬─────────────┬────────────────┤
│  cache   │ session  │  knowledge  │   schema    │    storage     │
│  (LRU)   │   mgmt   │    graph    │   (v1.0)    │   (mmap I/O)   │
├──────────┴──────────┴─────────────┴─────────────┴────────────────┤
│                          Search Layer                            │
│  ┌──────────────────────────────────────────────────────────┐    │
│  │          strategy.rs — Adaptive Dispatch                 │    │
│  │  Scalar → SIMD → BLAS → Accelerate → Rayon → GPU → IVF-PQ│    │
│  └────┬────────┬────────┬──────────┬────────┬────────┬──────┘    │
│       │        │        │          │        │        │           │
│   vector     blas    accel       gpu     ivf+pq   hybrid         │
│   search    search  search     search    index   vec+bm25        │
│   (SIMD)   (sgemm)  (cblas)    (wgpu)    (ANN)    (RRF)          │
├──────────────────────────────────────────────────────────────────┤
│  rustyhdf5 stack: rustyhdf5 · rustyhdf5-io · rustyhdf5-accel     │
│                   rustyhdf5-format · rustyhdf5-gpu               │
└──────────────────────────────────────────────────────────────────┘
                              │
                    ┌─────────┴─────────┐
                    │  agent_memory.h5  │
                    │  /meta            │
                    │  /memory          │
                    │  /sessions        │
                    │  /knowledge_graph │
                    └───────────────────┘
```
| Crate | Path | Description |
|---|---|---|
| `edgehdf5-memory` | `crates/edgehdf5-memory` | Core library — memory store, search backends, knowledge graph, sessions |
| `edgehdf5-migrate` | `crates/edgehdf5-migrate` | CLI tool to migrate existing SQLite agent databases to HDF5 |
```toml
[dependencies]
edgehdf5-memory = { git = "ssh://git@github.com/rustystack/edgehdf5.git", features = ["float16"] }
```

For Apple Silicon machines, enable hardware-accelerated search:

```toml
edgehdf5-memory = { git = "ssh://git@github.com/rustystack/edgehdf5.git", features = ["float16", "accelerate"] }
```

```rust
use edgehdf5_memory::{HDF5Memory, MemoryConfig, MemoryEntry, AgentMemory};
use std::path::PathBuf;

// Configure the memory store
let config = MemoryConfig {
    path: PathBuf::from("agent_memory.h5"),
    agent_id: "my-agent".into(),
    embedder: "openai:text-embedding-3-small".into(),
    embedding_dim: 384,
    chunk_size: 512,
    overlap: 50,
    float16: true,
    compression: true,
    compression_level: 4,
    compact_threshold: 0.3,
    created_at: "2025-01-01T00:00:00Z".into(),
};
let mut memory = HDF5Memory::create(config)?;

// Add a memory entry
let entry = MemoryEntry {
    chunk: "The user prefers dark mode and uses vim keybindings.".into(),
    embedding: embed("The user prefers dark mode..."), // your embedding fn
    source_channel: "chat".into(),
    timestamp: 1700000000.0,
    session_id: "session-001".into(),
    tags: "preference,ui".into(),
};
let id = memory.save(entry)?;

// Add a session summary
memory.add_session("session-001", 0, 5, "chat", "Discussed UI preferences")?;
```

```rust
use edgehdf5_memory::vector_search::{cosine_similarity_batch_prenorm, top_k, compute_norm};

let query = embed("What are the user's UI preferences?");
let query_norm = compute_norm(&query);

// Get the in-memory cache for search
let cache = memory.cache();
let scores = cosine_similarity_batch_prenorm(
    &query,
    &cache.embeddings,
    &cache.norms,
    &cache.tombstones,
);
let results = top_k(&scores, 5);
for (idx, score) in &results {
    println!("[{:.3}] {}", score, cache.chunks[*idx]);
}
```

```rust
use edgehdf5_memory::bm25::BM25Index;
use edgehdf5_memory::hybrid::hybrid_search;

let bm25 = BM25Index::build(&cache.chunks, &cache.tombstones);
let results = hybrid_search(
    &query_embedding,
    "user UI preferences dark mode",
    &cache.embeddings,
    &cache.chunks,
    &cache.tombstones,
    &bm25,
    0.7, // vector weight
    0.3, // keyword weight
    5,   // top-k
);
```

```rust
use edgehdf5_memory::strategy::{auto_select_strategy, search_with_metrics, HardwareCapabilities};

let hw = HardwareCapabilities {
    rayon_available: cfg!(feature = "parallel"),
    gpu_available: false,
    blas_available: cfg!(feature = "fast-math"),
    accelerate_available: cfg!(feature = "accelerate"),
};
let strategy = auto_select_strategy(cache.embeddings.len(), &hw);
// Returns: Scalar | SimdBruteForce | Blas | Accelerate | RayonParallel | Gpu | IvfPq

let (results, metrics) = search_with_metrics(
    &query, &cache.embeddings, &cache.norms, &cache.tombstones,
    10, strategy, &mut None,
);
println!("Strategy: {}, Time: {}µs", metrics.strategy, metrics.search_time_us);
```

```rust
let entity_a = memory.add_entity("Alice", "person", -1)?;
let entity_b = memory.add_entity("ProjectX", "project", -1)?;
memory.add_relation(entity_a, entity_b, "works_on", 1.0)?;
let relations = memory.knowledge().get_relations_from(entity_a);
```

```rust
use edgehdf5_memory::async_memory::{AsyncHDF5Memory, AsyncConfig};
use std::time::Duration;

let config = AsyncConfig {
    flush_interval: Duration::from_secs(10),
    flush_threshold: 100,
};
let mem = AsyncHDF5Memory::open_with("agent_memory.h5", config).await?;

// Saves are buffered and batched by a background writer task
mem.save(entry).await?;
mem.save_batch(entries).await?;

// Searches are offloaded to spawn_blocking
let results = mem.hybrid_search(embedding, "query".into(), 0.7, 0.3, 5).await;

// Graceful shutdown flushes remaining writes
mem.shutdown().await?;
```

The async wrapper keeps the synchronous `HDF5Memory` core on a background thread, batches saves through an mpsc channel, and auto-flushes on a configurable interval or when pending WAL entries exceed a threshold.
Benchmarked on MacBook Pro M3 Max, 384-dimensional embeddings:
| Backend | 1K vectors | 10K vectors | 100K vectors | Notes |
|---|---|---|---|---|
| Scalar (baseline) | 42µs | 410µs | 4.1ms | No SIMD, no dependencies |
| SIMD brute-force | 18µs | 175µs | 1.7ms | rustyhdf5-accel auto-dispatch |
| Apple Accelerate (cblas) | 15µs | 157µs | 1.5ms | AMX coprocessor via Accelerate.framework |
| BLAS (matrixmultiply) | 17µs | 168µs | 1.6ms | Cross-platform, no system libs |
| Rayon parallel | 35µs | 120µs | 980µs | Scales with core count |
| GPU (wgpu) | 200µs | 190µs | 650µs | Amortized; wins at scale |
| IVF-PQ | N/A (overhead > brute-force) | 850µs | 380µs | 6.2× faster than numpy; index build amortized |
Adaptive strategy thresholds:
| Collection Size | Auto-Selected Strategy |
|---|---|
| < 1,000 | Scalar brute-force |
| 1K – 10K | SIMD brute-force (or Accelerate/BLAS if available) |
| 10K – 50K | Rayon parallel (or Accelerate/BLAS) |
| 50K – 500K | GPU (if available) or Accelerate/BLAS |
| > 500K | IVF-PQ approximate search with exact reranking |
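As a rough illustration, the threshold table above can be expressed as a pure function. This is a hypothetical simplification — the real `auto_select_strategy` also weighs `HardwareCapabilities` such as BLAS and Accelerate availability:

```rust
// Strategy names mirror the table above; this sketch only models
// collection size and GPU availability.
#[derive(Debug, PartialEq)]
enum Strategy {
    Scalar,
    SimdBruteForce,
    RayonParallel,
    Gpu,
    IvfPq,
}

fn select_strategy(n_vectors: usize, gpu_available: bool) -> Strategy {
    match n_vectors {
        0..=999 => Strategy::Scalar,
        1_000..=9_999 => Strategy::SimdBruteForce,
        10_000..=49_999 => Strategy::RayonParallel,
        50_000..=499_999 if gpu_available => Strategy::Gpu,
        50_000..=499_999 => Strategy::SimdBruteForce,
        _ => Strategy::IvfPq, // > 500K: approximate search with exact reranking
    }
}

fn main() {
    assert_eq!(select_strategy(500, false), Strategy::Scalar);
    assert_eq!(select_strategy(5_000, false), Strategy::SimdBruteForce);
    assert_eq!(select_strategy(100_000, true), Strategy::Gpu);
    assert_eq!(select_strategy(1_000_000, false), Strategy::IvfPq);
    println!("ok");
}
```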
Storage efficiency with Product Quantization:
- 384-dim float32 (1,536 bytes) → 48 bytes per vector (PQ codes) = 32× compression
- 100K vectors: 146 MB (raw) → 4.6 MB (PQ codes + codebook)
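The arithmetic behind those figures, assuming 48-byte PQ codes per vector and ignoring the small codebook:

```rust
fn main() {
    let (n, dim) = (100_000u64, 384u64);
    let raw_bytes = n * dim * 4; // float32 = 4 bytes per component
    let pq_bytes = n * 48;       // one 48-byte PQ code per vector

    println!("raw: {:.1} MiB", raw_bytes as f64 / (1024.0 * 1024.0)); // ~146.5 MiB
    println!("pq:  {:.1} MiB", pq_bytes as f64 / (1024.0 * 1024.0));  // ~4.6 MiB
    println!("compression: {}x", raw_bytes / pq_bytes);               // 32x
}
```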
| Feature | Default | Description |
|---|---|---|
| `float16` | yes | Half-precision embedding storage via the `half` crate |
| `parallel` | no | Rayon-based parallel search |
| `fast-math` | no | BLAS matrix-vector multiply via `matrixmultiply` (cross-platform) |
| `accelerate` | no | Apple Accelerate framework — `cblas_sgemv` on AMX (macOS only) |
| `openblas` | no | OpenBLAS `cblas_sgemv` (Linux) |
| `gpu` | no | GPU-accelerated search via wgpu (Metal/Vulkan/DX12) |
| `async` | no | Tokio-based async wrapper with background flush (batched writes, auto-flush) |
```toml
# macOS / Apple Silicon (best performance)
features = ["float16", "accelerate", "parallel"]

# Linux server
features = ["float16", "openblas", "parallel"]

# Cross-platform (no system library dependencies)
features = ["float16", "fast-math", "parallel"]

# Maximum acceleration (macOS with GPU)
features = ["float16", "accelerate", "parallel", "gpu"]

# Async agent runtime (tokio)
features = ["float16", "accelerate", "parallel", "async"]
```

| Platform | Status | Optimized Backend |
|---|---|---|
| macOS / Apple Silicon | Primary target | Accelerate (AMX), Metal (wgpu) |
| macOS / Intel | Supported | Accelerate (SSE/AVX), Metal (wgpu) |
| Linux x86_64 | Supported | OpenBLAS, Vulkan (wgpu) |
| Linux aarch64 | Supported | OpenBLAS, Vulkan (wgpu) |
| Windows x86_64 | Supported | matrixmultiply, DX12 (wgpu) |
All platforms fall back to scalar or SIMD brute-force when no accelerated backend is compiled in.
The edgehdf5-migrate CLI converts existing SQLite agent memory databases to HDF5.
```sh
cargo install --path crates/edgehdf5-migrate
```

```sh
edgehdf5-migrate \
  --sqlite old_memory.db \
  --hdf5 agent_memory.h5 \
  --agent-id my-agent \
  --embedder openai:text-embedding-3-small \
  --embedding-dim 384 \
  --skip-deleted \
  --compression \
  --compression-level 6 \
  --verbose
```

| Flag | Default | Description |
|---|---|---|
| `--sqlite` | (required) | Path to source SQLite database |
| `--hdf5` | (required) | Path to output HDF5 file |
| `--agent-id` | `migrated` | Agent identifier to write into `/meta` |
| `--embedder` | `unknown` | Embedder model name |
| `--embedding-dim` | (auto-detect) | Embedding dimensionality; auto-detected from the first row if omitted |
| `--skip-deleted` | false | Skip rows with `deleted=1` |
| `--compression` | false | Enable deflate compression |
| `--compression-level` | 4 | Compression level (1–9) |
| `--float16` | false | Store embeddings as float16 (halves storage) |
| `--dry-run` | false | Validate migration without writing the output file |
| `--verbose` | false | Print progress and statistics |
The migration tool reads from these tables:
- `memory_chunks` — `chunk TEXT, embedding BLOB, source_channel TEXT, timestamp REAL, session_id TEXT, tags TEXT, deleted INTEGER`
- `sessions` — `id TEXT, start_idx INTEGER, end_idx INTEGER, channel TEXT, timestamp REAL, summary TEXT`
- `entities` — `id INTEGER, name TEXT, entity_type TEXT, embedding_idx INTEGER`
- `relations` — `src INTEGER, tgt INTEGER, relation TEXT, weight REAL, ts REAL`
```
agent_memory.h5
├── /meta (attributes)
│   ├── schema_version: "1.0"
│   ├── edgehdf5_version: "1.93.0"
│   ├── agent_id, embedder, embedding_dim
│   ├── chunk_size, overlap
│   └── created_at
├── /memory
│   ├── chunks: string[N] (NullPad encoded)
│   ├── embeddings: f32[N × D]
│   ├── source_channel: string[N]
│   ├── timestamps: f64[N]
│   ├── session_ids: string[N]
│   ├── tags: string[N]
│   ├── tombstones: u8[N] (0=active, 1=deleted)
│   └── norms: f32[N] (pre-computed L2 norms)
├── /sessions
│   ├── ids: string[S]
│   ├── start_idxs: i64[S]
│   ├── end_idxs: i64[S]
│   ├── channels: string[S]
│   ├── timestamps: f64[S]
│   └── summaries: string[S]
└── /knowledge_graph
    ├── entity_ids: i64[E]
    ├── entity_names: string[E]
    ├── entity_types: string[E]
    ├── entity_emb_idxs: i64[E]
    ├── relation_srcs: i64[R]
    ├── relation_tgts: i64[R]
    ├── relation_types: string[R]
    ├── relation_weights: f32[R]
    └── relation_ts: f64[R]
```
1. Add the dependency with the feature flags appropriate for your target platform (see Feature Flags).
2. Create a `MemoryConfig` with your agent's embedding model and dimensionality.
3. Use the `AgentMemory` trait for all memory operations:
   - `save()` / `save_batch()` — add memories
   - `delete()` — tombstone a memory
   - `compact()` — reclaim space when the tombstone fraction exceeds `compact_threshold`
   - `snapshot()` — create a timestamped backup copy
   - `add_session()` / `get_session_summary()` — session management
4. Choose your search path:
   - Simple: use `cosine_similarity_batch_prenorm` + `top_k` for brute-force vector search.
   - Hybrid: build a `BM25Index` and call `hybrid_search` for combined semantic + keyword retrieval.
   - Adaptive: use `auto_select_strategy` + `search_with_metrics` to automatically pick the fastest backend for your collection size.
   - Large scale: build an `IVFPQIndex` for collections above 100K vectors.
5. Persist to disk — `HDF5Memory` uses atomic writes (temp file + rename), so crashes never corrupt the file.
6. Open existing files with `HDF5Memory::open(path)` — schema validation is automatic.
`HDF5Memory` is `Send` but not `Sync`. For concurrent access, wrap it in `Arc<Mutex<HDF5Memory>>`, use a single-writer pattern with the snapshot mechanism for readers, or use `AsyncHDF5Memory` (feature `async`), which handles concurrency internally via a background writer task.
All fallible operations return `Result<T, MemoryError>`. The error variants are:

- `MemoryError::Io` — file system errors
- `MemoryError::Hdf5` — HDF5 format or parsing errors
- `MemoryError::Schema` — schema version mismatch or missing fields
- `MemoryError::NotFound` — requested entity/session does not exist
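Dispatching on the variants looks like this — sketched with a local mirror enum since the payload types are not shown above; the real `MemoryError` lives in `edgehdf5-memory`:

```rust
// Local mirror of the documented MemoryError variants, for illustration only.
#[derive(Debug)]
enum MemoryError {
    Io(String),
    Hdf5(String),
    Schema(String),
    NotFound(String),
}

fn describe(err: &MemoryError) -> &'static str {
    match err {
        MemoryError::Io(_) => "file system error",
        MemoryError::Hdf5(_) => "HDF5 format or parsing error",
        MemoryError::Schema(_) => "schema version mismatch or missing field",
        MemoryError::NotFound(_) => "requested entity/session does not exist",
    }
}

fn main() {
    let err = MemoryError::Schema("expected schema 1.0".into());
    assert_eq!(describe(&err), "schema version mismatch or missing field");
    println!("ok");
}
```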
```sh
# Default (float16 only)
cargo build -p edgehdf5-memory

# With Apple Accelerate
cargo build -p edgehdf5-memory --features accelerate

# All features (macOS)
cargo build -p edgehdf5-memory --features "float16,accelerate,parallel,gpu"

# Migration CLI
cargo build -p edgehdf5-migrate --release
```

```sh
cargo test --workspace
```

```sh
cargo bench -p edgehdf5-memory
```

EdgeHDF5 is a fully independent library — no external agent framework required.
- Import `edgehdf5-memory` directly into any Rust project
- Implement your own embedding pipeline; EdgeHDF5 stores and searches pre-computed vectors
- The `AgentMemory` trait is a simple interface: `save`, `search`, `delete`, `compact`, `snapshot`
- Works with any embedding model (OpenAI, Cohere, local models, etc.)
- The `edgehdf5-migrate` CLI converts existing SQLite agent memory databases to HDF5
EdgeHDF5 depends on the rustyhdf5 stack for HDF5 I/O:
| Crate | Role |
|---|---|
| `rustyhdf5-format` | HDF5 format definitions |
| `rustyhdf5` | Core HDF5 read/write |
| `rustyhdf5-io` | Memory-mapped file I/O |
| `rustyhdf5-accel` | SIMD acceleration primitives |
| `rustyhdf5-gpu` | wgpu GPU compute backend (optional) |
No C HDF5 library is required — rustyhdf5 is a pure Rust HDF5 implementation.
MIT