make opening index size inspection O(1) by zehiko · Pull Request #26 · rerun-io/lance

zehiko · 2026-06-18T15:38:38Z

Problem

LanceIndexStore::deep_size_of_children counted object_store + index_dir + metadata_cache. object_store and metadata_cache are Arcs to process-wide shared infrastructure, not data the index owns.

A cached BTreeIndex reaches its LanceIndexStore (and thus metadata_cache) via Arc, so moka's byte-weigher — which calls deep_size_of() on every index-cache insert — walked the entire shared metadata cache to weigh a single index. Two consequences:

Cost: each scalar-index open became O(metadata-cache size) (LanceCache::deep_size_of_children → approx_size_bytes() iterates every entry). On a large shared cache this dominates CPU.
Mis-weighting: every index entry was charged ≈ the whole metadata cache, inflating and equalizing index-cache entry weights, so size-based eviction evicted index objects almost immediately and re-opened them on nearly every query.

In production this showed up as open_scalar_index on essentially every query, each paying the O(metadata-cache) walk.

Fix

Count only data the store owns: index_dir + file_sizes. This mirrors BTreeIndex::deep_size_of_children, which already skips its index_cache field to avoid circular references.

It deliberately does not touch approx_size_bytes / LanceCache::deep_size_of_children (no weighted_size), so it carries no telemetry-accuracy tradeoff — the per-open path simply never reaches the cache.

Testing done

Microbenchmark timing the real LanceIndexStore::deep_size_of() vs metadata-cache fill N (release-7.0.0 vs this branch, same harness):

N (metadata entries)	release-7.0.0	this fix
1	5.6 µs	~1 ns
1,000	91 µs	~1 ns
10,000	767 µs	~1 ns
100,000	19.5 ms	~1 ns

O(N) → O(1), independent of cache fill.

stack testing - WIP

github-actions · 2026-06-18T15:38:59Z

ACTION NEEDED
Lance follows the Conventional Commits specification for release automation.

The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification.

For details on the error please inspect the "PR Title Check" action.

LanceIndexStore::deep_size_of_children counted object_store + metadata_cache, both Arc-shared process-wide infrastructure, not data the index owns. A cached BTreeIndex reaches this store via Arc, so every index-cache entry was charged the entire shared metadata cache: - mis-weighting: inflated/equalized index-cache entry weights so size-based eviction evicted index objects immediately and re-opened them every query (the production open_scalar_index churn); - cost: each index-cache insert walked the metadata cache (approx_size_bytes iterates all entries) -> O(metadata-cache size) per open. Count only owned data (index_dir + file_sizes). Mirrors BTreeIndex, which already skips its index_cache field.

zehiko · 2026-06-19T08:27:44Z

This path is no longer relevant after Lance lance-format#6222, see details in https://linear.app/rerun/issue/RR-4918/maven-slow-query-investigation

github-actions Bot added the bug Something isn't working label Jun 18, 2026

zehiko marked this pull request as draft June 18, 2026 15:40

zehiko force-pushed the zehiko/deepsize-fix-release-7.0.0 branch from 531c10d to 416dd77 Compare June 18, 2026 15:43

zehiko changed the title ~~fix(index): exclude shared infra from LanceIndexStore DeepSizeOf (O(metadata-cache)→O(1) per scalar-index open)~~ make opening index size inspection O(1) Jun 18, 2026

zehiko closed this Jun 19, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

make opening index size inspection O(1)#26

make opening index size inspection O(1)#26
zehiko wants to merge 1 commit into
release-7.0.0from
zehiko/deepsize-fix-release-7.0.0

zehiko commented Jun 18, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented Jun 18, 2026

Uh oh!

zehiko commented Jun 19, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

zehiko commented Jun 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Problem

Fix

Testing done

Uh oh!

github-actions Bot commented Jun 18, 2026

Uh oh!

zehiko commented Jun 19, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

zehiko commented Jun 18, 2026 •

edited

Loading