[improvement](be) Add stampede protection for AnnIndexIVFListCache #62442

Merged

hello-stephen merged 3 commits into apache:master from zhiqiang-hhhh:ivf-cache-stampede-protection on Apr 16, 2026

Conversation

@zhiqiang-hhhh
Contributor

Summary

  • Add double-check locking pattern to CachedRandomAccessReader::borrow() to prevent concurrent threads from redundantly reading the same IVF list data from disk (cache stampede)
  • Eliminate ghost memory caused by LRU cache entry replacement under high concurrency, reducing usage ratio fluctuations
  • Zero overhead on the fast path (cache hit); reuses the existing _io_mutex without introducing new synchronization primitives

Proposed Changes

When multiple threads concurrently miss the AnnIndexIVFListCache for the same key, they all independently read the same data from disk and insert it into the cache. The LRU cache silently replaces duplicate entries (counted as "stampede"), causing:

  1. Redundant disk I/O: N concurrent misses produce N identical reads instead of 1
  2. Ghost memory: replaced entries whose handles are still pinned consume memory not tracked by cache _usage, leading to usage ratio fluctuations (observed 87%→50% under high concurrency)
  3. Lower cache hit ratio: unnecessary evictions caused by inflated insertion volume
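The "ghost memory" effect in point 2 can be reproduced with a toy cache. All names below (ToyCache, its fields) are hypothetical stand-ins, not the Doris LRU implementation; the point is only that usage accounting follows the map, while a replaced entry stays alive as long as a handle pins it:

```cpp
#include <memory>
#include <string>
#include <unordered_map>

// Toy cache: tracks byte usage only for entries currently in the map.
// When a duplicate insert replaces an entry whose handle is still pinned
// elsewhere, the old buffer leaves the accounting but remains allocated.
struct ToyCache {
    std::unordered_map<std::string, std::shared_ptr<std::string>> map;
    size_t usage = 0;  // bytes tracked by the cache, like an LRU _usage field

    std::shared_ptr<std::string> insert(const std::string& key,
                                        std::string value) {
        auto entry = std::make_shared<std::string>(std::move(value));
        auto it = map.find(key);
        if (it != map.end()) {
            usage -= it->second->size();  // old entry leaves the accounting...
            it->second = entry;           // ...but lives on while pinned
        } else {
            map.emplace(key, entry);
        }
        usage += entry->size();
        return entry;  // the returned handle pins the value
    }
};
```

Two concurrent loads of the same key leave the cache accounting for one copy while two copies are actually resident, which is why the observed usage ratio diverges from real memory consumption under contention.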

The fix leverages the existing _io_mutex (required because CLucene IndexInput is stateful) by moving its acquisition before the disk read and adding a cache re-check after acquiring the lock. If a preceding thread already loaded the same key, subsequent threads skip the disk I/O entirely.

What problem does this PR solve?

Problem Summary: Under high concurrency (e.g., vectordb_bench with 10-90 concurrent threads), the AnnIndexIVFListCache exhibits excessive stampede events, redundant disk I/O, and usage ratio fluctuations due to the TOCTOU gap between cache lookup and insert.

Release note

Reduced redundant disk I/O and memory waste for IVF on-disk vector index cache under concurrent queries by adding stampede protection (double-check locking pattern) to CachedRandomAccessReader::borrow().

Check List (For Author)

  • Test: Manual test with vectordb_bench concurrency benchmark
  • Behavior changed: No
  • Does this need documentation: No

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@zhiqiang-hhhh
Contributor Author

run buildall

@zhiqiang-hhhh zhiqiang-hhhh marked this pull request as ready for review April 13, 2026 09:43
@hello-stephen
Contributor

BE UT Coverage Report

Increment line coverage 0.00% (0/7) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 53.05% (20145/37975)
Line Coverage 36.59% (189433/517721)
Region Coverage 32.88% (147232/447730)
Branch Coverage 33.99% (64403/189462)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 57.14% (4/7) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 73.65% (27388/37189)
Line Coverage 57.31% (295779/516137)
Region Coverage 54.55% (246507/451862)
Branch Coverage 56.17% (106742/190044)

…ampede protection

### What problem does this PR solve?

Issue Number: close #xxx

Problem Summary: The stampede protection double-check path in
CachedRandomAccessReader::borrow() (lines 372-375) had 0% test coverage.
Multiple threads concurrently missing the fast-path cache lookup would
all acquire the I/O mutex, but only the first thread performs disk I/O;
the remaining threads should hit the cache on re-check. This path was
never exercised by any existing test.

### Release note

None

### Check List (For Author)

- Test: Unit Test / Integration Test
    - Cache-level unit tests (AnnIndexIVFListCacheTest): basic ops, miss paths,
      multi-entry, stampede simulation (single & multi-key), singleton lifecycle
    - Integration tests (VectorSearchTest): IVF_ON_DISK save/load/search and
      concurrent search with 8 threads to exercise the actual borrow()
      double-check cache hit path
- Behavior changed: No
- Does this need documentation: No

💘 Generated with Crush

Assisted-by: Claude Opus 4.6 via Crush <crush@charm.land>
@zhiqiang-hhhh
Contributor Author

run buildall

@hello-stephen
Contributor

BE UT Coverage Report

Increment line coverage 57.14% (4/7) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 53.20% (20199/37969)
Line Coverage 36.68% (189892/517682)
Region Coverage 32.96% (147564/447719)
Branch Coverage 34.06% (64522/189423)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 57.14% (4/7) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 73.61% (27370/37183)
Line Coverage 57.25% (295463/516098)
Region Coverage 54.32% (245443/451851)
Branch Coverage 56.09% (106577/190005)

### What problem does this PR solve?

Problem Summary: The concurrent stampede protection test created separate
FaissVectorIndex instances per thread, each with its own
CachedRandomAccessReader and _io_mutex. This meant threads never
contended on the same mutex, so the double-check cache hit path
(faiss_ann_index.cpp:373-375) was never exercised.

Fix: all threads now share a single FaissVectorIndex instance so they
contend on the same _io_mutex, allowing the double-check lookup to find
entries cached by a preceding thread.
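The difference between the flawed and the fixed test setup can be sketched in isolation. SharedCache, Reader, and the run_* helpers below are hypothetical stand-ins, not the actual Doris or test classes; only the shared-versus-per-thread io_mutex distinction mirrors the PR. With per-thread readers the double-check can still hit by luck, but nothing forces it, so the disk-read count is nondeterministic (anywhere from 1 to N), which is exactly why the original test never reliably exercised the path. With one shared reader it is always 1:

```cpp
#include <atomic>
#include <mutex>
#include <string>
#include <thread>
#include <unordered_map>
#include <vector>

// Shared cache, like the singleton AnnIndexIVFListCache.
struct SharedCache {
    std::mutex mu;
    std::unordered_map<std::string, std::string> entries;
    bool lookup(const std::string& k, std::string* v) {
        std::lock_guard<std::mutex> g(mu);
        auto it = entries.find(k);
        if (it == entries.end()) return false;
        *v = it->second;
        return true;
    }
    void insert(const std::string& k, const std::string& v) {
        std::lock_guard<std::mutex> g(mu);
        entries.emplace(k, v);
    }
};

struct Reader {
    SharedCache& cache;
    std::mutex io_mutex;  // per-reader, like _io_mutex
    static std::atomic<int> disk_reads;

    std::string borrow(const std::string& key) {
        std::string v;
        if (cache.lookup(key, &v)) return v;
        std::lock_guard<std::mutex> g(io_mutex);
        if (cache.lookup(key, &v)) return v;  // hits reliably only if io_mutex is shared
        ++disk_reads;                         // "disk" read
        v = "data:" + key;
        cache.insert(key, v);
        return v;
    }
};
std::atomic<int> Reader::disk_reads{0};

// Original test's flaw: one Reader per thread, so each thread owns its own
// io_mutex, nobody contends, and several threads may reach the disk read.
int run_per_thread_readers(int n) {
    Reader::disk_reads = 0;
    SharedCache cache;
    std::vector<std::thread> ts;
    for (int i = 0; i < n; ++i)
        ts.emplace_back([&] { Reader r{cache}; r.borrow("k"); });
    for (auto& t : ts) t.join();
    return Reader::disk_reads.load();
}

// Fixed setup: one shared Reader, so all threads contend on the same
// io_mutex and exactly one performs disk I/O; the rest hit the re-check.
int run_shared_reader(int n) {
    Reader::disk_reads = 0;
    SharedCache cache;
    Reader r{cache};
    std::vector<std::thread> ts;
    for (int i = 0; i < n; ++i) ts.emplace_back([&] { r.borrow("k"); });
    for (auto& t : ts) t.join();
    return Reader::disk_reads.load();
}
```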

### Release note

None

### Check List (For Author)

- Test: Unit Test
- Behavior changed: No
- Does this need documentation: No

💘 Generated with Crush

Assisted-by: Claude Opus 4.6 via Crush <crush@charm.land>
@zhiqiang-hhhh
Contributor Author

run buildall

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 100.00% (7/7) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 73.57% (27375/37209)
Line Coverage 57.26% (295767/516491)
Region Coverage 54.63% (247033/452210)
Branch Coverage 56.21% (106909/190196)

@zhiqiang-hhhh
Contributor Author

/review

@github-actions github-actions Bot added the approved label (Indicates a PR has been approved by one committer.) Apr 16, 2026
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

PR approved by anyone and no changes requested.

Contributor

@github-actions github-actions Bot left a comment


{
  "overall": "No blocking issues found. The patch correctly narrows the cache stampede window in CachedRandomAccessReader::borrow() by rechecking the shared AnnIndexIVFListCache under the existing _io_mutex, and the added tests are focused on the affected behavior.",
  "findings": [],
  "critical_checkpoints": {
    "goal_and_proof": "Pass. The goal is to avoid redundant IVF_ON_DISK list reads for concurrent misses on the same reader and add coverage for that path. borrow() now rechecks the cache while holding _io_mutex before doing disk I/O, and targeted BE unit and integration tests were added.",
    "minimality": "Pass. The functional change is localized to the miss path in CachedRandomAccessReader plus focused cache and ANN tests.",
    "concurrency": "Pass. The concurrent actors are query threads sharing one CachedRandomAccessReader. The stateful CLucene IndexInput remains protected by _io_mutex. The added locked work is only a second cache lookup before the existing read path, and I did not find lock order or deadlock issues.",
    "lifecycle_and_static_init": "Pass. Cache handle ownership remains RAII-based through PageCacheHandle, cache values still inherit LRUCacheValueBase, and the change introduces no new static initialization dependency.",
    "configuration": "Not applicable. No new config was added.",
    "compatibility": "Not applicable. No protocol, symbol, or storage format change was introduced.",
    "parallel_paths": "Pass. The change applies to the IVF_ON_DISK RandomAccessReader miss path, which is the relevant parallel read path for this cache behavior.",
    "conditional_checks": "Pass. The double-check branch is documented in code and matches the surrounding CLucene synchronization model.",
    "test_coverage": "Pass with note. Added cache-level unit tests and IVF_ON_DISK integration tests. The public macOS BE UT failure is an environment issue (runner JDK 25 vs required JDK 17), not a code failure from this PR.",
    "observability": "Pass. Existing hit, miss, and fetch-page metrics are preserved and remain sufficient for diagnosing this path.",
    "transaction_and_persistence": "Not applicable. No transaction, persistence, or edit log logic was changed.",
    "data_writes": "Not applicable. This PR only changes the read-side on-disk IVF cache path.",
    "fe_be_variable_passing": "Not applicable. No FE/BE interface changes were introduced.",
    "performance": "Pass. The cache-hit fast path stays lock-free, and the miss path now avoids redundant disk reads and duplicate cache insertions under contention."
  },
  "residual_risks": [
    "The coverage is BE-unit focused. There is no regression-test suite change because the behavior is internal to the BE IVF_ON_DISK cache path."
  ],
  "decision": "approve"
}

@hello-stephen hello-stephen reopened this Apr 16, 2026
@hello-stephen hello-stephen merged commit 4952cad into apache:master Apr 16, 2026
45 of 47 checks passed
github-actions Bot pushed a commit that referenced this pull request Apr 16, 2026
…62442)

yiguolei pushed a commit that referenced this pull request Apr 16, 2026
…ListCache #62442 (#62567)

Cherry-picked from #62442

Co-authored-by: zhiqiang <seuhezhiqiang@163.com>
@zhiqiang-hhhh zhiqiang-hhhh deleted the ivf-cache-stampede-protection branch April 17, 2026 02:55

Labels

approved (Indicates a PR has been approved by one committer.), dev/4.1.1-merged, reviewed
