feat(search): embedding-only semantic highlight + native-image dense fix #88

Merged
kdroidFilter merged 1 commit into master from feat/semantic-highlight on Jun 26, 2026

@kdroidFilter

Owner

Summary

  • Embedding-only highlight: the smart find/highlight no longer uses dictionary expansion (scattered, meaning-blind word matches). semanticSpan(query, text) returns the contiguous passage closest in meaning to the query (same v5 encoder as dense search); semanticFind(query, bookId, limit) runs a book-scoped dense KNN for smart find-in-page.
  • Removed buildHighlightTerms (dictionary-based) from the engine.
  • Native-image fix: under GraalVM native image, VectorSearcher now opens the Lucene index with NIOFSDirectory, mirroring LuceneSearchEngine. MMapDirectory's MemorySegmentIndexInputProvider relies on Panama foreign downcalls that can't be instantiated there, which raises a LinkageError. Without this fix, dense search and highlight were silently disabled in the native binary and the engine fell back to lexical+dictionary.
  • Robustness: fuse() degrades to lexical-only (with a log) instead of throwing when the dense backend fails; denseReady() warms/probes the dense backend.
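
For readers unfamiliar with span selection, here is a minimal Java sketch of the idea behind semanticSpan: score contiguous sentence windows against the query embedding and keep the best one. It is an illustration under stated assumptions, not the code in this PR. The class name, the `maxSentences` parameter, the sentence splitter, the `String` return type, and the `toyEmbed` stand-in (a hashed bag of words, not the v5 encoder) are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch: pick the contiguous passage whose embedding is
// closest (by cosine similarity) to the query embedding.
final class SemanticSpanSketch {

    /** Cosine similarity; returns 0 when either vector has zero norm. */
    static double cosine(float[] a, float[] b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        return (na == 0 || nb == 0) ? 0 : dot / Math.sqrt(na * nb);
    }

    /** Toy stand-in encoder (hashed bag of words). NOT the real encoder. */
    static float[] toyEmbed(String s) {
        float[] v = new float[256];
        for (String w : s.toLowerCase().split("[^\\p{L}\\p{N}]+")) {
            if (!w.isEmpty()) v[Math.floorMod(w.hashCode(), v.length)] += 1f;
        }
        return v;
    }

    /** Naive splitter: breaks after '.', '!' or '?' followed by whitespace. */
    static List<String> sentences(String text) {
        List<String> out = new ArrayList<>();
        for (String s : text.split("(?<=[.!?])\\s+")) {
            if (!s.isBlank()) out.add(s.trim());
        }
        return out;
    }

    /**
     * Returns the best-scoring window of 1..maxSentences consecutive
     * sentences, or "" when the text has no sentences.
     */
    static String semanticSpan(String query, String text, int maxSentences,
                               Function<String, float[]> embed) {
        List<String> sents = sentences(text);
        float[] q = embed.apply(query);
        String best = "";
        double bestScore = Double.NEGATIVE_INFINITY;
        for (int start = 0; start < sents.size(); start++) {
            StringBuilder window = new StringBuilder();
            for (int end = start; end < sents.size() && end - start < maxSentences; end++) {
                if (end > start) window.append(' ');
                window.append(sents.get(end));
                double score = cosine(q, embed.apply(window.toString()));
                if (score > bestScore) {
                    bestScore = score;
                    best = window.toString();
                }
            }
        }
        return best;
    }
}
```

A call such as `SemanticSpanSketch.semanticSpan(query, text, 3, SemanticSpanSketch::toyEmbed)` exercises the sketch end to end. Because the window is contiguous, the result is a single passage rather than scattered word matches, which is the contrast the first bullet draws with dictionary expansion.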

Test plan

  • :search compiles
  • JVM: dense search + semantic highlight work
  • GraalVM native image: denseReady=true, smart find returns hits, passages highlighted
  • App-side wiring lands via the Zayit submodule bump
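
The native-image item above depends on two things described in the summary: which Lucene directory gets opened, and whether a dense probe succeeds. The Java sketch below shows one way these could fit together. It is hypothetical: the class, the enum, and the probe signature are illustrative, and how VectorSearcher or LuceneSearchEngine actually detect native image is not shown in this PR. The sketch assumes the `org.graalvm.nativeimage.imagecode` system property, which GraalVM sets inside native images.

```java
import java.util.concurrent.Callable;

// Hypothetical sketch of directory selection plus a readiness probe.
final class DenseBackendSketch {

    /** Which Lucene directory implementation to open the index with. */
    enum DirectoryKind { MMAP, NIOFS }

    /** Assumption: detect native image via the GraalVM image-code property. */
    static boolean inNativeImage() {
        return System.getProperty("org.graalvm.nativeimage.imagecode") != null;
    }

    /**
     * NIOFS under native image, MMAP on a regular JVM. With Lucene on the
     * classpath this would map to `new NIOFSDirectory(path)` versus
     * `new MMapDirectory(path)`.
     */
    static DirectoryKind chooseDirectory(boolean nativeImage) {
        return nativeImage ? DirectoryKind.NIOFS : DirectoryKind.MMAP;
    }

    /**
     * Runs a caller-supplied probe (for example, opening the index and
     * issuing one small KNN query) and reports whether it completed.
     * LinkageError is caught on purpose: it is the failure named in the
     * PR description and is an Error, not an Exception.
     */
    static boolean denseReady(Callable<Object> probe) {
        try {
            probe.call();
            return true;
        } catch (Exception | LinkageError e) {
            return false;
        }
    }
}
```

Passing the native-image flag as a parameter keeps the choice testable on a regular JVM, where `inNativeImage()` returns false.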

- Replace dictionary-based smart highlight with embedding-based semantic
  passages: semanticSpan() picks the passage closest in meaning to the query;
  semanticFind() runs a book-scoped dense KNN for smart find-in-page
- Remove buildHighlightTerms (dictionary expansion) from the engine
- VectorSearcher: open the index with NIOFSDirectory under GraalVM native image
  (MMapDirectory's MemorySegmentIndexInputProvider can't load there), matching
  LuceneSearchEngine — fixes dense search + highlight being silently disabled
- fuse(): degrade to lexical-only (and log) instead of throwing if dense fails
- denseReady() to warm/probe the dense backend
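
The fallback behaviour of fuse() can be sketched as follows. This is a hypothetical Java illustration, not the engine's code: the PR does not say how the two result lists are combined, so reciprocal rank fusion (with the common constant 60) is used here purely as a stand-in, and the class name, signature, and string document ids are assumptions.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch: fuse lexical and dense rankings, degrading to
// lexical-only (with a log line) when the dense backend fails.
final class FuseFallbackSketch {
    private static final Logger LOG = Logger.getLogger(FuseFallbackSketch.class.getName());
    private static final int RRF_K = 60; // assumed smoothing constant

    static List<String> fuse(List<String> lexical, Callable<List<String>> dense) {
        List<String> denseHits;
        try {
            denseHits = dense.call();
        } catch (Exception | LinkageError e) {
            // Degrade instead of propagating the failure to the caller.
            LOG.log(Level.WARNING, "Dense backend failed; using lexical-only results", e);
            return new ArrayList<>(lexical);
        }
        // Reciprocal rank fusion: each list contributes 1 / (k + rank).
        Map<String, Double> scores = new LinkedHashMap<>();
        addRanks(scores, lexical);
        addRanks(scores, denseHits);
        List<String> ids = new ArrayList<>(scores.keySet());
        // Stable sort by descending fused score; ties keep insertion order.
        ids.sort((a, b) -> Double.compare(scores.get(b), scores.get(a)));
        return ids;
    }

    private static void addRanks(Map<String, Double> scores, List<String> ranked) {
        for (int i = 0; i < ranked.size(); i++) {
            scores.merge(ranked.get(i), 1.0 / (RRF_K + i + 1), Double::sum);
        }
    }
}
```

Taking the dense search as a `Callable` keeps the failure inside fuse(), so callers see a lexical-only result list rather than an exception.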
@kdroidFilter merged commit c16607d into master on Jun 26, 2026
1 check passed