
feat(rag): add hybrid search using RRF score fusion #492

Open
nancysangani wants to merge 1 commit into param20h:dev from nancysangani:feat/hybrid-search-rrf
@nancysangani
Contributor

🔗 Related Issue

Closes #440


📝 What does this PR do?

Replaces the fake RRF approximation in retriever.py with a correct
Reciprocal Rank Fusion implementation and removes the EnsembleRetriever
dependency.

backend/app/rag/retriever.py:

  • Adds rrf_merge(vector_results, bm25_results, k) — implements the standard
    RRF formula score(d) = Σ 1/(k + rank) across both ranked lists, deduplicates
    by content key, and returns chunks sorted by descending RRF score.
  • Removes EnsembleRetriever / CustomVectorRetriever / CustomBM25Retriever
    LangChain wrapper classes — query_chunks and query_bm25 are called directly,
    giving full control over each ranked list before fusion.
  • retrieve() now calls embed_query → query_chunks → query_bm25 →
    rrf_merge per query variant, then promotes rrf_score → score before
    passing candidates to the cross-encoder reranker. Existing reranking and
    confidence normalisation logic is unchanged.
  • Falls back to vector-only when USE_HYBRID_SEARCH=False or BM25 raises.
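
For reviewers, the fusion step described above can be sketched as follows. This is a minimal illustration, not the code in this PR: it assumes each chunk is a dict with a "content" field used as the dedup key, that ranks are 1-based, and that the fused score is stored under "rrf_score". The real chunk shape in retriever.py may differ.

```python
from typing import Any

Chunk = dict[str, Any]  # assumed chunk shape; the real one may differ


def rrf_merge(vector_results: list[Chunk], bm25_results: list[Chunk], k: int = 60) -> list[Chunk]:
    """Fuse two ranked lists with RRF: score(d) = sum of 1 / (k + rank)."""
    scores: dict[str, float] = {}
    first_seen: dict[str, Chunk] = {}
    for ranked in (vector_results, bm25_results):
        seen_in_list: set[str] = set()
        for rank, chunk in enumerate(ranked, start=1):  # 1-based ranks
            key = chunk["content"]  # assumed dedup key
            if key in seen_in_list:
                continue  # count a chunk at most once per list
            seen_in_list.add(key)
            scores[key] = scores.get(key, 0.0) + 1.0 / (k + rank)
            first_seen.setdefault(key, chunk)
    # Attach the fused score and sort by descending RRF score.
    merged = [{**first_seen[key], "rrf_score": score} for key, score in scores.items()]
    merged.sort(key=lambda c: c["rrf_score"], reverse=True)
    return merged


vector_hits = [{"content": "alpha"}, {"content": "beta"}]
bm25_hits = [{"content": "beta"}, {"content": "gamma"}]
fused = rrf_merge(vector_hits, bm25_hits, k=60)
print([c["content"] for c in fused])  # ['beta', 'alpha', 'gamma']
```

"beta" ranks first because it appears in both lists and so collects 1/62 + 1/61, while "alpha" and "gamma" each collect a single term.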

backend/app/config.py:

  • Adds USE_HYBRID_SEARCH: bool = True — toggle hybrid search without
    redeploying.
  • Adds RRF_K: int = 60 — exposes the RRF smoothing constant; 60 is the
    value from the original RRF paper and the standard production default.
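
One way the two settings could be read from the environment is sketched below. It is an assumption-laden illustration: the Settings dataclass and load_settings helper are hypothetical, and the project's actual config.py may use a different settings mechanism entirely. Only the field names and defaults come from this PR.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    # Field names and defaults as described in this PR.
    USE_HYBRID_SEARCH: bool = True
    RRF_K: int = 60


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Hypothetical loader: read overrides from env vars, else use defaults."""
    env = os.environ if env is None else env
    raw_flag = env.get("USE_HYBRID_SEARCH", "true").strip().lower()
    return Settings(
        USE_HYBRID_SEARCH=raw_flag in {"1", "true", "yes", "on"},
        RRF_K=int(env.get("RRF_K", "60")),
    )


print(load_settings({"USE_HYBRID_SEARCH": "False", "RRF_K": "10"}))
```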

🗂️ Type of Change

  • ✨ New feature
  • 🔧 Refactor / code cleanup

🧪 How was this tested?

  • Ran the backend locally (uvicorn app.main:app --reload)
  • Queried a multi-document collection; confirmed RRF scores present on
    returned chunks and that chunks appearing in both lists score higher
    than single-list results
  • Set USE_HYBRID_SEARCH=False; confirmed vector-only path runs and
    query_bm25 is never called
  • Removed rank_bm25 from env; confirmed graceful fallback to
    vector-only via the except guard
  • Confirmed reranker and confidence normalisation are unaffected
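
The two fallback paths exercised above can be sketched roughly as below. The function name hybrid_candidates and its callable parameters are hypothetical stand-ins for embed_query / query_chunks / query_bm25 / rrf_merge, chosen so the sketch runs on its own. The broad except mirrors the "BM25 raises" behavior described in this PR, though the actual guard may be narrower.

```python
from typing import Any, Callable

Chunk = dict[str, Any]  # assumed chunk shape


def hybrid_candidates(
    query: str,
    vector_search: Callable[[str], list[Chunk]],
    bm25_search: Callable[[str], list[Chunk]],
    merge: Callable[[list[Chunk], list[Chunk], int], list[Chunk]],
    use_hybrid: bool = True,
    k: int = 60,
) -> list[Chunk]:
    """Return fused candidates, or vector-only results when hybrid is off or BM25 fails."""
    vector_results = vector_search(query)
    if not use_hybrid:
        return vector_results  # BM25 is never called on this path
    try:
        bm25_results = bm25_search(query)
    except Exception:
        # e.g. rank_bm25 missing from the environment: degrade to vector-only
        return vector_results
    return merge(vector_results, bm25_results, k)


def broken_bm25(query: str) -> list[Chunk]:
    raise ImportError("rank_bm25 is not installed")


result = hybrid_candidates(
    "what is RRF?",
    vector_search=lambda q: [{"content": "vector hit"}],
    bm25_search=broken_bm25,
    merge=lambda v, b, k: v + b,  # placeholder merge for the demo
)
print(result)  # [{'content': 'vector hit'}]
```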

✅ Self-Review Checklist

  • My branch is based on dev, not main
  • I have not added any secrets / API keys
  • I have not modified main branch or any HuggingFace deployment config
  • My code follows the existing style (no unnecessary formatting changes)
  • I have updated relevant docs / comments if needed

@nancysangani nancysangani requested a review from param20h as a code owner June 6, 2026 06:40
@nancysangani
Contributor Author

Hi @param20h, I have opened this PR to address issue #440. Please review it when you get a chance. Thanks!


Development

Successfully merging this pull request may close these issues.

feat(rag): Add hybrid search merging vector and BM25 scores via RRF
