Version
26.5.0
Which installation method(s) does this occur on?
No response
Describe the bug.
The retriever query currently uses vLLM for query embedding, which has much higher latency than HuggingFace.
Minimum reproducible example
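No reproducer was attached. A minimal sketch of how the per-query latency gap could be measured is below. The two `embed_*` functions are hypothetical placeholders, not the retriever's actual API; they would need to be replaced with real calls into the vLLM-backed and HuggingFace-backed query embedders.

```python
import statistics
import time
from typing import Callable, Dict, List, Sequence


def measure_latency(
    embed: Callable[[str], Sequence[float]],
    queries: Sequence[str],
    warmup: int = 1,
) -> Dict[str, float]:
    """Time one embedding call per query and return latency stats in ms."""
    # Warm-up calls are excluded so model loading does not skew the numbers.
    for q in queries[:warmup]:
        embed(q)

    samples: List[float] = []
    for q in queries:
        start = time.perf_counter()
        embed(q)
        samples.append((time.perf_counter() - start) * 1000.0)

    return {
        "n": len(samples),
        "mean_ms": statistics.mean(samples),
        "median_ms": statistics.median(samples),
        "max_ms": max(samples),
    }


# Hypothetical placeholder: swap in the vLLM-backed query embedder here.
def embed_with_vllm(query: str) -> List[float]:
    return [float(len(query))]


# Hypothetical placeholder: swap in the HuggingFace-backed query embedder here.
def embed_with_huggingface(query: str) -> List[float]:
    return [float(len(query))]


if __name__ == "__main__":
    queries = ["what is a retriever?", "embedding latency", "short query"]
    for name, fn in [
        ("vllm", embed_with_vllm),
        ("huggingface", embed_with_huggingface),
    ]:
        print(name, measure_latency(fn, queries))
```

Timing single queries one at a time, rather than a batch, matches the query-time path described above, where one query is embedded per request.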
Relevant log output
Other/Misc.
No response