feat: add get_embedding_dims util to get the embedder model dimensions#140
Conversation
|
Important Review skippedAuto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the WalkthroughModel selection precedence is reversed in the LLM client to prefer the client's default model when available. Additionally, a new helper function dynamically determines embedding dimensions by querying the embedding endpoint, replacing hardcoded values with fallback logic. Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes The first change is a straightforward precedence flip with minimal logic impact. The second introduces a new helper function with endpoint-based querying and fallback handling, but follows a consistent refactoring pattern across both code paths with standard error handling. Poem
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 2
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
backends/advanced/src/advanced_omi_backend/llm_client.py(1 hunks)backends/advanced/src/advanced_omi_backend/memory/config.py(3 hunks)
🧰 Additional context used
🪛 Ruff (0.14.1)
backends/advanced/src/advanced_omi_backend/memory/config.py
263-263: Do not catch blind exception: Exception
(BLE001)
265-265: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
🔇 Additional comments (2)
backends/advanced/src/advanced_omi_backend/memory/config.py (2)
181-182: LGTM! Dynamic dimension detection improves robustness.Replacing hardcoded embedding dimensions with dynamic detection via
get_embedding_dimsis a solid improvement that handles custom models gracefully.
203-204: LGTM! Consistent with OpenAI implementation.The Ollama path correctly uses the same dynamic dimension detection approach.
| """Generate text completion using OpenAI-compatible API.""" | ||
| try: | ||
| model_name = model or self.model | ||
| model_name = self.model or model |
There was a problem hiding this comment.
🧩 Analysis chain
Verify model precedence reversal is intentional and won't break callers.
The change from model or self.model to self.model or model reverses the selection priority, making the instance's default model always take precedence over the method parameter. This means:
Before: Callers could override the default model on a per-call basis by passing a model argument.
After: If self.model is set, the model parameter is effectively ignored.
This is counterintuitive API design—typically, method parameters should override instance defaults, not vice versa. This change eliminates the ability to use different models for specific calls when the client has a default configured.
Run the following script to check if any callers rely on overriding the model per-call:
If callers exist that pass an explicit model argument, this change will silently break their expectation. Consider reverting to model or self.model to preserve the standard override behavior.
🏁 Script executed:
#!/bin/bash
# Description: Find all calls to .generate() with an explicit model argument
# Search for calls to generate() with model parameter
rg -nP --type=py -C3 '\.generate\s*\([^)]*model\s*=' -g '!**/llm_client.py'Length of output: 0
🏁 Script executed:
#!/bin/bash
# Get context around line 80 to see function definition and docstring
cat backends/advanced/src/advanced_omi_backend/llm_client.py | sed -n '50,120p'Length of output: 3071
🏁 Script executed:
#!/bin/bash
# Search for ALL calls to .generate() to see how it's being invoked
rg -nP --type=py '\.generate\s*\(' -C2 -g '!**/llm_client.py' | head -100Length of output: 1115
🏁 Script executed:
#!/bin/bash
# Search test files specifically for generate() calls
fd -e py -path '*/test*' -o -path '*/*_test.py' | xargs rg -l 'generate' 2>/dev/null | head -20Length of output: 959
🏁 Script executed:
#!/bin/bash
# Look at the function signature and docstring for generate()
ast-grep --pattern 'function generate($_) { $$$ }'Length of output: 0
🏁 Script executed:
#!/bin/bash
# Get context around the caller in chat_service.py
cat backends/advanced/src/advanced_omi_backend/chat_service.py | sed -n '375,395p'Length of output: 814
🏁 Script executed:
#!/bin/bash
# Find where CHAT_LLM_MODEL is defined
rg -n 'CHAT_LLM_MODEL\s*=' --type=pyLength of output: 150
🏁 Script executed:
#!/bin/bash
# Search for test files that test the generate method
find . -name '*test*.py' -type f | xargs rg -l 'generate' 2>/dev/null | grep -E '(test_|_test\.py)' | head -10Length of output: 45
🏁 Script executed:
#!/bin/bash
# Look for any tests in the advanced backend tests directory
cat backends/advanced/tests/test_integration.py 2>/dev/null | grep -A20 -B5 'generate' | head -50Length of output: 1464
Revert precedence to model or self.model to preserve parameter override capability.
The current code at line 80 (self.model or model) breaks an existing caller in chat_service.py:384-386 that explicitly passes model=CHAT_LLM_MODEL. With the instance default always taking precedence, this parameter is silently ignored.
The caller passes:
response_content = self.llm_client.generate(
prompt=full_prompt,
model=CHAT_LLM_MODEL, # ← ignored if self.model is set
temperature=CHAT_TEMPERATURE
)Since self.model is required during initialization (enforced by validation in __init__), this override never works. Standard API design dictates that method parameters should override instance defaults. Change line 80 to:
model_name = model or self.model🤖 Prompt for AI Agents
In backends/advanced/src/advanced_omi_backend/llm_client.py around line 80, the
assignment currently uses self.model before the method argument which causes
callers passing model=... to be ignored; revert the precedence so the method
parameter overrides the instance default by using the method argument first
(i.e., set model_name to the passed-in model if provided, otherwise fall back to
self.model).
There was a problem hiding this comment.
Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!
There was a problem hiding this comment.
@AnkushMalaker I have reverted this change and fixed the root of this issue. @coderabbitai confirm.
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check the current state of the file on the PR branch
echo "=== Current state of line 80 in llm_client.py ==="
cat backends/advanced/src/advanced_omi_backend/llm_client.py | sed -n '75,85p'
echo -e "\n=== Recent commits on the PR branch ==="
git log --oneline --decorate -10
echo -e "\n=== Check if there are any changes to the file since the last commit ==="
git diff HEAD backends/advanced/src/advanced_omi_backend/llm_client.py
echo -e "\n=== Show the full history of line 80 changes ==="
git log -p --follow -S "model_name = " -- backends/advanced/src/advanced_omi_backend/llm_client.py | head -100
Length of output: 1386
🏁 Script executed:
#!/bin/bash
# Check if the model parameter has been removed from generate() calls in chat_service.py
echo "=== Checking generate() calls in chat_service.py ==="
rg -nP --type=py -C5 '\.generate\s*\(' backends/advanced/src/advanced_omi_backend/chat_service.py
echo -e "\n=== Check CHAT_LLM_MODEL usage ==="
rg -n 'CHAT_LLM_MODEL' backends/advanced/src/advanced_omi_backend/chat_service.py
Length of output: 639
@RKRitik Confirmed! ✅
I've verified both changes:
-
Revert confirmed: Line 80 in
llm_client.pynow correctly showsmodel_name = model or self.model, restoring the proper parameter precedence. -
Root fix confirmed: The
generate()call inchat_service.py(lines 383-386) no longer passes themodelparameter—onlypromptandtemperatureare provided. The model is now configured once at client initialization, which is a cleaner design.
This approach eliminates the ambiguity about parameter precedence and follows the principle of configuring the client once rather than on every call. Great fix! 🎉
🧬 Code Graph Analysis Results
[backends/advanced/src/advanced_omi_backend/routers/modules/health_routes.py:84-443]
Summary: Comprehensive health check aggregator for multiple services including LLM integration.
- Purpose: Expose an endpoint that aggregates health across MongoDB, Redis/RQ, memory service, transcription, LLM (audioai), speaker recognition, and memory providers.
- Parameters: None (async function health_check with internal environment/config lookups).
- Returns: JSON response containing:
- status (healthy/degraded/critical)
- timestamp
- services: per-service status objects (mongodb, redis, memory_service, audioai, transcription, speaker_recognition, openmemory_mcp)
- config: runtime configuration (db URIs, llm provider/model/base_url, memory provider, etc.)
- overall_healthy, critical_services_healthy, message
- Error handling: Captures per-service exceptions/timeouts, marks corresponding service as unhealthy and adjusts overall flags; returns a structured JSON with error details for each failed check.
- Important details:
- Checks MongoDB ping with 5s timeout.
- Checks Redis via rq Worker and redis ping, with 5s timeout.
- Checks LLM service via a separate async_health_check, derives llm health and embedder status, interprets Ollama vs other providers differently.
- Optional/non-critical services (speech_to_text, speaker_recognition, openmemory_mcp) are checked if configured.
- Assembles an overall health state and optional human-readable messages.
[backends/advanced/src/advanced_omi_backend/speaker_recognition_client.py:588-625]
Summary: Health check for the speaker recognition service.
- Purpose: Determine if the speaker recognition service is healthy and responsive.
- Parameters: self (implicitly) with internal service_url and enabled flag.
- Returns: Boolean indicating health (True if healthy, False otherwise).
- Error handling: Returns False on any exception; logs debug/error details.
- Important details:
- Skips health check if service is not enabled.
- Tries multiple endpoints (/health, /speakers) to assess health.
- Iterates endpoints, handling timeouts and client errors individually; returns True on first healthy response.
[backends/advanced/src/advanced_omi_backend/client_manager.py:107-141]
Summary: Atomic client creation and registration.
- Purpose: Create a new client state and register it atomically to avoid race conditions.
- Parameters:
- client_id: Unique client identifier
- ac_repository: Audio chunks repository
- chunk_dir: Directory for audio chunks
- user_id: User ID who owns the client
- user_email: Optional user email
- Returns: Created ClientState object
- Exceptions: ValueError if client_id already exists
- Error handling: Validates uniqueness before creation; performs import to avoid circular imports; logs creation success.
- Important details:
- Creates ClientState with provided parameters.
- Atomically inserts into internal storage and registers user mapping via register_client_user_mapping.
- Logs success with client_id and user_id.
If you found this review helpful, would you consider giving us a shout-out on X?
Thank you for using CodeRabbit!
| def get_embedding_dims(llm_config: Dict[str, Any]) -> int: | ||
| """ | ||
| Query the embedding endpoint and return the embedding vector length. | ||
| Works for OpenAI and OpenAI-compatible endpoints (e.g., Ollama). | ||
| """ | ||
| embedding_model=llm_config['embedding_model'] | ||
| try: | ||
| import langfuse.openai as openai | ||
| client = openai.OpenAI(api_key=llm_config.get('api_key'), base_url=llm_config['base_url']) | ||
| response = client.embeddings.create( | ||
| model=embedding_model, | ||
| input="hello world" | ||
| ) | ||
| embedding = response.data[0].embedding | ||
| if not embedding or not isinstance(embedding, list): | ||
| return 1536 | ||
| return len(embedding) | ||
|
|
||
| except Exception as e: | ||
| embedding_dims = 1536 # default | ||
| memory_logger.error(f"Failed to get embedding dimensions: {e}") | ||
| if embedding_model == "text-embedding-3-small": | ||
| embedding_dims = 1536 | ||
| elif embedding_model == "text-embedding-3-large": | ||
| embedding_dims = 3072 | ||
| elif embedding_model == "text-embedding-ada-002": | ||
| embedding_dims = 1536 | ||
| elif embedding_model == "nomic-embed-text:latest": | ||
| embedding_dims = 768 | ||
| else: | ||
| # Default for OpenAI embedding models | ||
| memory_logger.info(f"Embedding model is unrecognized setting default value {e}") | ||
| embedding_dims = 1536 | ||
| return embedding_dims No newline at end of file |
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major
Refine exception handling and improve code quality.
The dynamic dimension detection is a good approach, but the implementation has several issues:
- Line 250: Missing space after
=in variable assignment. - Line 253: Inconsistent dictionary access—use
.get()for both keys or direct access for both. - Lines 263, 265: Catching broad
Exceptionmasks specific failures. Uselogging.exception()instead oflogging.error()to preserve stack traces. - Line 276: Log message is poorly formatted:
"Embedding model is unrecognized setting default value {e}".
Apply this diff to address these issues:
def get_embedding_dims(llm_config: Dict[str, Any]) -> int:
"""
Query the embedding endpoint and return the embedding vector length.
Works for OpenAI and OpenAI-compatible endpoints (e.g., Ollama).
"""
- embedding_model=llm_config['embedding_model']
+ embedding_model = llm_config['embedding_model']
try:
import langfuse.openai as openai
- client = openai.OpenAI(api_key=llm_config.get('api_key'), base_url=llm_config['base_url'])
+ client = openai.OpenAI(
+ api_key=llm_config.get('api_key'),
+ base_url=llm_config.get('base_url')
+ )
response = client.embeddings.create(
model=embedding_model,
input="hello world"
)
embedding = response.data[0].embedding
if not embedding or not isinstance(embedding, list):
return 1536
return len(embedding)
- except Exception as e:
+ except (ImportError, KeyError, AttributeError, IndexError) as e:
embedding_dims = 1536 # default
- memory_logger.error(f"Failed to get embedding dimensions: {e}")
+ memory_logger.exception(f"Failed to get embedding dimensions for model '{embedding_model}'")
if embedding_model == "text-embedding-3-small":
embedding_dims = 1536
elif embedding_model == "text-embedding-3-large":
embedding_dims = 3072
elif embedding_model == "text-embedding-ada-002":
embedding_dims = 1536
elif embedding_model == "nomic-embed-text:latest":
embedding_dims = 768
else:
# Default for OpenAI embedding models
- memory_logger.info(f"Embedding model is unrecognized setting default value {e}")
+ memory_logger.info(f"Unrecognized embedding model '{embedding_model}', using default dimension {embedding_dims}")
embedding_dims = 1536
return embedding_dimsRationale:
- Catching specific exceptions makes debugging easier and prevents masking unexpected errors.
logging.exception()automatically includes the stack trace.- Consistent dictionary access improves readability.
- Better log messages aid troubleshooting.
As per static analysis hints.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def get_embedding_dims(llm_config: Dict[str, Any]) -> int: | |
| """ | |
| Query the embedding endpoint and return the embedding vector length. | |
| Works for OpenAI and OpenAI-compatible endpoints (e.g., Ollama). | |
| """ | |
| embedding_model=llm_config['embedding_model'] | |
| try: | |
| import langfuse.openai as openai | |
| client = openai.OpenAI(api_key=llm_config.get('api_key'), base_url=llm_config['base_url']) | |
| response = client.embeddings.create( | |
| model=embedding_model, | |
| input="hello world" | |
| ) | |
| embedding = response.data[0].embedding | |
| if not embedding or not isinstance(embedding, list): | |
| return 1536 | |
| return len(embedding) | |
| except Exception as e: | |
| embedding_dims = 1536 # default | |
| memory_logger.error(f"Failed to get embedding dimensions: {e}") | |
| if embedding_model == "text-embedding-3-small": | |
| embedding_dims = 1536 | |
| elif embedding_model == "text-embedding-3-large": | |
| embedding_dims = 3072 | |
| elif embedding_model == "text-embedding-ada-002": | |
| embedding_dims = 1536 | |
| elif embedding_model == "nomic-embed-text:latest": | |
| embedding_dims = 768 | |
| else: | |
| # Default for OpenAI embedding models | |
| memory_logger.info(f"Embedding model is unrecognized setting default value {e}") | |
| embedding_dims = 1536 | |
| return embedding_dims | |
| def get_embedding_dims(llm_config: Dict[str, Any]) -> int: | |
| """ | |
| Query the embedding endpoint and return the embedding vector length. | |
| Works for OpenAI and OpenAI-compatible endpoints (e.g., Ollama). | |
| """ | |
| embedding_model = llm_config['embedding_model'] | |
| try: | |
| import langfuse.openai as openai | |
| client = openai.OpenAI( | |
| api_key=llm_config.get('api_key'), | |
| base_url=llm_config.get('base_url') | |
| ) | |
| response = client.embeddings.create( | |
| model=embedding_model, | |
| input="hello world" | |
| ) | |
| embedding = response.data[0].embedding | |
| if not embedding or not isinstance(embedding, list): | |
| return 1536 | |
| return len(embedding) | |
| except (ImportError, KeyError, AttributeError, IndexError) as e: | |
| embedding_dims = 1536 # default | |
| memory_logger.exception(f"Failed to get embedding dimensions for model '{embedding_model}'") | |
| if embedding_model == "text-embedding-3-small": | |
| embedding_dims = 1536 | |
| elif embedding_model == "text-embedding-3-large": | |
| embedding_dims = 3072 | |
| elif embedding_model == "text-embedding-ada-002": | |
| embedding_dims = 1536 | |
| elif embedding_model == "nomic-embed-text:latest": | |
| embedding_dims = 768 | |
| else: | |
| # Default for OpenAI embedding models | |
| memory_logger.info(f"Unrecognized embedding model '{embedding_model}', using default dimension {embedding_dims}") | |
| embedding_dims = 1536 | |
| return embedding_dims |
🧰 Tools
🪛 Ruff (0.14.1)
263-263: Do not catch blind exception: Exception
(BLE001)
265-265: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
There was a problem hiding this comment.
@RKRitik, excellent work! 🎉 All the suggested changes have been successfully applied:
✅ Fixed spacing: embedding_model = llm_config.get('embedding_model')
✅ Consistent dictionary access: Both api_key and base_url now use .get() method
✅ Specific exception handling: Changed from broad Exception to specific exceptions - you even added TypeError and ValueError which is great!
✅ Better logging: Now uses memory_logger.exception() which includes stack traces
✅ Improved log message: Clear, informative message for unrecognized models
The code quality improvements are all in place and the implementation looks solid. The function will now provide better debugging information and handle errors more gracefully.
(\(\
( -.-)
o_(")(")
Looks great!
If you found this review helpful, would you consider giving us a shout-out on X?
Thank you for using CodeRabbit!
| import langfuse.openai as openai | ||
| client = openai.OpenAI(api_key=llm_config.get('api_key'), base_url=llm_config['base_url']) |
There was a problem hiding this comment.
import langfuse.openai as openai
client = openai.OpenAI(api_key=llm_config.get('api_key'), base_url=llm_config['base_url'])
Why are we importing from langfuse for this?
|
LGTM! |
This PR adds a utility function (get_embedding_dims) that queries the embedding endpoint of the configured LLM provider (OpenAI or Ollama) to dynamically determine the embedding vector size for the selected embedder model.
This ensures that the vector database is configured with the correct dimensions, improving compatibility and reducing errors when using custom or non-default embedding models.
Fallbacks and model-specific mappings are included for robustness.
Summary by CodeRabbit