Skip to content

feat: add get_embedding_dims util to get the embedder model dimensions#140

Merged
AnkushMalaker merged 3 commits into
SimpleOpenSoftware:mainfrom
RKRitik:fix/ollama-embedding-dims
Oct 24, 2025
Merged

feat: add get_embedding_dims util to get the embedder model dimensions#140
AnkushMalaker merged 3 commits into
SimpleOpenSoftware:mainfrom
RKRitik:fix/ollama-embedding-dims

Conversation

@RKRitik

@RKRitik RKRitik commented Oct 24, 2025

Copy link
Copy Markdown
Contributor

This PR adds a utility function (get_embedding_dims) that queries the embedding endpoint of the configured LLM provider (OpenAI or Ollama) to dynamically determine the embedding vector size for the selected embedder model.
This ensures that the vector database is configured with the correct dimensions, improving compatibility and reducing errors when using custom or non-default embedding models.
Fallbacks and model-specific mappings are included for robustness.

Summary by CodeRabbit

  • Enhancements
    • Improved model selection to better prioritize default settings
    • Enhanced embedding configuration detection with more robust fallback support

@coderabbitai

coderabbitai Bot commented Oct 24, 2025

Copy link
Copy Markdown
Contributor

Important

Review skipped

Auto reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

Model selection precedence is reversed in the LLM client to prefer the client's default model when available. Additionally, a new helper function dynamically determines embedding dimensions by querying the embedding endpoint, replacing hardcoded values with fallback logic.

Changes

Cohort / File(s) Summary
Model Name Precedence Swap
backends/advanced/src/advanced_omi_backend/llm_client.py
Changed model selection logic from model or self.model to self.model or model to prioritize the client's default model over the per-call argument.
Embedding Dimension Determination
backends/advanced/src/advanced_omi_backend/memory/config.py
Introduced get_embedding_dims() helper function to query embedding endpoints dynamically; replaced hardcoded embedding dimension logic for OpenAI and Ollama with calls to this function; added logging and error handling with fallback defaults.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

The first change is a straightforward precedence flip with minimal logic impact. The second introduces a new helper function with endpoint-based querying and fallback handling, but follows a consistent refactoring pattern across both code paths with standard error handling.

Poem

🐰 The models now bow to their defaults so true,
While embeddings bloom from endpoints anew,
With fallbacks in place when queries go shy,
Our rabbit hops forth with a satisfied sigh! ✨

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The pull request title "feat: add get_embedding_dims util to get the embedder model dimensions" clearly and specifically relates to the primary change in the changeset. The main substantive change involves introducing a new helper function get_embedding_dims and refactoring the embedding dimension logic to use this utility across multiple paths (OpenAI and Ollama). While the PR also includes a secondary change to model precedence in llm_client.py, the title accurately captures the core feature being added. The title is concise, follows conventional commit format, and would allow teammates scanning history to quickly understand that this PR introduces a new embedding dimension utility.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@RKRitik

RKRitik commented Oct 24, 2025

Copy link
Copy Markdown
Contributor Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Oct 24, 2025

Copy link
Copy Markdown
Contributor
✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 059dfd0 and 71e6089.

📒 Files selected for processing (2)
  • backends/advanced/src/advanced_omi_backend/llm_client.py (1 hunks)
  • backends/advanced/src/advanced_omi_backend/memory/config.py (3 hunks)
🧰 Additional context used
🪛 Ruff (0.14.1)
backends/advanced/src/advanced_omi_backend/memory/config.py

263-263: Do not catch blind exception: Exception

(BLE001)


265-265: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

🔇 Additional comments (2)
backends/advanced/src/advanced_omi_backend/memory/config.py (2)

181-182: LGTM! Dynamic dimension detection improves robustness.

Replacing hardcoded embedding dimensions with dynamic detection via get_embedding_dims is a solid improvement that handles custom models gracefully.


203-204: LGTM! Consistent with OpenAI implementation.

The Ollama path correctly uses the same dynamic dimension detection approach.

"""Generate text completion using OpenAI-compatible API."""
try:
model_name = model or self.model
model_name = self.model or model

@coderabbitai coderabbitai Bot Oct 24, 2025

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Verify model precedence reversal is intentional and won't break callers.

The change from model or self.model to self.model or model reverses the selection priority, making the instance's default model always take precedence over the method parameter. This means:

Before: Callers could override the default model on a per-call basis by passing a model argument.
After: If self.model is set, the model parameter is effectively ignored.

This is counterintuitive API design—typically, method parameters should override instance defaults, not vice versa. This change eliminates the ability to use different models for specific calls when the client has a default configured.

Run the following script to check if any callers rely on overriding the model per-call:

If callers exist that pass an explicit model argument, this change will silently break their expectation. Consider reverting to model or self.model to preserve the standard override behavior.


🏁 Script executed:

#!/bin/bash
# Description: Find all calls to .generate() with an explicit model argument

# Search for calls to generate() with model parameter
rg -nP --type=py -C3 '\.generate\s*\([^)]*model\s*=' -g '!**/llm_client.py'

Length of output: 0


🏁 Script executed:

#!/bin/bash
# Get context around line 80 to see function definition and docstring

cat backends/advanced/src/advanced_omi_backend/llm_client.py | sed -n '50,120p'

Length of output: 3071


🏁 Script executed:

#!/bin/bash
# Search for ALL calls to .generate() to see how it's being invoked

rg -nP --type=py '\.generate\s*\(' -C2 -g '!**/llm_client.py' | head -100

Length of output: 1115


🏁 Script executed:

#!/bin/bash
# Search test files specifically for generate() calls

fd -e py -path '*/test*' -o -path '*/*_test.py' | xargs rg -l 'generate' 2>/dev/null | head -20

Length of output: 959


🏁 Script executed:

#!/bin/bash
# Look at the function signature and docstring for generate()

ast-grep --pattern 'function generate($_) { $$$ }'

Length of output: 0


🏁 Script executed:

#!/bin/bash
# Get context around the caller in chat_service.py

cat backends/advanced/src/advanced_omi_backend/chat_service.py | sed -n '375,395p'

Length of output: 814


🏁 Script executed:

#!/bin/bash
# Find where CHAT_LLM_MODEL is defined

rg -n 'CHAT_LLM_MODEL\s*=' --type=py

Length of output: 150


🏁 Script executed:

#!/bin/bash
# Search for test files that test the generate method

find . -name '*test*.py' -type f | xargs rg -l 'generate' 2>/dev/null | grep -E '(test_|_test\.py)' | head -10

Length of output: 45


🏁 Script executed:

#!/bin/bash
# Look for any tests in the advanced backend tests directory

cat backends/advanced/tests/test_integration.py 2>/dev/null | grep -A20 -B5 'generate' | head -50

Length of output: 1464


Revert precedence to model or self.model to preserve parameter override capability.

The current code at line 80 (self.model or model) breaks an existing caller in chat_service.py:384-386 that explicitly passes model=CHAT_LLM_MODEL. With the instance default always taking precedence, this parameter is silently ignored.

The caller passes:

response_content = self.llm_client.generate(
    prompt=full_prompt,
    model=CHAT_LLM_MODEL,  # ← ignored if self.model is set
    temperature=CHAT_TEMPERATURE
)

Since self.model is required during initialization (enforced by validation in __init__), this override never works. Standard API design dictates that method parameters should override instance defaults. Change line 80 to:

model_name = model or self.model
🤖 Prompt for AI Agents
In backends/advanced/src/advanced_omi_backend/llm_client.py around line 80, the
assignment currently uses self.model before the method argument which causes
callers passing model=... to be ignored; revert the precedence so the method
parameter overrides the instance default by using the method argument first
(i.e., set model_name to the passed-in model if provided, otherwise fall back to
self.model).

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@RKRitik bruh is coderabbitai correct here?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AnkushMalaker I have reverted this change and fixed the root of this issue. @coderabbitai confirm.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check the current state of the file on the PR branch
echo "=== Current state of line 80 in llm_client.py ==="
cat backends/advanced/src/advanced_omi_backend/llm_client.py | sed -n '75,85p'

echo -e "\n=== Recent commits on the PR branch ==="
git log --oneline --decorate -10

echo -e "\n=== Check if there are any changes to the file since the last commit ==="
git diff HEAD backends/advanced/src/advanced_omi_backend/llm_client.py

echo -e "\n=== Show the full history of line 80 changes ==="
git log -p --follow -S "model_name = " -- backends/advanced/src/advanced_omi_backend/llm_client.py | head -100

Length of output: 1386


🏁 Script executed:

#!/bin/bash
# Check if the model parameter has been removed from generate() calls in chat_service.py
echo "=== Checking generate() calls in chat_service.py ==="
rg -nP --type=py -C5 '\.generate\s*\(' backends/advanced/src/advanced_omi_backend/chat_service.py

echo -e "\n=== Check CHAT_LLM_MODEL usage ==="
rg -n 'CHAT_LLM_MODEL' backends/advanced/src/advanced_omi_backend/chat_service.py

Length of output: 639


@RKRitik Confirmed! ✅

I've verified both changes:

  1. Revert confirmed: Line 80 in llm_client.py now correctly shows model_name = model or self.model, restoring the proper parameter precedence.

  2. Root fix confirmed: The generate() call in chat_service.py (lines 383-386) no longer passes the model parameter—only prompt and temperature are provided. The model is now configured once at client initialization, which is a cleaner design.

This approach eliminates the ambiguity about parameter precedence and follows the principle of configuring the client once rather than on every call. Great fix! 🎉


🧬 Code Graph Analysis Results

[backends/advanced/src/advanced_omi_backend/routers/modules/health_routes.py:84-443]
Summary: Comprehensive health check aggregator for multiple services including LLM integration.

  • Purpose: Expose an endpoint that aggregates health across MongoDB, Redis/RQ, memory service, transcription, LLM (audioai), speaker recognition, and memory providers.
  • Parameters: None (async function health_check with internal environment/config lookups).
  • Returns: JSON response containing:
    • status (healthy/degraded/critical)
    • timestamp
    • services: per-service status objects (mongodb, redis, memory_service, audioai, transcription, speaker_recognition, openmemory_mcp)
    • config: runtime configuration (db URIs, llm provider/model/base_url, memory provider, etc.)
    • overall_healthy, critical_services_healthy, message
  • Error handling: Captures per-service exceptions/timeouts, marks corresponding service as unhealthy and adjusts overall flags; returns a structured JSON with error details for each failed check.
  • Important details:
    • Checks MongoDB ping with 5s timeout.
    • Checks Redis via rq Worker and redis ping, with 5s timeout.
    • Checks LLM service via a separate async_health_check, derives llm health and embedder status, interprets Ollama vs other providers differently.
    • Optional/non-critical services (speech_to_text, speaker_recognition, openmemory_mcp) are checked if configured.
    • Assembles an overall health state and optional human-readable messages.

[backends/advanced/src/advanced_omi_backend/speaker_recognition_client.py:588-625]
Summary: Health check for the speaker recognition service.

  • Purpose: Determine if the speaker recognition service is healthy and responsive.
  • Parameters: self (implicitly) with internal service_url and enabled flag.
  • Returns: Boolean indicating health (True if healthy, False otherwise).
  • Error handling: Returns False on any exception; logs debug/error details.
  • Important details:
    • Skips health check if service is not enabled.
    • Tries multiple endpoints (/health, /speakers) to assess health.
    • Iterates endpoints, handling timeouts and client errors individually; returns True on first healthy response.

[backends/advanced/src/advanced_omi_backend/client_manager.py:107-141]
Summary: Atomic client creation and registration.

  • Purpose: Create a new client state and register it atomically to avoid race conditions.
  • Parameters:
    • client_id: Unique client identifier
    • ac_repository: Audio chunks repository
    • chunk_dir: Directory for audio chunks
    • user_id: User ID who owns the client
    • user_email: Optional user email
  • Returns: Created ClientState object
  • Exceptions: ValueError if client_id already exists
  • Error handling: Validates uniqueness before creation; performs import to avoid circular imports; logs creation success.
  • Important details:
    • Creates ClientState with provided parameters.
    • Atomically inserts into internal storage and registers user mapping via register_client_user_mapping.
    • Logs success with client_id and user_id.

If you found this review helpful, would you consider giving us a shout-out on X?

Thank you for using CodeRabbit!

Comment on lines +245 to +278
def get_embedding_dims(llm_config: Dict[str, Any]) -> int:
"""
Query the embedding endpoint and return the embedding vector length.
Works for OpenAI and OpenAI-compatible endpoints (e.g., Ollama).
"""
embedding_model=llm_config['embedding_model']
try:
import langfuse.openai as openai
client = openai.OpenAI(api_key=llm_config.get('api_key'), base_url=llm_config['base_url'])
response = client.embeddings.create(
model=embedding_model,
input="hello world"
)
embedding = response.data[0].embedding
if not embedding or not isinstance(embedding, list):
return 1536
return len(embedding)

except Exception as e:
embedding_dims = 1536 # default
memory_logger.error(f"Failed to get embedding dimensions: {e}")
if embedding_model == "text-embedding-3-small":
embedding_dims = 1536
elif embedding_model == "text-embedding-3-large":
embedding_dims = 3072
elif embedding_model == "text-embedding-ada-002":
embedding_dims = 1536
elif embedding_model == "nomic-embed-text:latest":
embedding_dims = 768
else:
# Default for OpenAI embedding models
memory_logger.info(f"Embedding model is unrecognized setting default value {e}")
embedding_dims = 1536
return embedding_dims No newline at end of file

@coderabbitai coderabbitai Bot Oct 24, 2025

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major

Refine exception handling and improve code quality.

The dynamic dimension detection is a good approach, but the implementation has several issues:

  1. Line 250: Missing space after = in variable assignment.
  2. Line 253: Inconsistent dictionary access—use .get() for both keys or direct access for both.
  3. Lines 263, 265: Catching broad Exception masks specific failures. Use logging.exception() instead of logging.error() to preserve stack traces.
  4. Line 276: Log message is poorly formatted: "Embedding model is unrecognized setting default value {e}".

Apply this diff to address these issues:

 def get_embedding_dims(llm_config: Dict[str, Any]) -> int:
     """
     Query the embedding endpoint and return the embedding vector length.
     Works for OpenAI and OpenAI-compatible endpoints (e.g., Ollama).
     """
-    embedding_model=llm_config['embedding_model']
+    embedding_model = llm_config['embedding_model']
     try:
         import langfuse.openai as openai
-        client = openai.OpenAI(api_key=llm_config.get('api_key'), base_url=llm_config['base_url'])
+        client = openai.OpenAI(
+            api_key=llm_config.get('api_key'), 
+            base_url=llm_config.get('base_url')
+        )
         response = client.embeddings.create(
             model=embedding_model,
             input="hello world"
         )
         embedding = response.data[0].embedding
         if not embedding or not isinstance(embedding, list):
             return 1536
         return len(embedding)
 
-    except Exception as e:
+    except (ImportError, KeyError, AttributeError, IndexError) as e:
         embedding_dims = 1536 # default
-        memory_logger.error(f"Failed to get embedding dimensions: {e}")
+        memory_logger.exception(f"Failed to get embedding dimensions for model '{embedding_model}'")
         if embedding_model == "text-embedding-3-small":
             embedding_dims = 1536
         elif embedding_model == "text-embedding-3-large":
             embedding_dims = 3072
         elif embedding_model == "text-embedding-ada-002":
             embedding_dims = 1536
         elif embedding_model == "nomic-embed-text:latest":
             embedding_dims = 768
         else:
             # Default for OpenAI embedding models
-            memory_logger.info(f"Embedding model is unrecognized setting default value {e}")
+            memory_logger.info(f"Unrecognized embedding model '{embedding_model}', using default dimension {embedding_dims}")
             embedding_dims = 1536
         return embedding_dims

Rationale:

  • Catching specific exceptions makes debugging easier and prevents masking unexpected errors.
  • logging.exception() automatically includes the stack trace.
  • Consistent dictionary access improves readability.
  • Better log messages aid troubleshooting.

As per static analysis hints.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def get_embedding_dims(llm_config: Dict[str, Any]) -> int:
"""
Query the embedding endpoint and return the embedding vector length.
Works for OpenAI and OpenAI-compatible endpoints (e.g., Ollama).
"""
embedding_model=llm_config['embedding_model']
try:
import langfuse.openai as openai
client = openai.OpenAI(api_key=llm_config.get('api_key'), base_url=llm_config['base_url'])
response = client.embeddings.create(
model=embedding_model,
input="hello world"
)
embedding = response.data[0].embedding
if not embedding or not isinstance(embedding, list):
return 1536
return len(embedding)
except Exception as e:
embedding_dims = 1536 # default
memory_logger.error(f"Failed to get embedding dimensions: {e}")
if embedding_model == "text-embedding-3-small":
embedding_dims = 1536
elif embedding_model == "text-embedding-3-large":
embedding_dims = 3072
elif embedding_model == "text-embedding-ada-002":
embedding_dims = 1536
elif embedding_model == "nomic-embed-text:latest":
embedding_dims = 768
else:
# Default for OpenAI embedding models
memory_logger.info(f"Embedding model is unrecognized setting default value {e}")
embedding_dims = 1536
return embedding_dims
def get_embedding_dims(llm_config: Dict[str, Any]) -> int:
"""
Query the embedding endpoint and return the embedding vector length.
Works for OpenAI and OpenAI-compatible endpoints (e.g., Ollama).
"""
embedding_model = llm_config['embedding_model']
try:
import langfuse.openai as openai
client = openai.OpenAI(
api_key=llm_config.get('api_key'),
base_url=llm_config.get('base_url')
)
response = client.embeddings.create(
model=embedding_model,
input="hello world"
)
embedding = response.data[0].embedding
if not embedding or not isinstance(embedding, list):
return 1536
return len(embedding)
except (ImportError, KeyError, AttributeError, IndexError) as e:
embedding_dims = 1536 # default
memory_logger.exception(f"Failed to get embedding dimensions for model '{embedding_model}'")
if embedding_model == "text-embedding-3-small":
embedding_dims = 1536
elif embedding_model == "text-embedding-3-large":
embedding_dims = 3072
elif embedding_model == "text-embedding-ada-002":
embedding_dims = 1536
elif embedding_model == "nomic-embed-text:latest":
embedding_dims = 768
else:
# Default for OpenAI embedding models
memory_logger.info(f"Unrecognized embedding model '{embedding_model}', using default dimension {embedding_dims}")
embedding_dims = 1536
return embedding_dims
🧰 Tools
🪛 Ruff (0.14.1)

263-263: Do not catch blind exception: Exception

(BLE001)


265-265: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@coderabbitai Please verify the changes

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@RKRitik, excellent work! 🎉 All the suggested changes have been successfully applied:

Fixed spacing: embedding_model = llm_config.get('embedding_model')
Consistent dictionary access: Both api_key and base_url now use .get() method
Specific exception handling: Changed from broad Exception to specific exceptions - you even added TypeError and ValueError which is great!
Better logging: Now uses memory_logger.exception() which includes stack traces
Improved log message: Clear, informative message for unrecognized models

The code quality improvements are all in place and the implementation looks solid. The function will now provide better debugging information and handle errors more gracefully.

 (\(\
 ( -.-)
 o_(")(")
Looks great!

If you found this review helpful, would you consider giving us a shout-out on X?

Thank you for using CodeRabbit!

Comment on lines +252 to +253
import langfuse.openai as openai
client = openai.OpenAI(api_key=llm_config.get('api_key'), base_url=llm_config['base_url'])

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

        import langfuse.openai as openai
        client = openai.OpenAI(api_key=llm_config.get('api_key'), base_url=llm_config['base_url'])

Why are we importing from langfuse for this?

@AnkushMalaker

Copy link
Copy Markdown
Collaborator

LGTM!

@AnkushMalaker AnkushMalaker merged commit 8fe109d into SimpleOpenSoftware:main Oct 24, 2025
3 of 4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants