feat: add get_embedding_dims util to get the embedder model dimensions by RKRitik · Pull Request #140 · SimpleOpenSoftware/chronicle

RKRitik · 2025-10-24T07:05:50Z

This PR adds a utility function (get_embedding_dims) that queries the embedding endpoint of the configured LLM provider (OpenAI or Ollama) to dynamically determine the embedding vector size for the selected embedder model.
This ensures that the vector database is configured with the correct dimensions, improving compatibility and reducing errors when using custom or non-default embedding models.
Fallbacks and model-specific mappings are included for robustness.

Summary by CodeRabbit

Enhancements
- Improved model selection to better prioritize default settings
- Enhanced embedding configuration detection with more robust fallback support

coderabbitai · 2025-10-24T07:06:03Z

Important

Review skipped

Auto reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

Model selection precedence is reversed in the LLM client to prefer the client's default model when available. Additionally, a new helper function dynamically determines embedding dimensions by querying the embedding endpoint, replacing hardcoded values with fallback logic.

Changes

Cohort / File(s)	Summary
Model Name Precedence Swap `backends/advanced/src/advanced_omi_backend/llm_client.py`	Changed model selection logic from `model or self.model` to `self.model or model` to prioritize the client's default model over the per-call argument.
Embedding Dimension Determination `backends/advanced/src/advanced_omi_backend/memory/config.py`	Introduced `get_embedding_dims()` helper function to query embedding endpoints dynamically; replaced hardcoded embedding dimension logic for OpenAI and Ollama with calls to this function; added logging and error handling with fallback defaults.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

The first change is a straightforward precedence flip with minimal logic impact. The second introduces a new helper function with endpoint-based querying and fallback handling, but follows a consistent refactoring pattern across both code paths with standard error handling.

Poem

🐰 The models now bow to their defaults so true,
While embeddings bloom from endpoints anew,
With fallbacks in place when queries go shy,
Our rabbit hops forth with a satisfied sigh! ✨

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check	✅ Passed	The pull request title "feat: add get_embedding_dims util to get the embedder model dimensions" clearly and specifically relates to the primary change in the changeset. The main substantive change involves introducing a new helper function `get_embedding_dims` and refactoring the embedding dimension logic to use this utility across multiple paths (OpenAI and Ollama). While the PR also includes a secondary change to model precedence in `llm_client.py`, the title accurately captures the core feature being added. The title is concise, follows conventional commit format, and would allow teammates scanning history to quickly understand that this PR introduces a new embedding dimension utility.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

RKRitik · 2025-10-24T07:59:05Z

@coderabbitai review

coderabbitai · 2025-10-24T07:59:18Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 059dfd0 and 71e6089.

📒 Files selected for processing (2)

backends/advanced/src/advanced_omi_backend/llm_client.py (1 hunks)
backends/advanced/src/advanced_omi_backend/memory/config.py (3 hunks)

🧰 Additional context used

🪛 Ruff (0.14.1)

backends/advanced/src/advanced_omi_backend/memory/config.py

263-263: Do not catch blind exception: Exception

(BLE001)

265-265: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

🔇 Additional comments (2)

backends/advanced/src/advanced_omi_backend/memory/config.py (2)

181-182: LGTM! Dynamic dimension detection improves robustness.

Replacing hardcoded embedding dimensions with dynamic detection via get_embedding_dims is a solid improvement that handles custom models gracefully.

203-204: LGTM! Consistent with OpenAI implementation.

The Ollama path correctly uses the same dynamic dimension detection approach.

coderabbitai · 2025-10-24T08:02:08Z

        """Generate text completion using OpenAI-compatible API."""
        try:
-            model_name = model or self.model
+            model_name = self.model or model


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Verify model precedence reversal is intentional and won't break callers.

The change from model or self.model to self.model or model reverses the selection priority, making the instance's default model always take precedence over the method parameter. This means:

Before: Callers could override the default model on a per-call basis by passing a model argument.
After: If self.model is set, the model parameter is effectively ignored.

This is counterintuitive API design—typically, method parameters should override instance defaults, not vice versa. This change eliminates the ability to use different models for specific calls when the client has a default configured.

Run the following script to check if any callers rely on overriding the model per-call:

If callers exist that pass an explicit model argument, this change will silently break their expectation. Consider reverting to model or self.model to preserve the standard override behavior.

🏁 Script executed:

#!/bin/bash # Description: Find all calls to .generate() with an explicit model argument # Search for calls to generate() with model parameter rg -nP --type=py -C3 '\.generate\s*\([^)]*model\s*=' -g '!**/llm_client.py'

Length of output: 0

🏁 Script executed:

#!/bin/bash # Get context around line 80 to see function definition and docstring cat backends/advanced/src/advanced_omi_backend/llm_client.py | sed -n '50,120p'

Length of output: 3071

🏁 Script executed:

#!/bin/bash # Search for ALL calls to .generate() to see how it's being invoked rg -nP --type=py '\.generate\s*\(' -C2 -g '!**/llm_client.py' | head -100

Length of output: 1115

🏁 Script executed:

#!/bin/bash # Search test files specifically for generate() calls fd -e py -path '*/test*' -o -path '*/*_test.py' | xargs rg -l 'generate' 2>/dev/null | head -20

Length of output: 959

🏁 Script executed:

#!/bin/bash # Look at the function signature and docstring for generate() ast-grep --pattern 'function generate($_) { $$$ }'

Length of output: 0

🏁 Script executed:

#!/bin/bash # Get context around the caller in chat_service.py cat backends/advanced/src/advanced_omi_backend/chat_service.py | sed -n '375,395p'

Length of output: 814

🏁 Script executed:

#!/bin/bash # Find where CHAT_LLM_MODEL is defined rg -n 'CHAT_LLM_MODEL\s*=' --type=py

Length of output: 150

🏁 Script executed:

#!/bin/bash # Search for test files that test the generate method find . -name '*test*.py' -type f | xargs rg -l 'generate' 2>/dev/null | grep -E '(test_|_test\.py)' | head -10

Length of output: 45

🏁 Script executed:

#!/bin/bash # Look for any tests in the advanced backend tests directory cat backends/advanced/tests/test_integration.py 2>/dev/null | grep -A20 -B5 'generate' | head -50

Length of output: 1464

Revert precedence to model or self.model to preserve parameter override capability.

The current code at line 80 (self.model or model) breaks an existing caller in chat_service.py:384-386 that explicitly passes model=CHAT_LLM_MODEL. With the instance default always taking precedence, this parameter is silently ignored.

The caller passes:

response_content = self.llm_client.generate( prompt=full_prompt, model=CHAT_LLM_MODEL, # ← ignored if self.model is set temperature=CHAT_TEMPERATURE )

Since self.model is required during initialization (enforced by validation in __init__), this override never works. Standard API design dictates that method parameters should override instance defaults. Change line 80 to:

model_name = model or self.model

🤖 Prompt for AI Agents

In backends/advanced/src/advanced_omi_backend/llm_client.py around line 80, the assignment currently uses self.model before the method argument which causes callers passing model=... to be ignored; revert the precedence so the method parameter overrides the instance default by using the method argument first (i.e., set model_name to the passed-in model if provided, otherwise fall back to self.model).

@RKRitik bruh is coderabbitai correct here?

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

@AnkushMalaker I have reverted this change and fixed the root of this issue. @coderabbitai confirm.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Check the current state of the file on the PR branch echo "=== Current state of line 80 in llm_client.py ===" cat backends/advanced/src/advanced_omi_backend/llm_client.py | sed -n '75,85p' echo -e "\n=== Recent commits on the PR branch ===" git log --oneline --decorate -10 echo -e "\n=== Check if there are any changes to the file since the last commit ===" git diff HEAD backends/advanced/src/advanced_omi_backend/llm_client.py echo -e "\n=== Show the full history of line 80 changes ===" git log -p --follow -S "model_name = " -- backends/advanced/src/advanced_omi_backend/llm_client.py | head -100

Length of output: 1386

🏁 Script executed:

#!/bin/bash # Check if the model parameter has been removed from generate() calls in chat_service.py echo "=== Checking generate() calls in chat_service.py ===" rg -nP --type=py -C5 '\.generate\s*\(' backends/advanced/src/advanced_omi_backend/chat_service.py echo -e "\n=== Check CHAT_LLM_MODEL usage ===" rg -n 'CHAT_LLM_MODEL' backends/advanced/src/advanced_omi_backend/chat_service.py

Length of output: 639

@RKRitik Confirmed! ✅

I've verified both changes:

Revert confirmed: Line 80 in llm_client.py now correctly shows model_name = model or self.model, restoring the proper parameter precedence.

Root fix confirmed: The generate() call in chat_service.py (lines 383-386) no longer passes the model parameter—only prompt and temperature are provided. The model is now configured once at client initialization, which is a cleaner design.

This approach eliminates the ambiguity about parameter precedence and follows the principle of configuring the client once rather than on every call. Great fix! 🎉

🧬 Code Graph Analysis Results

[backends/advanced/src/advanced_omi_backend/routers/modules/health_routes.py:84-443]
Summary: Comprehensive health check aggregator for multiple services including LLM integration.

Purpose: Expose an endpoint that aggregates health across MongoDB, Redis/RQ, memory service, transcription, LLM (audioai), speaker recognition, and memory providers.

Parameters: None (async function health_check with internal environment/config lookups).

Returns: JSON response containing:

status (healthy/degraded/critical)

timestamp

services: per-service status objects (mongodb, redis, memory_service, audioai, transcription, speaker_recognition, openmemory_mcp)

config: runtime configuration (db URIs, llm provider/model/base_url, memory provider, etc.)

overall_healthy, critical_services_healthy, message

Error handling: Captures per-service exceptions/timeouts, marks corresponding service as unhealthy and adjusts overall flags; returns a structured JSON with error details for each failed check.

Important details:

Checks MongoDB ping with 5s timeout.

Checks Redis via rq Worker and redis ping, with 5s timeout.

Checks LLM service via a separate async_health_check, derives llm health and embedder status, interprets Ollama vs other providers differently.

Optional/non-critical services (speech_to_text, speaker_recognition, openmemory_mcp) are checked if configured.

Assembles an overall health state and optional human-readable messages.

[backends/advanced/src/advanced_omi_backend/speaker_recognition_client.py:588-625]
Summary: Health check for the speaker recognition service.

Purpose: Determine if the speaker recognition service is healthy and responsive.

Parameters: self (implicitly) with internal service_url and enabled flag.

Returns: Boolean indicating health (True if healthy, False otherwise).

Error handling: Returns False on any exception; logs debug/error details.

Important details:

Skips health check if service is not enabled.

Tries multiple endpoints (/health, /speakers) to assess health.

Iterates endpoints, handling timeouts and client errors individually; returns True on first healthy response.

[backends/advanced/src/advanced_omi_backend/client_manager.py:107-141]
Summary: Atomic client creation and registration.

Purpose: Create a new client state and register it atomically to avoid race conditions.

Parameters:

client_id: Unique client identifier

ac_repository: Audio chunks repository

chunk_dir: Directory for audio chunks

user_id: User ID who owns the client

user_email: Optional user email

Returns: Created ClientState object

Exceptions: ValueError if client_id already exists

Error handling: Validates uniqueness before creation; performs import to avoid circular imports; logs creation success.

Important details:

Creates ClientState with provided parameters.

Atomically inserts into internal storage and registers user mapping via register_client_user_mapping.

Logs success with client_id and user_id.

If you found this review helpful, would you consider giving us a shout-out on X?

Thank you for using CodeRabbit!

coderabbitai · 2025-10-24T08:02:09Z

+def get_embedding_dims(llm_config: Dict[str, Any]) -> int:
+    """
+    Query the embedding endpoint and return the embedding vector length.
+    Works for OpenAI and OpenAI-compatible endpoints (e.g., Ollama).
+    """
+    embedding_model=llm_config['embedding_model']
+    try:
+        import langfuse.openai as openai
+        client = openai.OpenAI(api_key=llm_config.get('api_key'), base_url=llm_config['base_url'])
+        response = client.embeddings.create(
+            model=embedding_model,
+            input="hello world"
+        )
+        embedding = response.data[0].embedding
+        if not embedding or not isinstance(embedding, list):
+            return 1536
+        return len(embedding)
+
+    except Exception as e:
+        embedding_dims = 1536 # default
+        memory_logger.error(f"Failed to get embedding dimensions: {e}")
+        if embedding_model == "text-embedding-3-small":
+            embedding_dims = 1536
+        elif embedding_model == "text-embedding-3-large":
+            embedding_dims = 3072
+        elif embedding_model == "text-embedding-ada-002":
+            embedding_dims = 1536
+        elif embedding_model == "nomic-embed-text:latest":
+            embedding_dims = 768
+        else:
+            # Default for OpenAI embedding models
+            memory_logger.info(f"Embedding model is unrecognized setting default value {e}")
+            embedding_dims = 1536
+        return embedding_dims


🛠️ Refactor suggestion | 🟠 Major

Refine exception handling and improve code quality.

The dynamic dimension detection is a good approach, but the implementation has several issues:

Line 250: Missing space after = in variable assignment.

Line 253: Inconsistent dictionary access—use .get() for both keys or direct access for both.

Lines 263, 265: Catching broad Exception masks specific failures. Use logging.exception() instead of logging.error() to preserve stack traces.

Line 276: Log message is poorly formatted: "Embedding model is unrecognized setting default value {e}".

Apply this diff to address these issues:

def get_embedding_dims(llm_config: Dict[str, Any]) -> int: """ Query the embedding endpoint and return the embedding vector length. Works for OpenAI and OpenAI-compatible endpoints (e.g., Ollama). """ - embedding_model=llm_config['embedding_model'] + embedding_model = llm_config['embedding_model'] try: import langfuse.openai as openai - client = openai.OpenAI(api_key=llm_config.get('api_key'), base_url=llm_config['base_url']) + client = openai.OpenAI( + api_key=llm_config.get('api_key'), + base_url=llm_config.get('base_url') + ) response = client.embeddings.create( model=embedding_model, input="hello world" ) embedding = response.data[0].embedding if not embedding or not isinstance(embedding, list): return 1536 return len(embedding) - except Exception as e: + except (ImportError, KeyError, AttributeError, IndexError) as e: embedding_dims = 1536 # default - memory_logger.error(f"Failed to get embedding dimensions: {e}") + memory_logger.exception(f"Failed to get embedding dimensions for model '{embedding_model}'") if embedding_model == "text-embedding-3-small": embedding_dims = 1536 elif embedding_model == "text-embedding-3-large": embedding_dims = 3072 elif embedding_model == "text-embedding-ada-002": embedding_dims = 1536 elif embedding_model == "nomic-embed-text:latest": embedding_dims = 768 else: # Default for OpenAI embedding models - memory_logger.info(f"Embedding model is unrecognized setting default value {e}") + memory_logger.info(f"Unrecognized embedding model '{embedding_model}', using default dimension {embedding_dims}") embedding_dims = 1536 return embedding_dims

Rationale:

Catching specific exceptions makes debugging easier and prevents masking unexpected errors.

logging.exception() automatically includes the stack trace.

Consistent dictionary access improves readability.

Better log messages aid troubleshooting.

As per static analysis hints.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

def get_embedding_dims(llm_config: Dict[str, Any]) -> int:

"""

Query the embedding endpoint and return the embedding vector length.

Works for OpenAI and OpenAI-compatible endpoints (e.g., Ollama).

"""

embedding_model=llm_config['embedding_model']

try:

import langfuse.openai as openai

client = openai.OpenAI(api_key=llm_config.get('api_key'), base_url=llm_config['base_url'])

response = client.embeddings.create(

model=embedding_model,

input="hello world"

)

embedding = response.data[0].embedding

if not embedding or not isinstance(embedding, list):

return 1536

return len(embedding)

except Exception as e:

embedding_dims = 1536 # default

memory_logger.error(f"Failed to get embedding dimensions: {e}")

if embedding_model == "text-embedding-3-small":

embedding_dims = 1536

elif embedding_model == "text-embedding-3-large":

embedding_dims = 3072

elif embedding_model == "text-embedding-ada-002":

embedding_dims = 1536

elif embedding_model == "nomic-embed-text:latest":

embedding_dims = 768

else:

# Default for OpenAI embedding models

memory_logger.info(f"Embedding model is unrecognized setting default value {e}")

embedding_dims = 1536

return embedding_dims

def get_embedding_dims(llm_config: Dict[str, Any]) -> int:

"""

Query the embedding endpoint and return the embedding vector length.

Works for OpenAI and OpenAI-compatible endpoints (e.g., Ollama).

"""

embedding_model = llm_config['embedding_model']

try:

import langfuse.openai as openai

client = openai.OpenAI(

api_key=llm_config.get('api_key'),

base_url=llm_config.get('base_url')

)

response = client.embeddings.create(

model=embedding_model,

input="hello world"

)

embedding = response.data[0].embedding

if not embedding or not isinstance(embedding, list):

return 1536

return len(embedding)

except (ImportError, KeyError, AttributeError, IndexError) as e:

embedding_dims = 1536 # default

memory_logger.exception(f"Failed to get embedding dimensions for model '{embedding_model}'")

if embedding_model == "text-embedding-3-small":

embedding_dims = 1536

elif embedding_model == "text-embedding-3-large":

embedding_dims = 3072

elif embedding_model == "text-embedding-ada-002":

embedding_dims = 1536

elif embedding_model == "nomic-embed-text:latest":

embedding_dims = 768

else:

# Default for OpenAI embedding models

memory_logger.info(f"Unrecognized embedding model '{embedding_model}', using default dimension {embedding_dims}")

embedding_dims = 1536

return embedding_dims

🧰 Tools

🪛 Ruff (0.14.1)

263-263: Do not catch blind exception: Exception

(BLE001)

265-265: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

@coderabbitai Please verify the changes

@RKRitik, excellent work! 🎉 All the suggested changes have been successfully applied:

✅ Fixed spacing: embedding_model = llm_config.get('embedding_model')
✅ Consistent dictionary access: Both api_key and base_url now use .get() method
✅ Specific exception handling: Changed from broad Exception to specific exceptions - you even added TypeError and ValueError which is great!
✅ Better logging: Now uses memory_logger.exception() which includes stack traces
✅ Improved log message: Clear, informative message for unrecognized models

The code quality improvements are all in place and the implementation looks solid. The function will now provide better debugging information and handle errors more gracefully.

(\(\ ( -.-) o_(")(") Looks great!

If you found this review helpful, would you consider giving us a shout-out on X?

Thank you for using CodeRabbit!

AnkushMalaker · 2025-10-24T14:02:09Z

+        import langfuse.openai as openai
+        client = openai.OpenAI(api_key=llm_config.get('api_key'), base_url=llm_config['base_url'])


import langfuse.openai as openai client = openai.OpenAI(api_key=llm_config.get('api_key'), base_url=llm_config['base_url'])

Why are we importing from langfuse for this?

…t the time of client creation

AnkushMalaker · 2025-10-24T15:10:41Z

LGTM!

feat: add get_embedding_dims util to get the embedder model dimensions

71e6089

RKRitik mentioned this pull request Oct 24, 2025

Hardcoded Ollama embedding dims #139

Closed

coderabbitai Bot reviewed Oct 24, 2025

View reviewed changes

AnkushMalaker reviewed Oct 24, 2025

View reviewed changes

RKRitik added 2 commits October 24, 2025 20:15

fix: remove model when generating client response; it is configured a…

92cdfa0

…t the time of client creation

chore: proper exception handling and formatting in get_embedding_dims

390db08

AnkushMalaker merged commit 8fe109d into SimpleOpenSoftware:main Oct 24, 2025
3 of 4 checks passed

		import langfuse.openai as openai
		client = openai.OpenAI(api_key=llm_config.get('api_key'), base_url=llm_config['base_url'])

Conversation

RKRitik commented Oct 24, 2025 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Oct 24, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review skipped

Walkthrough

Changes

Estimated code review effort

Poem

Pre-merge checks and finishing touches

Uh oh!

RKRitik commented Oct 24, 2025

Uh oh!

coderabbitai Bot commented Oct 24, 2025

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Oct 24, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

AnkushMalaker Oct 24, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Oct 24, 2025

Choose a reason for hiding this comment

Uh oh!

RKRitik Oct 24, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Oct 24, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Oct 24, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

RKRitik Oct 24, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Oct 24, 2025

Choose a reason for hiding this comment

Uh oh!

AnkushMalaker Oct 24, 2025

Choose a reason for hiding this comment

Uh oh!

AnkushMalaker commented Oct 24, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

RKRitik commented Oct 24, 2025 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Oct 24, 2025 •

edited

Loading

coderabbitai Bot Oct 24, 2025 •

edited

Loading

coderabbitai Bot Oct 24, 2025 •

edited

Loading