fix: support llama.cpp as direct backend without Lemonade #684
Conversation
- LemonadeClient: don't append `/api/v1` when `base_url` already has a path (e.g. `http://localhost:8081/v1` stays as-is; bare URLs still get `/api/v1`)
- LemonadeClient: skip `_ensure_model_loaded` for non-Lemonade backends (detected by absence of `all_models_loaded`/`model_loaded` in `/health`)
- System status: fall back to `/v1/models` for llama.cpp-style model detection when `/health` doesn't report `model_loaded`

Fixes 404 errors when using Gaia Agent UI with llama.cpp directly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@claude please review this PR and make sure it's backwards compatible.
## Summary

- Fix `fatal: a branch named 'main' already exists` error when reviewing fork PRs where the contributor's head branch is `main` (their default branch)
- Fix corrupted diff base — after redirecting origin to the fork, `origin/$BASE_BRANCH` gets overwritten, so save the base SHA before any URL changes
- Use fixed branch name `pr-head` with `-B` (force-create) instead of `-b $HEAD_REF` to avoid name collisions

## Root Cause

When a fork PR uses `main` as its head branch ([PR #684](#684)):

1. `actions/checkout` creates a local `main` branch from the base repo
2. `git checkout -b main origin/main` fails because `main` already exists
3. Even if that succeeded, `git fetch origin main` (from the fork) would overwrite `origin/main`, making the diff compare fork-vs-fork instead of base-vs-fork

Failed run: https://github.com/amd/gaia/actions/runs/23859641383/job/69562473226

## Test plan

- [ ] Trigger Claude review on [PR #684](#684) (fork PR with `main` as head branch) — should succeed
- [ ] Trigger Claude review on a non-fork PR — should still work as before
- [ ] Verify the generated diff in logs shows the correct base-vs-fork comparison

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
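The branch-name collision and its fix are easy to reproduce in a scratch repository. The sketch below (hypothetical `git` helper over `subprocess`, throwaway temp repo; assumes `git` is on PATH) mirrors the failing step and the `-B pr-head` replacement:

```python
import subprocess, tempfile

def git(*args, cwd, check=True):
    """Run a git command in the scratch repo (illustrative helper)."""
    return subprocess.run(["git", *args], cwd=cwd, check=check,
                          capture_output=True, text=True)

repo = tempfile.mkdtemp()
git("init", "-q", cwd=repo)
git("-c", "user.email=ci@example.com", "-c", "user.name=ci",
    "commit", "-q", "--allow-empty", "-m", "base", cwd=repo)
git("checkout", "-q", "-B", "main", cwd=repo)  # local 'main' exists, as after actions/checkout

# Old step: `-b main` fails with "a branch named 'main' already exists"
clash = git("checkout", "-b", "main", cwd=repo, check=False)
print(clash.returncode != 0)

# Fix: a fixed name plus -B (create-or-reset) can never collide
git("checkout", "-q", "-B", "pr-head", cwd=repo)
print(git("rev-parse", "--abbrev-ref", "HEAD", cwd=repo).stdout.strip())
```

`-B` is `git checkout`'s create-or-reset form, which is why a constant branch name is safe regardless of what the fork calls its head branch.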
@claude what changes are needed in this PR to get all tests in CI to pass?
Claude finished @kovtcharov's task — CI Test Failure Analysis
## Summary

The PR introduces two changes to `LemonadeClient`.

## Required Test Fixes

The existing unit tests need updating.

**Fix required:** add a `health_check` mock so the code path still reaches `get_status`:

```python
@patch.object(LemonadeClient, "health_check")
@patch.object(LemonadeClient, "get_status")
@patch.object(LemonadeClient, "load_model")
def test_calls_load_when_model_not_loaded(self, mock_load, mock_status, mock_health):
    # Mock health_check to return a Lemonade-style response (so it continues to get_status)
    mock_health.return_value = {"status": "ok", "model_loaded": "some-model"}
    # ... rest of test unchanged
```

**Additional test needed:** add a new test to verify the non-Lemonade backend detection works:

```python
@patch.object(LemonadeClient, "health_check")
@patch.object(LemonadeClient, "get_status")
@patch.object(LemonadeClient, "load_model")
def test_skips_load_for_non_lemonade_backend(self, mock_load, mock_status, mock_health):
    """Verify _ensure_model_loaded skips load for non-Lemonade backends (llama.cpp)."""
    client = LemonadeClient(host="localhost", port=8000)
    # Return llama.cpp-style health response (no model_loaded field)
    mock_health.return_value = {"status": "ok"}
    client._ensure_model_loaded("model-a", auto_download=True)
    # Should NOT call get_status or load_model for non-Lemonade backends
    mock_status.assert_not_called()
    mock_load.assert_not_called()
```

## Backwards Compatibility ✅

The changes are backwards compatible.
|
Tests were missing a health_check() mock after _ensure_model_loaded was updated to detect non-Lemonade backends via the /health response.

Fixes:
- Add @patch.object(LemonadeClient, "health_check") to 6 existing tests
- Add mock_health parameter and Lemonade-style return values
- Add TestNonLemonadeBackend class with llama.cpp detection test

Per CI analysis by @kovtcharov

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This test returns early before health_check is called (auto_download=False), so the extra mock parameter caused a positional argument mismatch. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
itomek
left a comment
Thanks for the PR, @stampby! One ask before merge: a parametrized test for the `__init__` URL normalization. Something like:
```python
@pytest.mark.parametrize("base_url,expected", [
    ("http://localhost:8000", "http://localhost:8000/api/v1"),         # bare → appended
    ("http://localhost:8081/v1", "http://localhost:8081/v1"),          # llama.cpp — the fix
    ("http://localhost:8000/api/v1", "http://localhost:8000/api/v1"),  # already normalized
    ("http://proxy.example.com/lemonade", ???),                        # see question below
])
def test_base_url_normalization(base_url, expected):
    assert LemonadeClient(base_url=base_url).base_url == expected
```
## Summary

- `LemonadeClient`: don't append `/api/v1` when `base_url` already has a path (e.g. `http://localhost:8081/v1` stays as-is). Bare URLs without a path still get `/api/v1` appended for Lemonade compatibility.
- `LemonadeClient`: skip `_ensure_model_loaded` for non-Lemonade backends, detected by absence of `all_models_loaded`/`model_loaded` fields in the `/health` response.
- System status: fall back to `/v1/models` for llama.cpp-style model detection when `/health` doesn't report `model_loaded`. Checks both `{"models": [...]}` (llama.cpp) and `{"data": [...]}` (OpenAI) formats.

## Problem

When using `gaia chat --ui --base-url http://localhost:8081/v1` with llama.cpp directly (no Lemonade), three issues caused 404 errors:

1. `base_url` normalization unconditionally appended `/api/v1`, turning `http://localhost:8081/v1` into `http://localhost:8081/v1/api/v1` — all API calls hit wrong paths
2. `_ensure_model_loaded` called the Lemonade-specific `load_model` endpoint, which doesn't exist on llama.cpp
3. llama.cpp's `/health` returns `{"status":"ok"}` without a `model_loaded` field

## Test plan

- [ ] `gaia chat --ui --base-url http://localhost:8081/v1 --no-lemonade-check` works with llama.cpp
- [ ] `gaia chat --ui` with default Lemonade backend still works (no regression)
- [ ] `--base-url http://localhost:8080` still gets `/api/v1` appended
- [ ] System status detects the model via the `/v1/models` fallback

🤖 Generated with Claude Code
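The two-format `/v1/models` fallback from the summary can be sketched as below. The `id` field on each entry is an assumption here (it matches the OpenAI response shape; the llama.cpp key layout may differ), and `model_ids` is a hypothetical name:

```python
def model_ids(payload: dict) -> list[str]:
    """Extract model ids from a /v1/models-style response (sketch)."""
    # llama.cpp-style {"models": [...]} or OpenAI-style {"data": [...]}
    entries = payload.get("models") or payload.get("data") or []
    return [m["id"] for m in entries if isinstance(m, dict) and "id" in m]

print(model_ids({"models": [{"id": "qwen2.5-7b-instruct"}]}))  # ['qwen2.5-7b-instruct']
print(model_ids({"data": [{"id": "llama-3.1-8b"}]}))           # ['llama-3.1-8b']
print(model_ids({"status": "ok"}))                             # []
```

An empty result would correspond to the "no model detected" status rather than an error, which keeps the system-status path 404-free on both backends.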