Skip to content

[Feature] Support local OpenAI-compatible endpoint (Ollama / LM Studio / etc.) #627

@FuJacob

Description

@FuJacob

Summary

Add the option to plug in a local OpenAI-compatible HTTP endpoint as a suggestion backend, so users can point Cotabby at a model already running in Ollama, LM Studio, llama-server, vLLM, or any other server that speaks the OpenAI /v1/completions or /v1/chat/completions API.

Problem

Today the OSS path runs models in-process through the bundled llama.cpp runtime. That's great for zero-config users, but it means:

  • Users who already have a tuned local server (Ollama / LM Studio / llama-server / vLLM) have to download and host the model a second time inside Cotabby.
  • They can't reuse hardware-specific server flags (quant, ctx size, GPU layers, draft model, speculative decode, etc.) that they've already dialed in.
  • Power users can't try models or runtimes that Cotabby doesn't ship support for.

A configurable OpenAI-compatible endpoint sidesteps all of that.

Proposed direction

  • New engine option alongside Apple Intelligence and the bundled llama.cpp runtime: "Local OpenAI-compatible endpoint".
  • Settings fields: base URL (e.g. http://localhost:11434/v1), model name, optional API key, optional completion vs chat-completion mode.
  • Stay on localhost / loopback by default and surface a clear warning if a non-loopback host is entered (this is a privacy-sensitive app).
  • Route through the existing SuggestionEngineRouter as a sibling of the llama path; reuse the base-model prompt rendering and cancellation plumbing.
  • Stream tokens via SSE so cancellation on focus change still works.

Additional context

  • Common targets that already expose this API: Ollama (/v1), LM Studio, llama-server from llama.cpp, vLLM, LocalAI, text-generation-webui.
  • Keep it strictly local-endpoint framing in the UI; this issue is not asking for hosted OpenAI / Anthropic / etc.

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Projects

    Status
    Backlog

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions