openwebui-llm-proxy

Use Codex CLI and Claude Code inside OpenWebUI like OpenAI-compatible models, while keeping a security-research-friendly stack around them.

This repo combines:

OpenAI-compatible proxies for Codex and Claude
Qdrant + FastEmbed vector search for project memory / RAG
OpenWebUI runtime patches for better tool / code-interpreter behavior
A same-origin signed local file viewer for model-generated file references
Utilities to keep long OpenWebUI chats responsive

If you want the convenience of OpenWebUI with the power of CLI-based coding agents, this is the setup.

Highlights

Bring plan-based CLIs into OpenWebUI Codex CLI and Claude Code show up as normal OpenAI-style models, so you can use them from the OpenWebUI interface without swapping to direct API billing.
Streaming that feels native Both proxies translate their native event streams into OpenAI-compatible streaming responses, including reasoning/thinking output when enabled.
Parallel Codex turns Codex requests are no longer forced through a single global queue, so multiple chats can make progress at the same time.
Runtime display controls Turn thinking, reasoning, tool headers, and tool bodies on or off without editing code.
Fast auxiliary chat tasks Keep the main answer on Claude or Codex, while routing title generation, chat tags, and follow-up suggestions to a lighter model.
Signed local file viewer Model outputs can link to local workspace files through signed same-origin URLs, with:
- line anchors
- syntax highlighting
- theme switching
- syntax theme switching
- markdown preview
- copy button
Built for research workflows Vector DB storage, MCP access, OpenWebUI tools, and ingestion scripts make it easy to store and retrieve:
- vulnerability notes
- source code chunks
- exploit/PoC snippets
- bug bounty notes
OpenWebUI UX fixes Includes runtime patches and maintenance scripts for:
- code interpreter truncation issues
- overly heavy historical reasoning/tool artifacts
- backup cleanup

Who is this for?

People using Codex CLI or Claude Code as their primary coding/research interface
Users who like OpenWebUI as a chat frontend
Security researchers who want a lightweight local RAG layer for notes, code, exploits, and reports
Anyone who wants local-file references in chat to open cleanly in-browser instead of breaking into 404s

Architecture

┌────────────────────────────────────────────────────────────────────┐
│                          nginx / same-origin                      │
│  /                → OpenWebUI (:3000)                             │
│  /local-file/*    → local-file-viewer (:8301)                     │
│  /api/tools/*     → Tool API (:8100)                              │
│  /qdrant/*        → Qdrant (:6333)                                │
└───────────────────────────────┬────────────────────────────────────┘
                                │
                ┌───────────────▼────────────────┐
                │        OpenWebUI (Docker)      │
                │  runtime-patched for local-file│
                │  routing + code-interpreter UX │
                └───────────────┬────────────────┘
                                │
              ┌─────────────────┴─────────────────┐
              │                                   │
      ┌───────▼────────┐                 ┌────────▼───────┐
      │ codex-proxy    │                 │ claude-proxy   │
      │ :8200          │                 │ :8201          │
      │ OpenAI-style   │                 │ OpenAI-style   │
      │ SSE bridge     │                 │ SSE bridge     │
      └───────┬────────┘                 └────────┬───────┘
              │                                   │
      ┌───────▼────────┐                 ┌────────▼───────┐
      │ Codex CLI      │                 │ Claude Code    │
      │ app-server     │                 │ stream-json    │
      └────────────────┘                 └────────────────┘

      ┌──────────────────┐              ┌──────────────────┐
      │ Tool API         │─────────────►│ Qdrant           │
      │ FastAPI + MCP    │              │ vector database  │
      │ + ingest/search  │              │ + FastEmbed data │
      └──────────────────┘              └──────────────────┘

      ┌──────────────────┐
      │ local-file-viewer│
      │ signed links     │
      │ syntax highlight │
      │ markdown preview │
      └──────────────────┘

Quick Start

Prerequisites

Node.js 18+
Docker / Docker Compose
Codex CLI
Claude Code

1. Clone and configure

git clone https://github.com/sm1ee/openwebui-llm-proxy.git
cd openwebui-llm-proxy

cp .env.example .env

Minimum recommended .env values:

QDRANT_API_KEY=your-qdrant-api-key
OPENCLAW_WORKSPACE_DIR=/absolute/path/to/your/openclaw/workspace
LOCAL_FILE_URL_PREFIX=/absolute/path/to/your/openclaw/workspace
LOCAL_FILE_VIEWER_BASE_URL=https://ai.yourdomain.com

Useful optional task-routing values:

# Keep OpenWebUI chat UX helpers enabled
ENABLE_TITLE_GENERATION=true
ENABLE_TAGS_GENERATION=true
ENABLE_FOLLOW_UP_GENERATION=true

# Route title / tag / follow-up generation to a faster external model
CHAT_AUX_TASK_MODEL_EXTERNAL=gpt-5.4-mini

2. Start Docker services

docker compose up -d --build

The bundled compose file pins OpenWebUI to v0.8.12 instead of floating on main, so upgrades are explicit and reproducible.

This starts:

open-webui
tool-api
qdrant
local-file-viewer

3. Start the proxies

./start.sh

Or individually:

node codex-proxy.mjs
node claude-proxy.mjs

4. Connect from OpenWebUI

In OpenWebUI, add OpenAI-compatible connections:

Codex: http://host.docker.internal:8200/v1
Claude: http://host.docker.internal:8201/v1
API key: any placeholder string such as dummy

What You Get

1. OpenAI-compatible Codex / Claude endpoints

Both proxies expose:

Endpoint	Method	Description
`/v1/models`	`GET`	Available models
`/v1/chat/completions`	`POST`	Streaming and non-streaming chat completions
`/health`	`GET`	Health check

2. Runtime display control

You can change what OpenWebUI sees without editing source code.

npm run config -- show
npm run config -- fast
npm run config -- balanced
npm run config -- verbose

Examples:

npm run config -- set codex.reasoning on
npm run config -- set codex.toolDisplay on
npm run config -- set codex.toolBodyDisplay off

npm run config -- set claude.thinking off
npm run config -- set claude.toolDisplay on
npm run config -- set claude.toolBodyDisplay off

Main toggles:

codex.reasoning
codex.toolDisplay
codex.toolBodyDisplay
claude.thinking
claude.toolDisplay
claude.toolBodyDisplay

This lets you choose between:

cleaner/faster chat history
verbose debugging visibility
title-only tool traces
full collapsible tool bodies

3. Signed local file viewer

One of the nicest quality-of-life features in this repo.

When a model emits a local workspace file reference, the proxies can rewrite it to a signed same-origin link like:

https://ai.yourdomain.com/local-file/t/<signed-token>#L17

That viewer supports:

line anchors like #L17
syntax highlighting with multiple themes
page theme switching
markdown code/preview toggle
one-click copy
same-origin browser opening instead of dead /Users/... links

Supported link styles:

absolute workspace paths such as /absolute/path/to/workspace/project/report.md
eligible relative workspace references such as project/report.md when they resolve uniquely inside the mounted workspace

By default the viewer also supports direct authenticated browsing when enabled behind OpenWebUI, and signed links have a TTL.

Relevant settings:

Variable	Description
`OPENCLAW_WORKSPACE_DIR`	Host workspace directory mounted read-only into the viewer
`LOCAL_FILE_URL_PREFIX`	Absolute path prefix that should be recognized as local file links
`LOCAL_FILE_VIEWER_BASE_URL`	Public base URL used when generating signed links
`LOCAL_FILE_REQUIRE_ADMIN`	Restrict direct browsing to admins
`LOCAL_FILE_MAX_INLINE_BYTES`	Max inline preview size
`LOCAL_FILE_LINK_TTL_SECONDS`	Signed-link expiration

4. Vector DB for research memory

Qdrant collections are organized per project:

Collection	Purpose
`{project}_vuln`	advisories, CVEs, findings
`{project}_code`	source code chunks
`{project}_exploit`	exploit and PoC data
`{project}_bugbounty`	bug bounty notes / reports

Useful scripts:

python3 scripts/init_qdrant.py --project myproject
python3 scripts/ingest_vuln.py --project myproject --dir /path/to/reports
python3 scripts/ingest_code.py --project myproject --path /path/to/src --lang python

Supported access modes:

OpenWebUI custom tools via openwebui-tools/security_research_tools.py
MCP from Codex / Claude via tool-api/mcp_server.py
Direct CLI ingestion/search through tool-api/vectordb_cli.py

5. MCP read/write tools

The MCP server supports both search and ingest flows.

Read tools:

list_projects
search_vuln
search_code
search_exploit
search_bugbounty

Write tools:

ingest_document
ingest_batch

Collections are created automatically when needed.

5. Chat task routing and toggles

OpenWebUI normally generates a few chat UX helpers after the main assistant reply:

chat title
chat tags
follow-up suggestions

This repo exposes simple environment toggles for those helpers:

Variable	Description
`ENABLE_TITLE_GENERATION`	Enable or disable automatic chat titles
`ENABLE_TAGS_GENERATION`	Enable or disable automatic tag generation
`ENABLE_FOLLOW_UP_GENERATION`	Enable or disable follow-up suggestions
`CHAT_AUX_TASK_MODEL`	Preferred lightweight local model for those helpers
`CHAT_AUX_TASK_MODEL_EXTERNAL`	Preferred lightweight external model for those helpers

Recommended pattern:

keep your main chat on a stronger model such as Claude Opus
send title / tag / follow-up generation to a cheaper and faster model such as gpt-5.4-mini

OpenWebUI Runtime Patches

This repo includes runtime patches to make OpenWebUI behave better for coding-agent usage.

Current patch areas include:

better handling of code-interpreter output
preserving assistant text after code execution blocks
same-origin /local-file/* pass-through routing
routing title / tag / follow-up generation to a separate model

Files:

open-webui-runtime-patches/main.py
open-webui-runtime-patches/middleware.py

Performance / History Maintenance

Long reasoning and tool-heavy chats can make OpenWebUI sluggish. This repo includes helper scripts to keep things usable.

Included utilities:

scripts/compact_openwebui_reasoning.py
- compacts old reasoning / tool / code-interpreter artifacts from chat history
scripts/cleanup_openwebui_backups.sh
- prunes old backup DB files

Good companion workflow:

keep recent turns verbose
compact older turns
keep signed file links for evidence and reports

Nginx and macOS launchd

Nginx

The repo includes a reverse-proxy template with:

OpenWebUI routing
/local-file/* viewer routing
/api/tools/* routing
/qdrant/* routing
optional geo filtering

Template:

nginx/conf.d/ai-server.conf

launchd

Template plists are included for:

Codex proxy
Claude proxy
backup cleanup

See:

launchd/README.md

How the proxies behave

Codex proxy

translates Codex app-server notifications into OpenAI-style streaming chunks
supports reasoning output in OpenWebUI
supports parallel turns instead of forcing a single serialized queue
rewrites eligible local-file markdown links into signed viewer links
can hide tool bodies while still showing compact tool headers

Claude proxy

translates claude -p --output-format stream-json into OpenAI-style SSE
supports thinking output in OpenWebUI
keeps long-running streams alive
rewrites eligible local-file markdown links into signed viewer links
supports signed links for both absolute workspace paths and uniquely resolvable relative file references
can hide tool bodies while still showing compact tool headers

Files to care about first

codex-proxy.mjs
claude-proxy.mjs
config.json
configure.mjs
docker-compose.yml
local-file-viewer/main.py
tool-api/mcp_server.py
open-webui-runtime-patches/main.py
open-webui-runtime-patches/middleware.py

Why this repo is useful

If you like OpenWebUI but do most of your serious work in Codex CLI or Claude Code, this repo gives you:

a better frontend
a shared research memory layer
cleaner tool/reasoning controls
clickable local evidence links
less UI pain when chats get large

It turns OpenWebUI from “generic chat frontend” into something much closer to a practical coding and security-research workspace.

License

MIT

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

openwebui-llm-proxy

Highlights

Who is this for?

Architecture

Quick Start

Prerequisites

1. Clone and configure

2. Start Docker services

3. Start the proxies

4. Connect from OpenWebUI

What You Get

1. OpenAI-compatible Codex / Claude endpoints

2. Runtime display control

3. Signed local file viewer

4. Vector DB for research memory

5. MCP read/write tools

5. Chat task routing and toggles

OpenWebUI Runtime Patches

Performance / History Maintenance

Nginx and macOS launchd

Nginx

launchd

How the proxies behave

Codex proxy

Claude proxy

Files to care about first

Why this repo is useful

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 22 Commits
codex-sandbox		codex-sandbox
data		data
launchd		launchd
local-file-viewer		local-file-viewer
nginx		nginx
open-webui-runtime-patches		open-webui-runtime-patches
openwebui-tools		openwebui-tools
scripts		scripts
tool-api		tool-api
.env.example		.env.example
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
claude-proxy.mjs		claude-proxy.mjs
codex-proxy.mjs		codex-proxy.mjs
config.json		config.json
configure.mjs		configure.mjs
docker-compose.yml		docker-compose.yml
package.json		package.json
start.sh		start.sh

Folders and files

Latest commit

History

Repository files navigation

openwebui-llm-proxy

Highlights

Who is this for?

Architecture

Quick Start

Prerequisites

1. Clone and configure

2. Start Docker services

3. Start the proxies

4. Connect from OpenWebUI

What You Get

1. OpenAI-compatible Codex / Claude endpoints

2. Runtime display control

3. Signed local file viewer

4. Vector DB for research memory

5. MCP read/write tools

5. Chat task routing and toggles

OpenWebUI Runtime Patches

Performance / History Maintenance

Nginx and macOS launchd

Nginx

launchd

How the proxies behave

Codex proxy

Claude proxy

Files to care about first

Why this repo is useful

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages