Use Codex CLI and Claude Code inside OpenWebUI as OpenAI-compatible models, while keeping a security-research-friendly stack around them.
This repo combines:
- OpenAI-compatible proxies for Codex and Claude
- Qdrant + FastEmbed vector search for project memory / RAG
- OpenWebUI runtime patches for better tool / code-interpreter behavior
- A same-origin signed local file viewer for model-generated file references
- Utilities to keep long OpenWebUI chats responsive
If you want the convenience of OpenWebUI with the power of CLI-based coding agents, this is the setup.
- **Bring plan-based CLIs into OpenWebUI.** Codex CLI and Claude Code show up as normal OpenAI-style models, so you can use them from the OpenWebUI interface without switching to direct API billing.
- **Streaming that feels native.** Both proxies translate their native event streams into OpenAI-compatible streaming responses, including reasoning/thinking output when enabled.
- **Parallel Codex turns.** Codex requests are no longer forced through a single global queue, so multiple chats can make progress at the same time.
- **Runtime display controls.** Turn thinking, reasoning, tool headers, and tool bodies on or off without editing code.
- **Fast auxiliary chat tasks.** Keep the main answer on Claude or Codex while routing title generation, chat tags, and follow-up suggestions to a lighter model.
- **Signed local file viewer.** Model outputs can link to local workspace files through signed same-origin URLs, with:
  - line anchors
  - syntax highlighting
  - theme switching
  - syntax theme switching
  - markdown preview
  - copy button
- **Built for research workflows.** Vector DB storage, MCP access, OpenWebUI tools, and ingestion scripts make it easy to store and retrieve:
  - vulnerability notes
  - source code chunks
  - exploit/PoC snippets
  - bug bounty notes
- **OpenWebUI UX fixes.** Includes runtime patches and maintenance scripts for:
  - code interpreter truncation issues
  - overly heavy historical reasoning/tool artifacts
  - backup cleanup

Who is this for?

- People using Codex CLI or Claude Code as their primary coding/research interface
- Users who like OpenWebUI as a chat frontend
- Security researchers who want a lightweight local RAG layer for notes, code, exploits, and reports
- Anyone who wants local-file references in chat to open cleanly in-browser instead of breaking into 404s
```
┌────────────────────────────────────────────────────────────────────┐
│                        nginx / same-origin                         │
│   /               → OpenWebUI (:3000)                              │
│   /local-file/*   → local-file-viewer (:8301)                      │
│   /api/tools/*    → Tool API (:8100)                               │
│   /qdrant/*       → Qdrant (:6333)                                 │
└───────────────────────────────┬────────────────────────────────────┘
                                │
                ┌───────────────▼────────────────┐
                │        OpenWebUI (Docker)      │
                │  runtime-patched for local-file│
                │  routing + code-interpreter UX │
                └───────────────┬────────────────┘
                                │
              ┌─────────────────┴─────────────────┐
              │                                   │
      ┌───────▼────────┐                 ┌────────▼───────┐
      │  codex-proxy   │                 │  claude-proxy  │
      │     :8200      │                 │     :8201      │
      │  OpenAI-style  │                 │  OpenAI-style  │
      │   SSE bridge   │                 │   SSE bridge   │
      └───────┬────────┘                 └────────┬───────┘
              │                                   │
      ┌───────▼────────┐                 ┌────────▼───────┐
      │   Codex CLI    │                 │  Claude Code   │
      │   app-server   │                 │  stream-json   │
      └────────────────┘                 └────────────────┘

┌──────────────────┐              ┌──────────────────┐
│     Tool API     │─────────────►│      Qdrant      │
│  FastAPI + MCP   │              │  vector database │
│  + ingest/search │              │  + FastEmbed data│
└──────────────────┘              └──────────────────┘

┌──────────────────┐
│ local-file-viewer│
│   signed links   │
│ syntax highlight │
│ markdown preview │
└──────────────────┘
```
Prerequisites:

- Node.js 18+
- Docker / Docker Compose
- Codex CLI
- Claude Code
```bash
git clone https://github.com/sm1ee/openwebui-llm-proxy.git
cd openwebui-llm-proxy
cp .env.example .env
```

Minimum recommended `.env` values:
```bash
QDRANT_API_KEY=your-qdrant-api-key
OPENCLAW_WORKSPACE_DIR=/absolute/path/to/your/openclaw/workspace
LOCAL_FILE_URL_PREFIX=/absolute/path/to/your/openclaw/workspace
LOCAL_FILE_VIEWER_BASE_URL=https://ai.yourdomain.com
```

Useful optional task-routing values:
```bash
# Keep OpenWebUI chat UX helpers enabled
ENABLE_TITLE_GENERATION=true
ENABLE_TAGS_GENERATION=true
ENABLE_FOLLOW_UP_GENERATION=true

# Route title / tag / follow-up generation to a faster external model
CHAT_AUX_TASK_MODEL_EXTERNAL=gpt-5.4-mini
```

Start the stack:

```bash
docker compose up -d --build
```

The bundled compose file pins OpenWebUI to v0.8.12 instead of floating on main, so upgrades are explicit and reproducible.

This starts:

- open-webui
- tool-api
- qdrant
- local-file-viewer
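In compose terms, the version pin amounts to something like the fragment below. This is a sketch, not the bundled file: only the `open-webui` service is shown, and everything except the pinned image tag is illustrative.

```yaml
services:
  open-webui:
    # Pinned to a specific release instead of a floating tag,
    # so upgrades happen only when you change this line.
    image: ghcr.io/open-webui/open-webui:v0.8.12
    ports:
      - "3000:8080"
```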
Run both proxies:

```bash
./start.sh
```

Or individually:

```bash
node codex-proxy.mjs
node claude-proxy.mjs
```

In OpenWebUI, add OpenAI-compatible connections:

- Codex: `http://host.docker.internal:8200/v1`
- Claude: `http://host.docker.internal:8201/v1`
- API key: any placeholder string such as `dummy`
Both proxies expose:

| Endpoint | Method | Description |
|---|---|---|
| `/v1/models` | GET | Available models |
| `/v1/chat/completions` | POST | Streaming and non-streaming chat completions |
| `/health` | GET | Health check |
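As a quick smoke test, a minimal non-streaming request can be sketched with the standard library alone. The model id `"codex"` here is illustrative; list the real ids via `GET /v1/models` first.

```python
import json
from urllib import request

# Build an OpenAI-style chat completion request against the Codex proxy.
# Assumes the proxy is reachable on :8200; any placeholder API key works.
payload = {
    "model": "codex",  # illustrative id -- check /v1/models for real ones
    "stream": False,
    "messages": [{"role": "user", "content": "Say hello in one sentence."}],
}

req = request.Request(
    "http://host.docker.internal:8200/v1/chat/completions",
    data=json.dumps(payload).encode(),
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer dummy",  # placeholder key
    },
)

# Uncomment once the proxy is running:
# with request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```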
You can change what OpenWebUI sees without editing source code.

```bash
npm run config -- show
npm run config -- fast
npm run config -- balanced
npm run config -- verbose
```

Examples:

```bash
npm run config -- set codex.reasoning on
npm run config -- set codex.toolDisplay on
npm run config -- set codex.toolBodyDisplay off
npm run config -- set claude.thinking off
npm run config -- set claude.toolDisplay on
npm run config -- set claude.toolBodyDisplay off
```

Main toggles:

- `codex.reasoning`
- `codex.toolDisplay`
- `codex.toolBodyDisplay`
- `claude.thinking`
- `claude.toolDisplay`
- `claude.toolBodyDisplay`

This lets you choose between:

- cleaner/faster chat history
- verbose debugging visibility
- title-only tool traces
- full collapsible tool bodies
One of the nicest quality-of-life features in this repo.
When a model emits a local workspace file reference, the proxies can rewrite it into a signed same-origin link like:

```
https://ai.yourdomain.com/local-file/t/<signed-token>#L17
```

The viewer supports:

- line anchors like `#L17`
- syntax highlighting with multiple themes
- page theme switching
- markdown code/preview toggle
- one-click copy
- same-origin browser opening instead of dead `/Users/...` links

Supported link styles:

- absolute workspace paths such as `/absolute/path/to/workspace/project/report.md`
- eligible relative workspace references such as `project/report.md`, when they resolve uniquely inside the mounted workspace
Signed links expire after a configurable TTL, and the viewer can additionally allow direct authenticated browsing when enabled behind OpenWebUI.
Relevant settings:

| Variable | Description |
|---|---|
| `OPENCLAW_WORKSPACE_DIR` | Host workspace directory mounted read-only into the viewer |
| `LOCAL_FILE_URL_PREFIX` | Absolute path prefix that should be recognized as local file links |
| `LOCAL_FILE_VIEWER_BASE_URL` | Public base URL used when generating signed links |
| `LOCAL_FILE_REQUIRE_ADMIN` | Restrict direct browsing to admins |
| `LOCAL_FILE_MAX_INLINE_BYTES` | Max inline preview size |
| `LOCAL_FILE_LINK_TTL_SECONDS` | Signed-link expiration |
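The viewer's exact token format isn't documented here, but a signed, TTL-limited link of this shape can be sketched with a standard HMAC construction. Everything below (the secret, claim names, encoding, token layout) is illustrative, not the repo's actual scheme:

```python
import base64
import hashlib
import hmac
import json
import time

SECRET = b"change-me"  # illustrative; a real deployment needs a random key

def sign_link(path: str, ttl: int = 600) -> str:
    """Build a /local-file/t/<token> URL path for a workspace file."""
    claims = base64.urlsafe_b64encode(
        json.dumps({"p": path, "exp": int(time.time()) + ttl}).encode()
    ).decode()
    sig = hmac.new(SECRET, claims.encode(), hashlib.sha256).hexdigest()
    return f"/local-file/t/{claims}.{sig}"

def verify_link(token_path: str):
    """Return the workspace path if the token is authentic and unexpired."""
    token = token_path.rsplit("/", 1)[-1]
    claims_b64, _, sig = token.rpartition(".")
    expected = hmac.new(SECRET, claims_b64.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return None  # forged or corrupted token
    claims = json.loads(base64.urlsafe_b64decode(claims_b64))
    return claims["p"] if claims["exp"] > time.time() else None  # TTL check
```

The point of the sketch is the shape of the guarantee: the link encodes both the target path and an expiry, and neither can be altered without invalidating the signature.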
Qdrant collections are organized per project:

| Collection | Purpose |
|---|---|
| `{project}_vuln` | advisories, CVEs, findings |
| `{project}_code` | source code chunks |
| `{project}_exploit` | exploit and PoC data |
| `{project}_bugbounty` | bug bounty notes / reports |
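The naming convention above can be captured in a small helper. This is a hypothetical sketch of the convention, not code from the repo's scripts:

```python
# The four per-project data kinds listed in the table above.
VALID_KINDS = ("vuln", "code", "exploit", "bugbounty")

def collection_name(project: str, kind: str) -> str:
    """Map a project and data kind to its Qdrant collection name."""
    if kind not in VALID_KINDS:
        raise ValueError(f"unknown kind: {kind!r}")
    return f"{project}_{kind}"
```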
Useful scripts:

```bash
python3 scripts/init_qdrant.py --project myproject
python3 scripts/ingest_vuln.py --project myproject --dir /path/to/reports
python3 scripts/ingest_code.py --project myproject --path /path/to/src --lang python
```

Supported access modes:

- OpenWebUI custom tools via `openwebui-tools/security_research_tools.py`
- MCP from Codex / Claude via `tool-api/mcp_server.py`
- Direct CLI ingestion/search through `tool-api/vectordb_cli.py`
The MCP server supports both search and ingest flows.
Read tools:

- `list_projects`
- `search_vuln`
- `search_code`
- `search_exploit`
- `search_bugbounty`

Write tools:

- `ingest_document`
- `ingest_batch`
Collections are created automatically when needed.
OpenWebUI normally generates a few chat UX helpers after the main assistant reply:
- chat title
- chat tags
- follow-up suggestions
This repo exposes simple environment toggles for those helpers:
| Variable | Description |
|---|---|
| `ENABLE_TITLE_GENERATION` | Enable or disable automatic chat titles |
| `ENABLE_TAGS_GENERATION` | Enable or disable automatic tag generation |
| `ENABLE_FOLLOW_UP_GENERATION` | Enable or disable follow-up suggestions |
| `CHAT_AUX_TASK_MODEL` | Preferred lightweight local model for those helpers |
| `CHAT_AUX_TASK_MODEL_EXTERNAL` | Preferred lightweight external model for those helpers |
Recommended pattern:

- keep your main chat on a stronger model such as Claude Opus
- send title / tag / follow-up generation to a cheaper, faster model such as `gpt-5.4-mini`
This repo includes runtime patches to make OpenWebUI behave better for coding-agent usage.
Current patch areas include:

- better handling of code-interpreter output
- preserving assistant text after code execution blocks
- same-origin `/local-file/*` pass-through routing
- routing title / tag / follow-up generation to a separate model

Files:

- `open-webui-runtime-patches/main.py`
- `open-webui-runtime-patches/middleware.py`
Long reasoning and tool-heavy chats can make OpenWebUI sluggish. This repo includes helper scripts to keep things usable.
Included utilities:

- `scripts/compact_openwebui_reasoning.py`: compacts old reasoning / tool / code-interpreter artifacts from chat history
- `scripts/cleanup_openwebui_backups.sh`: prunes old backup DB files
Good companion workflow:
- keep recent turns verbose
- compact older turns
- keep signed file links for evidence and reports
The repo includes a reverse-proxy template with:

- OpenWebUI routing
- `/local-file/*` viewer routing
- `/api/tools/*` routing
- `/qdrant/*` routing
- optional geo filtering

Template: `nginx/conf.d/ai-server.conf`
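The routing can be pictured as a handful of `location` blocks. This is a minimal sketch, not the bundled template: the upstream ports come from the architecture diagram above, while the server name, TLS setup, and everything else is illustrative.

```nginx
server {
    listen 443 ssl;
    server_name ai.yourdomain.com;  # illustrative hostname

    location / {
        proxy_pass http://127.0.0.1:3000;          # OpenWebUI
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;    # websocket support
        proxy_set_header Connection "upgrade";
    }
    location /local-file/ { proxy_pass http://127.0.0.1:8301; }   # viewer
    location /api/tools/  { proxy_pass http://127.0.0.1:8100; }   # Tool API
    location /qdrant/     { proxy_pass http://127.0.0.1:6333/; }  # Qdrant
}
```

Keeping everything under one origin is what lets the signed `/local-file/*` links and tool calls work without cross-origin headaches.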
Template plists are included for:
- Codex proxy
- Claude proxy
- backup cleanup
See `launchd/README.md`.
The Codex proxy (`codex-proxy.mjs`):

- translates Codex `app-server` notifications into OpenAI-style streaming chunks
- supports reasoning output in OpenWebUI
- supports parallel turns instead of forcing a single serialized queue
- rewrites eligible local-file markdown links into signed viewer links
- can hide tool bodies while still showing compact tool headers

The Claude proxy (`claude-proxy.mjs`):

- translates `claude -p --output-format stream-json` into OpenAI-style SSE
- supports thinking output in OpenWebUI
- keeps long-running streams alive
- rewrites eligible local-file markdown links into signed viewer links
- supports signed links for both absolute workspace paths and uniquely resolvable relative file references
- can hide tool bodies while still showing compact tool headers
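The SSE bridging both proxies perform can be illustrated with a minimal translator. The native event shapes of Codex and Claude are not shown in this README, so the function below only demonstrates the OpenAI-side framing; treat the input as an already-extracted text delta:

```python
import json

def to_openai_chunk(model: str, text: str, done: bool = False) -> str:
    """Wrap a native text delta as an OpenAI-style SSE data line."""
    if done:
        # The OpenAI streaming protocol terminates with a literal [DONE].
        return "data: [DONE]\n\n"
    chunk = {
        "object": "chat.completion.chunk",
        "model": model,
        "choices": [
            {"index": 0, "delta": {"content": text}, "finish_reason": None}
        ],
    }
    return f"data: {json.dumps(chunk)}\n\n"
```

A real bridge additionally maps tool calls, reasoning/thinking blocks, and finish reasons, and emits keep-alives for long turns, as the bullet lists above describe.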
Key files:

- `codex-proxy.mjs`
- `claude-proxy.mjs`
- `config.json`
- `configure.mjs`
- `docker-compose.yml`
- `local-file-viewer/main.py`
- `tool-api/mcp_server.py`
- `open-webui-runtime-patches/main.py`
- `open-webui-runtime-patches/middleware.py`
If you like OpenWebUI but do most of your serious work in Codex CLI or Claude Code, this repo gives you:
- a better frontend
- a shared research memory layer
- cleaner tool/reasoning controls
- clickable local evidence links
- less UI pain when chats get large
It turns OpenWebUI from “generic chat frontend” into something much closer to a practical coding and security-research workspace.
MIT