Qwen-Proxy is a proxy service that converts https://chat.qwen.ai and Qwen Code / Qwen Cli into an OpenAI-compatible API. With this project, a single account is enough to call the models from https://chat.qwen.ai and Qwen Code / Qwen Cli from any OpenAI API-compatible client (such as ChatGPT-Next-Web or LobeChat). Models under the `/cli` endpoint are provided by Qwen Code / Qwen Cli and support a 256K context and the native `tools` parameter.
Key Features:
- Compatible with OpenAI API format, seamlessly integrating with various clients
- Compatible with Anthropic Messages API (`/v1/messages`), supporting Claude Code, Anthropic SDK, and other clients
- Supports Function Calling (OpenAI `tools` / Anthropic `tools`), including incremental chunking of streamed `arguments` and a strict-validation retry for `tool_choice=required`
- Supports multi-account polling for higher availability
- Supports streaming and non-streaming responses
- Supports multimodal (image recognition, video understanding, image/video generation)
- Supports OpenAI-style resource endpoints: `/v1/images/generations`, `/v1/images/edits`, `/v1/videos`
- Supports intelligent search, deep thinking, and other advanced features
- Supports CLI endpoint with 256K context and tool calling capabilities
- Provides a Web management interface for easy configuration and monitoring
- Bulk account addition supports real-time progress display; login concurrency can be adjusted in system settings
Important:
`chat.qwen.ai` rate-limits each IP address. As far as is currently known, the limit depends only on the IP and is not tied to cookies.
Solution:
For high concurrency usage, it is recommended to use a proxy pool for IP rotation:
| Solution | Configuration | Description |
|---|---|---|
| Option 1 | `PROXY_URL` + ProxyFlow | Configure proxy address directly; all requests rotate IPs through the proxy pool |
| Option 2 | `QWEN_CHAT_PROXY_URL` + UrlProxy + ProxyFlow | Achieve more flexible IP rotation through reverse proxy + proxy pool combination |
Configuration Examples:
# Option 1: Use proxy pool directly
PROXY_URL=http://127.0.0.1:8282 # ProxyFlow proxy address
# Option 2: Reverse proxy + proxy pool combination
QWEN_CHAT_PROXY_URL=http://127.0.0.1:8000/qwen # UrlProxy reverse proxy address (UrlProxy configured with HTTP_PROXY pointing to ProxyFlow)

- Node.js 18+ (required for source code deployment)
- Docker (optional)
- Redis (optional, for data persistence)
Create a .env file and configure the following parameters:
# 🌐 Service Configuration
LISTEN_ADDRESS=localhost # Listen address
SERVICE_PORT=3000 # Service port
# 🔐 Security Configuration
API_KEY=sk-123456,sk-456789 # API keys (required, supports multiple keys)
ACCOUNTS= # Account configuration (format: user1:pass1,user2:pass2)
# 🚀 PM2 Multi-Process Configuration
PM2_INSTANCES=1 # PM2 instance count (1/number/max)
PM2_MAX_MEMORY=1G # PM2 memory limit (100M/1G/2G etc.)
# Note: In PM2 cluster mode, all processes share the same port
# 🔍 Feature Configuration
SEARCH_INFO_MODE=table # Search info display mode (table/text)
OUTPUT_THINK=true # Whether to output thinking process (true/false)
SIMPLE_MODEL_MAP=false # Simplified model mapping (true/false)
# 🌐 Proxy & Reverse Proxy Configuration
QWEN_CHAT_PROXY_URL= # Custom Chat API reverse proxy URL (default: https://chat.qwen.ai)
QWEN_CLI_PROXY_URL= # Custom CLI API reverse proxy URL (default: https://portal.qwen.ai)
PROXY_URL= # HTTP/HTTPS/SOCKS5 proxy address (e.g., http://127.0.0.1:7890)
# 🗄️ Data Storage
DATA_SAVE_MODE=none # Data persistence mode (none/file/redis)
REDIS_URL= # Redis connection URL (optional, use rediss:// for TLS)
BATCH_LOGIN_CONCURRENCY=5 # Login concurrency for bulk account addition
# 📸 Cache Configuration
CACHE_MODE=default # Image cache mode (default/file)

| Parameter | Description | Example |
|---|---|---|
| `LISTEN_ADDRESS` | Service listen address | `localhost` or `0.0.0.0` |
| `SERVICE_PORT` | Service port | `3000` |
| `API_KEY` | API access keys, supports multiple keys. The first key is the admin key (can access the frontend management page); others are regular keys (API calls only). Separate multiple keys with commas. | `sk-admin123,sk-user456,sk-user789` |
| `PM2_INSTANCES` | PM2 instance count | `1`/`4`/`max` |
| `PM2_MAX_MEMORY` | PM2 memory limit | `100M`/`1G`/`2G` |
| `SEARCH_INFO_MODE` | Search result display format | `table` or `text` |
| `OUTPUT_THINK` | Whether to show AI thinking process | `true` or `false` |
| `SIMPLE_MODEL_MAP` | Simplified model mapping, returns only base models without variants | `true` or `false` |
| `QWEN_CHAT_PROXY_URL` | Custom Chat API reverse proxy address | `https://your-proxy.com` |
| `QWEN_CLI_PROXY_URL` | Custom CLI API reverse proxy address | `https://your-cli-proxy.com` |
| `PROXY_URL` | Outbound request proxy, supports HTTP/HTTPS/SOCKS5 | `http://127.0.0.1:7890` |
| `DATA_SAVE_MODE` | Data persistence method | `none`/`file`/`redis` |
| `REDIS_URL` | Redis database connection URL; use `rediss://` protocol for TLS | `redis://localhost:6379` or `rediss://xxx.upstash.io` |
| `BATCH_LOGIN_CONCURRENCY` | Login concurrency for bulk account addition; adjustable in the frontend system settings | `5` |
| `CACHE_MODE` | Image cache storage method | `default`/`file` |
| `LOG_LEVEL` | Log level | `DEBUG`/`INFO`/`WARN`/`ERROR` |
| `ENABLE_FILE_LOG` | Whether to enable file logging | `true` or `false` |
| `LOG_DIR` | Log file directory | `./logs` |
| `MAX_LOG_FILE_SIZE` | Maximum log file size (MB) | `10` |
| `MAX_LOG_FILES` | Number of log files to retain | `5` |
💡 Tip: You can create a free Redis instance at Upstash. When using TLS, the address format is `rediss://...`
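The `.env` layout shown above (one `KEY=VALUE` per line, with optional trailing `#` comments) can be read by external tooling with a few lines of Python. This is a minimal sketch rather than the project's own loader: `parse_env` is a hypothetical helper, and it assumes that values never contain the sequence ` #`.

```python
def parse_env(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blank lines and full-line comments."""
    result = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop a trailing inline comment such as "3000 # Service port".
        value = value.split(" #", 1)[0].strip()
        result[key.strip()] = value
    return result
```

For example, `parse_env("SERVICE_PORT=3000 # Service port")` yields `{"SERVICE_PORT": "3000"}`, and an empty assignment such as `ACCOUNTS=` maps to an empty string.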
The API_KEY environment variable supports multiple API keys for different permission levels:
Configuration Format:
# Single key (admin permission)
API_KEY=sk-admin123
# Multiple keys (first is admin, others are regular users)
API_KEY=sk-admin123,sk-user456,sk-user789

Permission Levels:
| Key Type | Permission Scope | Features |
|---|---|---|
| Admin Key | Full permissions | • Access frontend management page • Modify system settings • Call all API endpoints • Add/delete regular keys |
| Regular Key | API call permissions | • API calls only • Cannot access frontend management page • Cannot modify system settings |
Use Cases:
- Team collaboration: Assign different permission-level API keys to team members
- Application integration: Provide restricted API access to third-party applications
- Security isolation: Separate admin permissions from regular usage permissions
Notes:
- The first API_KEY automatically becomes the admin key with the highest permissions
- Admins can dynamically add or delete regular keys via the frontend
- All keys can call API endpoints normally; permission differences only apply to management features
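The "first key is admin" rule can be illustrated with a short Python sketch. It only mirrors the documented behavior of the `API_KEY` value; `split_api_keys` and `is_admin` are hypothetical names, not functions from this project.

```python
def split_api_keys(raw: str):
    """Return (admin_key, regular_keys) from a comma-separated API_KEY value."""
    keys = [key.strip() for key in raw.split(",") if key.strip()]
    if not keys:
        raise ValueError("API_KEY must contain at least one key")
    # The first key is the admin key; every other key is a regular key.
    return keys[0], keys[1:]


def is_admin(key: str, raw: str) -> bool:
    """True when `key` is the first (admin) entry of the API_KEY value."""
    admin_key, _ = split_api_keys(raw)
    return key == admin_key
```

With `API_KEY=sk-admin123,sk-user456,sk-user789`, only `sk-admin123` is treated as the admin key.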
The CACHE_MODE environment variable controls how image caches are stored to optimize image upload and processing performance:
| Mode | Description | Use Case |
|---|---|---|
| `default` | In-memory cache (default) | Single-process deployment; cache is lost on restart |
| `file` | File cache mode | Multi-process deployment; cache persists to `./caches/` directory |
Recommended Configuration:
- Single-process deployment: Use `CACHE_MODE=default` for best performance
- Multi-process/cluster deployment: Use `CACHE_MODE=file` to ensure cross-process cache sharing
- Docker deployment: Recommend `CACHE_MODE=file` with `./caches` directory mounted
File Cache Directory Structure:
caches/
├── [signature1].txt # Cache file containing image URL
├── [signature2].txt
└── ...
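The layout above (one `.txt` file per signature, holding the image URL) could be reproduced as follows. This is an illustrative sketch only: how the project computes a signature is not described here, so a SHA-256 digest of the image bytes is used as an assumption, and `cache_put` / `cache_get` are hypothetical names.

```python
import hashlib
from pathlib import Path
from typing import Optional


def signature(image_bytes: bytes) -> str:
    # Assumption: the cache key is a SHA-256 digest of the image content.
    return hashlib.sha256(image_bytes).hexdigest()


def cache_put(cache_dir: Path, image_bytes: bytes, url: str) -> Path:
    """Store the uploaded image URL in caches/[signature].txt."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = cache_dir / f"{signature(image_bytes)}.txt"
    path.write_text(url, encoding="utf-8")
    return path


def cache_get(cache_dir: Path, image_bytes: bytes) -> Optional[str]:
    """Return the cached URL for this image, or None on a cache miss."""
    path = cache_dir / f"{signature(image_bytes)}.txt"
    return path.read_text(encoding="utf-8") if path.exists() else None
```

Because the cache lives on disk, any process that sees the same directory can reuse an earlier upload, which is why the file mode suits multi-process deployments.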
docker run -d \
-p 3000:3000 \
-e API_KEY=sk-admin123,sk-user456,sk-user789 \
-e DATA_SAVE_MODE=none \
-e CACHE_MODE=file \
-e ACCOUNTS= \
-v ./caches:/app/caches \
--name qwen2api \
rfym21/qwen2api:latest

# Download configuration file
curl -o docker-compose.yml https://raw.githubusercontent.com/Rfym21/Qwen2API/refs/heads/main/docker/docker-compose.yml
# Start service
docker compose pull && docker compose up -d

# Clone project
git clone https://github.com/Rfym21/Qwen2API.git
cd Qwen2API
# Install dependencies
npm install
# Configure environment variables
cp .env.example .env
# Edit the .env file
# Smart start (recommended - automatically determines single/multi-process)
npm start
# Development mode
npm run dev

Use PM2 for production multi-process deployment for better performance and stability.
Important: In PM2 cluster mode, all processes share the same port and PM2 automatically load balances.
Using npm start automatically determines the startup mode:
- When `PM2_INSTANCES=1`, uses single-process mode
- When `PM2_INSTANCES>1`, uses Node.js cluster mode
- Automatically limits process count to not exceed CPU core count
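The selection rules above can be sketched as a small decision function. This is an illustration of the documented behavior, not the code in `src/start.js`; `resolve_instances` and `startup_mode` are hypothetical names, and treating `max` as "one process per CPU core" is an assumption.

```python
def resolve_instances(pm2_instances: str, cpu_count: int) -> int:
    """Turn a PM2_INSTANCES value (1 / number / max) into a process count."""
    value = pm2_instances.strip().lower()
    requested = cpu_count if value == "max" else int(value)
    # Never start fewer than one process or more processes than CPU cores.
    return max(1, min(requested, cpu_count))


def startup_mode(pm2_instances: str, cpu_count: int) -> str:
    """Return "single" for one process, "cluster" otherwise."""
    return "single" if resolve_instances(pm2_instances, cpu_count) == 1 else "cluster"
```

On an 8-core machine, `PM2_INSTANCES=16` would be capped to 8 processes under these rules.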
Quick deployment to Hugging Face Spaces:
Qwen2API/
├── README.md
├── ecosystem.config.js # PM2 configuration file
├── package.json
│
├── docker/ # Docker configuration directory
│ ├── Dockerfile
│ ├── docker-compose.yml
│ └── docker-compose-redis.yml
│
├── caches/ # Cache file directory
├── data/ # Data file directory
│ ├── data.json
│ └── data_template.json
├── scripts/ # Scripts directory
│ └── fingerprint-injector.js # Browser fingerprint injection script
│
├── src/ # Backend source code directory
│ ├── server.js # Main server file
│ ├── start.js # Smart start script (auto single/multi-process)
│ ├── config/
│ │ └── index.js # Configuration file
│ ├── controllers/ # Controllers directory
│ │ ├── chat.js # Chat controller
│ │ ├── chat.image.video.js # Image/video generation controller
│ │ ├── cli.chat.js # CLI chat controller
│ │ └── models.js # Models controller
│ ├── middlewares/ # Middlewares directory
│ │ ├── authorization.js # Authorization middleware
│ │ └── chat-middleware.js # Chat middleware
│ ├── models/ # Models directory
│ │ └── models-map.js # Model mapping configuration
│ ├── routes/ # Routes directory
│ │ ├── accounts.js # Accounts route
│ │ ├── chat.js # Chat route
│ │ ├── cli.chat.js # CLI chat route
│ │ ├── models.js # Models route
│ │ ├── settings.js # Settings route
│ │ └── verify.js # Verification route
│ └── utils/ # Utility functions directory
│ ├── account-rotator.js # Account rotator
│ ├── account.js # Account management
│ ├── chat-helpers.js # Chat helper functions
│ ├── cli.manager.js # CLI manager
│ ├── cookie-generator.js # Cookie generator
│ ├── data-persistence.js # Data persistence
│ ├── fingerprint.js # Browser fingerprint generation
│ ├── img-caches.js # Image cache
│ ├── logger.js # Logger utility
│ ├── precise-tokenizer.js # Precise tokenizer
│ ├── proxy-helper.js # Proxy helper functions
│ ├── redis.js # Redis connection
│ ├── request.js # HTTP request wrapper
│ ├── setting.js # Settings management
│ ├── ssxmod-manager.js # ssxmod parameter manager
│ ├── token-manager.js # Token manager
│ ├── tools.js # Tool call handler
│ └── upload.js # File upload
│
└── public/ # Frontend project directory
├── dist/ # Compiled frontend files
│ ├── assets/ # Static assets
│ ├── favicon.png
│ └── index.html
├── src/ # Frontend source code
│ ├── App.vue # Main application component
│ ├── main.js # Entry file
│ ├── style.css # Global styles
│ ├── assets/ # Static assets
│ │ └── background.mp4
│ ├── routes/ # Route configuration
│ │ └── index.js
│ └── views/ # Page components
│ ├── auth.vue # Authentication page
│ ├── dashboard.vue # Dashboard page
│ └── settings.vue # Settings page
├── package.json # Frontend dependency configuration
├── package-lock.json
├── index.html # Frontend entry HTML
├── postcss.config.js # PostCSS configuration
├── tailwind.config.js # TailwindCSS configuration
├── vite.config.js # Vite build configuration
└── public/ # Public static assets
└── favicon.png
This API supports multi-key authentication. All API requests must include a valid API key in the request header:
Authorization: Bearer sk-your-api-key

Supported Key Types:
- Admin Key: The first configured API_KEY with full permissions
- Regular Key: Other configured API_KEYs for API calls only
Authentication Examples:
# Using admin key
curl -H "Authorization: Bearer sk-admin123" http://localhost:3000/v1/models
# Using regular key
curl -H "Authorization: Bearer sk-user456" http://localhost:3000/v1/chat/completions

Get all available AI models.
GET /v1/models
Authorization: Bearer sk-your-api-key

GET /models (no authentication required)

Notes:
- `id`: Recommended to use directly as the `model` field in requests; shows more readable model names first
- `name`: Upstream original model ID, for comparison with official interfaces or logs
- `upstream_id`: Upstream model ID without capability suffixes
- `display_name`: Display name without capability suffixes
- When `SIMPLE_MODEL_MAP=false`, additional capability variants are returned: `-thinking`, `-search`, `-image`, `-video`, `-image-edit`
Response Example:
{
"object": "list",
"data": [
{
"id": "Qwen3-Omni-Flash-image",
"name": "qwen3-omni-flash-2025-12-01-image",
"upstream_id": "qwen3-omni-flash-2025-12-01",
"display_name": "Qwen3-Omni-Flash",
"object": "model",
"created": 1677610602,
"owned_by": "qwen"
}
]
}

Send chat messages and receive AI replies.
POST /v1/chat/completions
Content-Type: application/json
Authorization: Bearer sk-your-api-key

Request Body:
{
"model": "Qwen3.6-Plus",
"messages": [
{
"role": "system",
"content": "You are a helpful assistant."
},
{
"role": "user",
"content": "Hello, please introduce yourself."
}
],
"stream": false,
"temperature": 0.7,
"max_tokens": 2000
}

Response Example:
{
"id": "chatcmpl-123",
"object": "chat.completion",
"created": 1677652288,
"model": "qwen3.6-plus",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I am an AI assistant..."
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 20,
"completion_tokens": 50,
"total_tokens": 70
}
}

`/v1/chat/completions` supports the full OpenAI Function Calling protocol. Even if the upstream web interface does not natively support tools, this service achieves OpenAI API-consistent behavior through prompt injection and streaming state machine parsing:
- Automatically compresses `tools[]` into TypeScript-style signatures injected into the prompt, saving approximately 70% token usage
- Streaming output follows OpenAI spec sharding: first emits a `function.name + empty arguments` header block, followed by multiple `arguments` chunks
- Historical messages with `assistant.tool_calls` and `role:"tool"` are automatically folded back into the chain, with `tool_call_id` precisely linked
- Full `tool_choice` support: `"auto"` / `"required"` / `{type:"function",function:{name:"..."}}` / `"none"`
- When `tool_choice="required"` or a specific function is specified, if the first attempt does not trigger a tool call, a strong constraint prompt is appended for one automatic retry
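To give a feel for what "TypeScript-style signatures" means in the first bullet, here is one plausible shape of such a compression. The exact prompt format used by this service is not documented here, so treat `tool_to_signature` as a hypothetical illustration; it handles only flat schemas with scalar property types and does not reproduce the quoted token savings.

```python
# JSON Schema scalar types mapped to TypeScript types (assumed subset).
TS_TYPES = {"string": "string", "number": "number", "integer": "number", "boolean": "boolean"}


def tool_to_signature(tool: dict) -> str:
    """Render one OpenAI-style tool definition as a compact signature line."""
    fn = tool["function"]
    schema = fn.get("parameters", {})
    required = set(schema.get("required", []))
    params = []
    for name, spec in schema.get("properties", {}).items():
        json_type = spec.get("type")
        ts_type = TS_TYPES.get(json_type, "any") if isinstance(json_type, str) else "any"
        optional = "" if name in required else "?"
        params.append(f"{name}{optional}: {ts_type}")
    signature = f"{fn['name']}({', '.join(params)})"
    description = fn.get("description")
    return f"// {description}\n{signature}" if description else signature
```

A definition with a single required string parameter `city` collapses to a comment line plus `get_weather(city: string)`, which is far shorter than the full JSON Schema.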
Request Example:
{
"model": "qwen3-coder-plus",
"stream": true,
"messages": [
{"role": "user", "content": "Check the weather in Beijing"}
],
"tools": [
{
"type": "function",
"function": {
"name": "get_weather",
"description": "Get city weather",
"parameters": {
"type": "object",
"properties": { "city": { "type": "string" } },
"required": ["city"]
}
}
}
],
"tool_choice": "required"
}

Streaming Response (excerpt):
data: {"choices":[{"delta":{"tool_calls":[{"index":0,"id":"call_xxx","type":"function","function":{"name":"get_weather","arguments":""}}]}}]}
data: {"choices":[{"delta":{"tool_calls":[{"index":0,"function":{"arguments":"{\"city\":\"Beijing\"}"}}]}}]}
data: {"choices":[{"delta":{},"finish_reason":"tool_calls"}]}
data: [DONE]
Clients that follow the OpenAI tool protocol — such as OpenAI SDK, LangChain, Cline, and Continue — can connect directly.
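SDKs normally reassemble streamed tool calls for you. For a hand-written client, the following Python sketch shows how the chunks in the excerpt above can be merged: the header chunk supplies `id` and `name`, and later chunks append to `arguments`. `collect_tool_calls` is a hypothetical helper that expects already-decoded `data:` lines.

```python
import json


def collect_tool_calls(sse_lines):
    """Merge streamed tool_call deltas into complete calls, keyed by index."""
    calls = {}
    for line in sse_lines:
        if not line.startswith("data: "):
            continue
        payload = line[len("data: "):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            for delta in choice.get("delta", {}).get("tool_calls", []):
                call = calls.setdefault(delta["index"], {"id": None, "name": None, "arguments": ""})
                call["id"] = delta.get("id") or call["id"]
                fn = delta.get("function", {})
                call["name"] = fn.get("name") or call["name"]
                # Argument fragments are concatenated and parsed only at the end.
                call["arguments"] += fn.get("arguments") or ""
    return [
        {"id": c["id"], "name": c["name"], "arguments": json.loads(c["arguments"] or "{}")}
        for _, c in sorted(calls.items())
    ]
```

Parsing the accumulated string only once the stream ends matters because an individual fragment is usually not valid JSON on its own.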
Compatible with Anthropic's /v1/messages endpoint. Supports direct integration with Claude Code, Anthropic SDK, aider, and other clients.
POST /v1/messages
Content-Type: application/json
Authorization: Bearer sk-your-api-key

Supported fields:
| Field | Description |
|---|---|
| `model` | Any Qwen model name (e.g., `qwen3-coder-plus`) |
| `system` | String or array of `{type:"text"}` blocks |
| `messages[].content` | String, text blocks, image blocks, `tool_use` blocks, `tool_result` blocks |
| `tools[]` | Anthropic-style `{name,input_schema,description}` |
| `tool_choice` | `{type:"auto"}` / `{type:"any"}` (must call) / `{type:"tool",name:"..."}` / `{type:"none"}` |
| `thinking` | `{type:"enabled",budget_tokens:N}` to enable thinking mode |
| `stream` | Streaming SSE output |
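When the same tool definitions have to be sent to both `/v1/chat/completions` and `/v1/messages`, the two schemas differ mainly in nesting. The following Python sketch converts the OpenAI shape to the Anthropic shape listed in the table; both function names are hypothetical, and the `"required"` to `{type:"any"}` mapping is an assumption based on the "must call" note above.

```python
def openai_tool_to_anthropic(tool: dict) -> dict:
    """Convert {type:"function", function:{...}} to {name, input_schema, description}."""
    fn = tool["function"]
    converted = {
        "name": fn["name"],
        "input_schema": fn.get("parameters", {"type": "object", "properties": {}}),
    }
    if "description" in fn:
        converted["description"] = fn["description"]
    return converted


def openai_tool_choice_to_anthropic(choice):
    """Map an OpenAI tool_choice value to its Anthropic counterpart."""
    if isinstance(choice, dict):
        # A specific function: {type:"function", function:{name}} -> {type:"tool", name}
        return {"type": "tool", "name": choice["function"]["name"]}
    mapping = {"auto": {"type": "auto"}, "required": {"type": "any"}, "none": {"type": "none"}}
    return mapping[choice]
```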
Request Example (with tool calling):
{
"model": "qwen3-coder-plus",
"max_tokens": 1024,
"messages": [
{"role": "user", "content": "Check the weather in Guangzhou"}
],
"tools": [
{
"name": "get_weather",
"input_schema": {
"type": "object",
"properties": { "city": { "type": "string" } },
"required": ["city"]
}
}
],
"tool_choice": { "type": "any" }
}

Non-streaming Response:
{
"id": "msg_xxx",
"type": "message",
"role": "assistant",
"model": "qwen3-coder-plus",
"content": [
{
"type": "tool_use",
"id": "call_xxx",
"name": "get_weather",
"input": { "city": "Guangzhou" }
}
],
"stop_reason": "tool_use",
"stop_sequence": null,
"usage": { "input_tokens": 233, "output_tokens": 25 }
}

Streaming SSE Event Sequence:
event: message_start
data: {"type":"message_start","message":{...}}
event: content_block_start
data: {"type":"content_block_start","index":0,"content_block":{"type":"tool_use","id":"call_xxx","name":"get_weather","input":{}}}
event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"input_json_delta","partial_json":"{\"city\":\"Guangzhou\"}"}}
event: content_block_stop
data: {"type":"content_block_stop","index":0}
event: message_delta
data: {"type":"message_delta","delta":{"stop_reason":"tool_use","stop_sequence":null},"usage":{"input_tokens":234,"output_tokens":25}}
event: message_stop
data: {"type":"message_stop"}
Two invocation methods are supported:
- Use `/v1/chat/completions` + model suffix: `-image`, `-image-edit`, `-video`
- Use OpenAI-style resource endpoints: `/v1/images/generations`, `/v1/images/edits`, `/v1/videos`
Use the id field from /v1/models as the model name in the examples below.
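A client can derive the suffixed model name from a base `id` instead of hard-coding it. The helper below is a minimal sketch with a hypothetical name (`media_chat_payload`); it only assembles the request body for `/v1/chat/completions` and assumes the base model actually has the requested variant.

```python
CAPABILITY_SUFFIXES = ("-image", "-image-edit", "-video")


def media_chat_payload(base_model: str, capability: str, prompt: str, size=None) -> dict:
    """Build a chat-completions body for image, image-edit, or video generation."""
    suffix = f"-{capability}"
    if suffix not in CAPABILITY_SUFFIXES:
        raise ValueError(f"unsupported capability: {capability}")
    payload = {
        "model": base_model + suffix,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    if size is not None:
        payload["size"] = size  # aspect ratio such as "1:1" or "9:16"
    return payload
```

Calling `media_chat_payload("Qwen3-Omni-Flash", "video", "...", size="9:16")` produces a body with `"model": "Qwen3-Omni-Flash-video"`.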
Text-to-image:
{
"model": "Qwen3-Omni-Flash-image",
"messages": [
{
"role": "user",
"content": "Draw a kitten playing in a garden, cartoon style"
}
],
"size": "1:1",
"stream": false
}

Image editing:
{
"model": "Qwen3-Omni-Flash-image-edit",
"messages": [
{
"role": "user",
"content": [
{
"type": "text",
"text": "Change this image into a light blue tech-style poster"
},
{
"type": "image_url",
"image_url": {
"url": "data:image/png;base64,..."
}
}
]
}
],
"stream": false
}

Video generation:
{
"model": "Qwen3-Omni-Flash-video",
"messages": [
{
"role": "user",
"content": "Generate a 3-second night time-lapse video with city street neon lights flickering"
}
],
"size": "9:16",
"stream": false
}

Supported Size Parameters:
- Image/video generation under `/v1/chat/completions` supports: `1:1`, `4:3`, `3:4`, `16:9`, `9:16`
- `/v1/images/generations`, `/v1/images/edits`, `/v1/videos` support: `1024x1024`, `1536x1024`, `1024x1536`, `1792x1024`, `1024x1792`
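Because the two endpoint families accept different `size` vocabularies, a client-side check can catch a mismatch before a request is sent. This sketch simply encodes the two lists above; `is_valid_size` is a hypothetical helper, and how the server reacts to an unsupported value is not specified here.

```python
CHAT_SIZES = {"1:1", "4:3", "3:4", "16:9", "9:16"}
RESOURCE_SIZES = {"1024x1024", "1536x1024", "1024x1536", "1792x1024", "1024x1792"}
RESOURCE_ENDPOINTS = {"/v1/images/generations", "/v1/images/edits", "/v1/videos"}


def is_valid_size(endpoint: str, size: str) -> bool:
    """Check a size value against the list documented for the endpoint."""
    if endpoint == "/v1/chat/completions":
        return size in CHAT_SIZES          # aspect ratios
    if endpoint in RESOURCE_ENDPOINTS:
        return size in RESOURCE_SIZES      # pixel dimensions
    raise ValueError(f"unknown endpoint: {endpoint}")
```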
Image generation:
POST /v1/images/generations
Content-Type: application/json
Authorization: Bearer sk-your-api-key

{
"model": "Qwen3-Omni-Flash",
"prompt": "An orange cat sitting on a wooden table looking at the camera, realistic style",
"size": "1024x1024",
"response_format": "url"
}

Image editing:
POST /v1/images/edits
Content-Type: multipart/form-data
Authorization: Bearer sk-your-api-key

Form fields:
- `model`: Optional; if not provided, the default image-editing model is selected automatically
- `prompt`: Optional; defaults to `Please complete editing based on the uploaded image`
- `image`: Required; supports multipart file upload as well as JSON string image URL / data URI
- `size`: Optional; supports OpenAI-style size format
- `response_format`: Optional; supports `url`, `b64_json`
Video generation:
POST /v1/videos
Content-Type: application/json
Authorization: Bearer sk-your-api-key

{
"model": "Qwen3-Omni-Flash",
"prompt": "A short 3-second night time-lapse video with city street neon lights flickering",
"size": "1024x1792"
}

Image generation response example:
{
"created": 1776126402,
"data": [
{
"url": "https://cdn.qwenlm.ai/output/example/generated-image.png"
}
]
}

Video generation response example:
{
"id": "video_1776126509490",
"object": "video",
"created": 1776126509,
"model": "qwen3-omni-flash-2025-12-01",
"status": "completed",
"data": [
{
"url": "https://cdn.qwenlm.ai/output/example/generated-video.mp4"
}
]
}

Add the `-search` suffix to the model name to enable search:
{
"model": "Qwen3.6-Plus-search",
"messages": [...]
}

Add the `-thinking` suffix to the model name to enable thinking process output:
{
"model": "Qwen3.6-Plus-thinking",
"messages": [...]
}

Enable both search and thinking simultaneously:
{
"model": "Qwen3.6-Plus-thinking-search",
"messages": [...]
}

The API automatically handles image and video uploads, supporting image/video URLs or Base64 data URIs in conversations.
Image understanding example:
{
"model": "Qwen3.5-Omni-Plus",
"messages": [
{
"role": "user",
"content": [
{
"type": "text",
"text": "What is in this image?"
},
{
"type": "image_url",
"image_url": {
"url": "data:image/jpeg;base64,..."
}
}
]
}
]
}

Video understanding example:
{
"model": "Qwen3.5-Omni-Plus",
"messages": [
{
"role": "user",
"content": [
{
"type": "text",
"text": "Please describe this video in one sentence"
},
{
"type": "input_video",
"input_video": {
"url": "data:video/mp4;base64,..."
}
}
]
}
]
}

Supported video fields:
- `input_video`
- `video_url`
- `video`
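A small builder can keep the content-part shape consistent across the three field names. Only the `input_video` shape appears in the example above; this sketch assumes that `video_url` and `video` use the same `{"url": ...}` nesting, and `video_message` is a hypothetical helper.

```python
VIDEO_FIELDS = ("input_video", "video_url", "video")


def video_message(text: str, video_url: str, field: str = "input_video") -> dict:
    """Build a user message with one text part and one video part."""
    if field not in VIDEO_FIELDS:
        raise ValueError(f"unsupported video field: {field}")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": text},
            # Assumption: every supported field nests the URL the same way.
            {"type": field, field: {"url": video_url}},
        ],
    }
```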
The CLI endpoint uses OAuth tokens from Qwen Code / Qwen Cli, supporting 256K context and Function Calling.
Supported Models:
| Model ID | Description |
|---|---|
| `qwen3-coder-plus` | Qwen3 Coder Plus |
| `qwen3-coder-flash` | Qwen3 Coder Flash (faster) |
| `coder-model` | Qwen 3.5 Plus (with chain-of-thought, 256K context) |
| `qwen3.5-plus` | Alias for `coder-model`, auto-redirected |
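The service performs the alias redirect itself, so clients do not need this logic. The sketch below merely restates the table as a lookup, which can be handy for validating a model name before sending a request; `resolve_cli_model` is a hypothetical helper.

```python
CLI_MODELS = {"qwen3-coder-plus", "qwen3-coder-flash", "coder-model"}
CLI_ALIASES = {"qwen3.5-plus": "coder-model"}


def resolve_cli_model(name: str) -> str:
    """Return the CLI model ID for a name, following documented aliases."""
    resolved = CLI_ALIASES.get(name, name)
    if resolved not in CLI_MODELS:
        raise ValueError(f"not a CLI endpoint model: {name}")
    return resolved
```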
Send chat requests through the CLI endpoint, supporting streaming and non-streaming responses.
POST /cli/v1/chat/completions
Content-Type: application/json
Authorization: Bearer API_KEY

Request Body:
{
"model": "qwen3-coder-plus",
"messages": [
{
"role": "user",
"content": "Hello, please introduce yourself."
}
],
"stream": false,
"temperature": 0.7,
"max_tokens": 2000
}

Using `coder-model` (Qwen 3.5 Plus) or its alias `qwen3.5-plus`:
{
"model": "coder-model",
"messages": [
{
"role": "user",
"content": "Write a quicksort algorithm."
}
],
"stream": false
}

Streaming Request:
{
"model": "qwen3-coder-flash",
"messages": [
{
"role": "user",
"content": "Write a poem about spring."
}
],
"stream": true
}

Response Format:
Non-streaming response follows the standard OpenAI API format:
{
"id": "chatcmpl-123",
"object": "chat.completion",
"created": 1677652288,
"model": "qwen3-coder-plus",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I am an AI assistant..."
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 20,
"completion_tokens": 50,
"total_tokens": 70
}
}

Streaming response uses Server-Sent Events (SSE) format:
data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"qwen3-coder-flash","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}
data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"qwen3-coder-flash","choices":[{"index":0,"delta":{"content":"!"},"finish_reason":null}]}
data: [DONE]
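To turn a stream like the one above back into a single reply, a client concatenates each chunk's `delta.content` until it sees `[DONE]`. The following is a minimal Python sketch with a hypothetical helper name, `join_stream_content`, that works on already-decoded lines; reading the lines from the HTTP response is left to the client library in use.

```python
import json


def join_stream_content(sse_lines) -> str:
    """Concatenate delta.content from chat.completion.chunk events."""
    parts = []
    for raw_line in sse_lines:
        line = raw_line.strip()
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines and other SSE fields
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            parts.append(choice.get("delta", {}).get("content") or "")
    return "".join(parts)
```

Applied to the two chunks shown above, the deltas `Hello` and `!` are joined into one string.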

