vLLM serves the model correctly, but the output is garbled #26

I launched the model with vLLM (version 0.9.1) on 8×A100 GPUs using the command below. Following the tutorial, I had already changed architectures to MiniMaxText01ForCausalLM.

```
export SAFETENSORS_FAST_GPU=1
export VLLM_USE_V1=0
VLLM_LOGGING_CONFIG_PATH=vllm_log_config.json python -u -m vllm.entrypoints.openai.api_server \
    --model open_source_models/MiniMax-M1-80k \
    --tensor-parallel-size 8 \
    --trust-remote-code \
    --quantization experts_int8  \
    --max_model_len 4096 \
    --dtype bfloat16
```

The server starts normally, but when a client sends a request, the output is partly garbled. The request code is as follows:
```
chat_response = client.chat.completions.create(
    model=model,
    messages=[
        {"role": "system", "content": [{"type": "text", "text": "You are a helpful assistant."}]},
        {"role": "user", "content": [{"type": "text", "text": "Who won the world series in 2020?"}]}
    ],
    max_tokens=1024,)

# print("Chat response:", chat_response)
print("Chat think response:",chat_response.choices[0].message.reasoning_content)
print("Chat response:",chat_response.choices[0].message.content)
```
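
The snippet above does not show how `client` and `model` were created. A minimal self-contained sketch of the same request follows, assuming the server listens on vLLM's default address `http://localhost:8000/v1` and serves the model under the `--model` path. Both values, along with the helper names `build_messages` and `run_request`, are assumptions for illustration and not taken from the report.

```python
# Assumptions (not stated in the report): the server listens on vLLM's default
# address and the served model name equals the --model path from the launch command.
BASE_URL = "http://localhost:8000/v1"
MODEL = "open_source_models/MiniMax-M1-80k"


def build_messages(system_text, user_text):
    """Build the same two-message chat payload used in the report."""
    return [
        {"role": "system", "content": [{"type": "text", "text": system_text}]},
        {"role": "user", "content": [{"type": "text", "text": user_text}]},
    ]


def run_request(base_url=BASE_URL, model=MODEL):
    """Send the request and print both fields; requires a running server."""
    # Imported lazily so build_messages can be used without the openai package.
    from openai import OpenAI

    client = OpenAI(api_key="EMPTY", base_url=base_url)
    response = client.chat.completions.create(
        model=model,
        messages=build_messages(
            "You are a helpful assistant.",
            "Who won the world series in 2020?",
        ),
        max_tokens=1024,
    )
    message = response.choices[0].message
    # reasoning_content is not part of the standard OpenAI message schema,
    # so read it defensively in case the server does not return it.
    print("Chat think response:", getattr(message, "reasoning_content", None))
    print("Chat response:", message.content)
```

Calling `run_request()` against the running server should reproduce the two printed lines; pass a different `base_url` or `model` if the deployment differs from the assumptions above.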

The result is as follows:
```
Chat think response: None
Chat response: 特点和(from co的背后 మ nameSuggestionxin physiologic…… (the garbled text repeats in a loop)
```
What could be causing this?
