在使用vllm本地部署时，模型不遵循prompt要求的输出格式

### System Info / 系統信息

```
OS: Linux-5.10.134-13.an8.x86_64-x86_64-with-glibc2.39  
Python: 3.10.0  
Transformers: 5.0.0rc0  
PyTorch: 2.9.0+cu128  
CUDA Available: True  
CUDA Version: 12.8  
GPU: NVIDIA L40S  
```

### Who can help? / 谁可以帮助到您？

_No response_

### Information / 问题信息

- [x] The official example scripts / 官方的示例脚本
- [ ] My own modified scripts / 我自己修改的脚本和任务

### Reproduction / 复现过程

1. 安装依赖  
```
conda create -n autoglm python==3.10 -y  
conda activate autoglm  
pip install -r requirements.txt  
pip install vllm==0.12.0 --force-reinstall --no-cache-dir  
pip install transformers==5.0.0rc0  
pip install -e .  
```
---
2. 启动服务  
```
rm /dev/shm/VLLM_OBJECT_STORAGE_SHM_BUFFER  
export HF_ENDPOINT=https://hf-mirror.com  
python3 -m vllm.entrypoints.openai.api_server \
 --served-model-name autoglm-phone-9b \
 --allowed-local-media-path /   \
 --mm-encoder-tp-mode data \
 --mm_processor_cache_type shm \
 --mm_processor_kwargs "{\"max_pixels\":5000000}" \
 --max-model-len 25480  \
 --chat-template-content-format string \
 --limit-mm-per-prompt "{\"image\":10}" \
 --model zai-org/AutoGLM-Phone-9B \
 --port 8000
```
---
> 期间 出现
```
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
vllm 0.12.0 requires transformers<5,>=4.56.0, but you have transformers 5.0.0rc0 which is incompatible.
```
```
Unrecognized keys in `rope_parameters` for 'rope_type'='default': {'mrope_section'}
```
> 但查阅相关信息：[1](https://huggingface.co/zai-org/GLM-4.6V/discussions/13); [2](https://github.com/zai-org/Open-AutoGLM#:~:text=%E6%B3%A8%E6%84%8F%3A%20%E4%B8%8A%E8%BF%B0%E6%AD%A5%E9%AA%A4%E5%87%BA%E7%8E%B0%E7%9A%84%E5%85%B3%E4%BA%8E%20transformers%20%E7%9A%84%E4%BE%9D%E8%B5%96%E5%86%B2%E7%AA%81%E5%8F%AF%E4%BB%A5%E5%BF%BD%E7%95%A5%E3%80%82)，发现可忽略。
---
3. 执行测试脚本  
```
python scripts/check_deployment_cn.py --base-url http://localhost:8000/v1 --model autoglm-phone-9b
```
---
4. 模型输出如下，并未携带`think`与`answer`  
```
开始测试模型推理...
Base URL: http://localhost:8000/v1
Model: autoglm-phone-9b
Messages file: scripts/sample_messages.json
================================================================================

模型推理结果:
================================================================================
用户想要比较这个洗发水在京东和淘宝上的价格，然后选择最便宜的平台下单。当前在小红书app上，显示的是一个关于LUMMI MOOD洗发水的帖子。

我需要：
1. 先启动京东app，搜索这个洗发水
2. 查看京东的价格
3. 再启动淘宝app，搜索这个洗发水
4. 查看淘宝的价格
5. 比较价格后，选择最便宜的京东或淘宝下单

首先，我需要从当前的小红书界面退出，然后启动京东app。
do(action="Launch", app="京东")
================================================================================

统计信息:
  - Prompt tokens: 5127
  - Completion tokens: 130
  - Total tokens: 5257

请根据上述推理结果判断模型部署是否符合预期。
```

### Expected behavior / 期待表现

# 如果是我的操作流程出现了问题，真诚期待相关人员的指正。

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

在使用vllm本地部署时，模型不遵循prompt要求的输出格式 #324

System Info / 系統信息

Who can help? / 谁可以帮助到您？

Information / 问题信息

Reproduction / 复现过程

Expected behavior / 期待表现

如果是我的操作流程出现了问题，真诚期待相关人员的指正。

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

在使用vllm本地部署时，模型不遵循prompt要求的输出格式 #324

Description

System Info / 系統信息

Who can help? / 谁可以帮助到您？

Information / 问题信息

Reproduction / 复现过程

Expected behavior / 期待表现

如果是我的操作流程出现了问题，真诚期待相关人员的指正。

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions