Skip to content

在使用vllm本地部署时,模型不遵循prompt要求的输出格式 #324

@K1XE

Description

@K1XE

System Info / 系統信息

OS: Linux-5.10.134-13.an8.x86_64-x86_64-with-glibc2.39  
Python: 3.10.0  
Transformers: 5.0.0rc0  
PyTorch: 2.9.0+cu128  
CUDA Available: True  
CUDA Version: 12.8  
GPU: NVIDIA L40S  

Who can help? / 谁可以帮助到您?

No response

Information / 问题信息

  • The official example scripts / 官方的示例脚本
  • My own modified scripts / 我自己修改的脚本和任务

Reproduction / 复现过程

  1. 安装依赖
conda create -n autoglm python==3.10 -y  
conda activate autoglm  
pip install -r requirements.txt  
pip install vllm==0.12.0 --force-reinstall --no-cache-dir  
pip install transformers==5.0.0rc0  
pip install -e .  

  1. 启动服务
rm /dev/shm/VLLM_OBJECT_STORAGE_SHM_BUFFER  
export HF_ENDPOINT=https://hf-mirror.com  
python3 -m vllm.entrypoints.openai.api_server \
 --served-model-name autoglm-phone-9b \
 --allowed-local-media-path /   \
 --mm-encoder-tp-mode data \
 --mm_processor_cache_type shm \
 --mm_processor_kwargs "{\"max_pixels\":5000000}" \
 --max-model-len 25480  \
 --chat-template-content-format string \
 --limit-mm-per-prompt "{\"image\":10}" \
 --model zai-org/AutoGLM-Phone-9B \
 --port 8000

期间 出现

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
vllm 0.12.0 requires transformers<5,>=4.56.0, but you have transformers 5.0.0rc0 which is incompatible.
Unrecognized keys in `rope_parameters` for 'rope_type'='default': {'mrope_section'}

但查阅相关信息:1; 2,发现可忽略。


  1. 执行测试脚本
python scripts/check_deployment_cn.py --base-url http://localhost:8000/v1 --model autoglm-phone-9b

  1. 模型输出如下,并未携带thinkanswer
开始测试模型推理...
Base URL: http://localhost:8000/v1
Model: autoglm-phone-9b
Messages file: scripts/sample_messages.json
================================================================================

模型推理结果:
================================================================================
用户想要比较这个洗发水在京东和淘宝上的价格,然后选择最便宜的平台下单。当前在小红书app上,显示的是一个关于LUMMI MOOD洗发水的帖子。

我需要:
1. 先启动京东app,搜索这个洗发水
2. 查看京东的价格
3. 再启动淘宝app,搜索这个洗发水
4. 查看淘宝的价格
5. 比较价格后,选择最便宜的京东或淘宝下单

首先,我需要从当前的小红书界面退出,然后启动京东app。
do(action="Launch", app="京东")
================================================================================

统计信息:
  - Prompt tokens: 5127
  - Completion tokens: 130
  - Total tokens: 5257

请根据上述推理结果判断模型部署是否符合预期。

Expected behavior / 期待表现

如果是我的操作流程出现了问题,真诚期待相关人员的指正。

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions