Description
System Info
OS: Linux-5.10.134-13.an8.x86_64-x86_64-with-glibc2.39
Python: 3.10.0
Transformers: 5.0.0rc0
PyTorch: 2.9.0+cu128
CUDA Available: True
CUDA Version: 12.8
GPU: NVIDIA L40S
Who can help?
No response
Information
- The official example scripts
- My own modified scripts and tasks
Reproduction
- Install dependencies
conda create -n autoglm python==3.10 -y
conda activate autoglm
pip install -r requirements.txt
pip install vllm==0.12.0 --force-reinstall --no-cache-dir
pip install transformers==5.0.0rc0
pip install -e .
- Start the service
rm /dev/shm/VLLM_OBJECT_STORAGE_SHM_BUFFER
export HF_ENDPOINT=https://hf-mirror.com
python3 -m vllm.entrypoints.openai.api_server \
--served-model-name autoglm-phone-9b \
--allowed-local-media-path / \
--mm-encoder-tp-mode data \
--mm_processor_cache_type shm \
--mm_processor_kwargs "{\"max_pixels\":5000000}" \
--max-model-len 25480 \
--chat-template-content-format string \
--limit-mm-per-prompt "{\"image\":10}" \
--model zai-org/AutoGLM-Phone-9B \
--port 8000
During these steps, the following messages appeared:
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
vllm 0.12.0 requires transformers<5,>=4.56.0, but you have transformers 5.0.0rc0 which is incompatible.
Unrecognized keys in `rope_parameters` for 'rope_type'='default': {'mrope_section'}
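The conflict pip reports can be reproduced without installing anything. The following is a minimal sketch, using only the standard library, that checks a version string against the `transformers<5,>=4.56.0` constraint quoted in the message above. The helper names `release_tuple` and `satisfies` are hypothetical, `4.57.0` is just an illustrative version inside the range, and the comparison looks only at the numeric release segment, which simplifies the full PEP 440 rules.

```python
import re


def release_tuple(version: str, width: int = 3) -> tuple:
    """Return the leading numeric release segment, zero-padded to `width` parts.

    Simplification: pre-release suffixes such as "rc0" are ignored.
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"no numeric release segment in {version!r}")
    parts = [int(p) for p in match.group(0).split(".")]
    parts += [0] * (width - len(parts))
    return tuple(parts)


def satisfies(version: str, lower: str, upper: str) -> bool:
    """True if lower <= version < upper, comparing release segments only."""
    return release_tuple(lower) <= release_tuple(version) < release_tuple(upper)


# Constraint taken from the pip message: transformers<5,>=4.56.0
print(satisfies("5.0.0rc0", lower="4.56.0", upper="5"))  # False
print(satisfies("4.57.0", lower="4.56.0", upper="5"))    # True
```

Under this simplified check, `5.0.0rc0` falls outside the range that vllm 0.12.0 declares, which matches what pip reports.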
- Run the test script
python scripts/check_deployment_cn.py --base-url http://localhost:8000/v1 --model autoglm-phone-9b
- The model output is shown below; it contains neither think nor answer
Starting model inference test...
Base URL: http://localhost:8000/v1
Model: autoglm-phone-9b
Messages file: scripts/sample_messages.json
================================================================================
Model inference result:
================================================================================
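The user wants to compare the price of this shampoo on JD (京东) and Taobao (淘宝), then order from the cheaper platform. The current screen is the Xiaohongshu (小红书) app, showing a post about LUMMI MOOD shampoo.
I need to:
1. First launch the JD app and search for this shampoo
2. Check the price on JD
3. Then launch the Taobao app and search for this shampoo
4. Check the price on Taobao
5. After comparing prices, order from whichever of JD or Taobao is cheaper
First, I need to leave the current Xiaohongshu screen and then launch the JD app.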
do(action="Launch", app="京东")
================================================================================
Statistics:
- Prompt tokens: 5127
- Completion tokens: 130
- Total tokens: 5257
Please judge from the inference result above whether the model deployment behaves as expected.
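One way to make that final judgment less manual is to check the returned text for the expected markers. The sketch below assumes the markers are the literal tags `<think>` and `<answer>`; the report names them only as think and answer, so the exact tag strings and the helper name `find_missing_markers` are assumptions, not part of the project's check script.

```python
import re

# Assumed marker names; the report mentions only "think" and "answer".
EXPECTED_MARKERS = ("think", "answer")


def find_missing_markers(text: str, markers=EXPECTED_MARKERS) -> list:
    """Return the markers with no matching <name>...</name> pair in `text`."""
    missing = []
    for name in markers:
        pattern = rf"<{name}>.*?</{name}>"
        if re.search(pattern, text, flags=re.DOTALL) is None:
            missing.append(name)
    return missing


# Plain-text reply shaped like the output above: reasoning followed by an action.
plain_reply = 'First, launch the JD app.\ndo(action="Launch", app="京东")'
print(find_missing_markers(plain_reply))   # ['think', 'answer']

# Hypothetical reply that carries both assumed tags.
tagged_reply = '<think>Launch JD first.</think><answer>do(action="Launch", app="京东")</answer>'
print(find_missing_markers(tagged_reply))  # []
```

With an OpenAI-compatible server, the text to check would typically be read from `choices[0].message.content` in the chat completions response.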
Expected behavior
If the problem lies in my own procedure, I would sincerely appreciate a correction from the maintainers.