How can a quantized model generated by llama.cpp be used with the OpenAI-compatible demo? #338
Unanswered
sipingxiaozi asked this question in Q&A
Replies: 1 comment 1 reply
- The scripts under scripts/ do not support GGUF-format models; they can only load models in .bin format.
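Since the demo scripts only load .bin checkpoints, one workaround for the mixed output directory described in this thread is to pick out the .gguf file explicitly and hand it to a GGUF-capable runtime such as llama-cpp-python (which also ships its own OpenAI-compatible server: `python -m llama_cpp.server --model /path/to/model.gguf`). The helper below is a hypothetical sketch, not part of this repository's scripts:

```python
from pathlib import Path

def find_gguf(model_dir: str) -> Path:
    """Return the single .gguf file in model_dir, ignoring the original .bin shards.

    Hypothetical helper: the quantization output directory typically holds both
    the converted .gguf file and the source .bin weights side by side.
    """
    candidates = sorted(Path(model_dir).glob("*.gguf"))
    if not candidates:
        raise FileNotFoundError(f"no .gguf file found in {model_dir}")
    if len(candidates) > 1:
        raise ValueError(f"multiple .gguf files in {model_dir}; pass one explicitly")
    return candidates[0]
```

The resolved path can then be passed to whichever GGUF loader you use, instead of the directory that `--base_model` expects.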
1 reply
- In the following script command:
$ python scripts/openai_server_demo/openai_api_server.py --base_model /path/to/base_model --only_cpu
how do I point it at an already-generated quantized GGUF file?
--base_model only accepts a directory, and the quantization output directory contains both the .gguf file and the original .bin files.
Also, this OpenAI demo doesn't support the Mac GPU, does it? Invoking the GPU raises an error when generating a response:
RuntimeError: Placeholder storage has not been allocated on MPS device!
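That MPS error usually means the model weights and the input tensors ended up on different devices: the model was moved to `mps` but the tokenized inputs stayed on CPU (or vice versa), so both must be moved with `.to(device)` together. As an illustrative sketch of the fallback logic (hypothetical helper, not the repo's actual code), the device choice could degrade to CPU rather than crash when MPS is unavailable or `--only_cpu` is set:

```python
def pick_device(only_cpu: bool, cuda_available: bool, mps_available: bool) -> str:
    """Choose a torch device string, preferring CUDA, then MPS, then CPU.

    Hypothetical sketch: the availability flags stand in for
    torch.cuda.is_available() and torch.backends.mps.is_available().
    Whatever is chosen, both model.to(device) and the input tensors'
    .to(device) must use the same value to avoid the MPS placeholder error.
    """
    if only_cpu:
        return "cpu"
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"
```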