_LlamaModel.metadata() does not return tokenizer.ggml.tokens
#1495
Comments
For reference, here is the code I am currently using to read GGUF metadata from a file, including arrays. This code is not as robust as the code currently in llama-cpp-python, but I have been using it for a long time and it works as expected with various models; you can cross-reference its output with the llama.cpp backend output to verify this. Hopefully this code can be useful as a reference while looking into this issue. Thanks, @abetlen!
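The referenced snippet is not reproduced on this page, but a minimal sketch of reading GGUF metadata (arrays included) straight from the file header might look like the following. This assumes the little-endian GGUF v2/v3 layout documented in the llama.cpp repository; the helper names are hypothetical, not from the original comment:

```python
import struct

# Scalar GGUF value type ids -> struct format (per the GGUF spec in llama.cpp):
# 0=u8 1=i8 2=u16 3=i16 4=u32 5=i32 6=f32 7=bool 10=u64 11=i64 12=f64
# (8=string and 9=array are handled separately below)
_SCALAR_FMT = {0: "<B", 1: "<b", 2: "<H", 3: "<h", 4: "<I", 5: "<i",
               6: "<f", 7: "<?", 10: "<Q", 11: "<q", 12: "<d"}

def _read(fmt, f):
    size = struct.calcsize(fmt)
    return struct.unpack(fmt, f.read(size))[0]

def _read_string(f):
    # GGUF strings: uint64 length followed by UTF-8 bytes
    n = _read("<Q", f)
    return f.read(n).decode("utf-8", errors="replace")

def _read_value(f, vtype):
    if vtype == 8:        # string
        return _read_string(f)
    if vtype == 9:        # array: uint32 element type, uint64 count, elements
        etype = _read("<I", f)
        count = _read("<Q", f)
        return [_read_value(f, etype) for _ in range(count)]
    return _read(_SCALAR_FMT[vtype], f)

def read_gguf_metadata(path):
    """Read all key/value metadata from a GGUF file header, arrays included."""
    with open(path, "rb") as f:
        assert f.read(4) == b"GGUF", "not a GGUF file"
        version = _read("<I", f)
        assert version in (2, 3), f"unsupported GGUF version {version}"
        _tensor_count = _read("<Q", f)  # not needed for metadata
        kv_count = _read("<Q", f)
        meta = {}
        for _ in range(kv_count):
            key = _read_string(f)
            vtype = _read("<I", f)
            meta[key] = _read_value(f, vtype)
        return meta
```

With a parser like this, `read_gguf_metadata(path)["tokenizer.ggml.tokens"]` returns the full vocabulary list without ever loading the model weights.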
The problem is that the llama.cpp API does not return metadata arrays. Using a separate reader to side-load them would mean loading the model twice.
The code I shared doesn't load the model; it just reads bytes from the header of the GGUF file, unless I'm woefully misunderstanding something. Could you please clarify what you mean by loading the model twice?
I meant that if going the route of side-loading the metadata, it would probably be best to use the official `gguf` package. The better approach would be to add support for metadata arrays in the llama.cpp API, but I guess it is a little cumbersome to expose cleanly through a C ABI.
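For reference, side-loading with the official `gguf` package (published from the llama.cpp repository's gguf-py directory) might look roughly like this. The `GGUFReader` field layout shown (`.fields`, `.parts`, `.data`) is an assumption based on that package and may differ across versions:

```python
def read_vocab_with_gguf(path):
    """Sketch: extract tokenizer.ggml.tokens via the official gguf package.

    Assumes gguf's GGUFReader exposes fields as {key: ReaderField}, where a
    string-array field stores byte chunks in .parts indexed by .data. Verify
    against the gguf-py version you have installed.
    """
    # Import deferred so this sketch can be loaded without gguf installed
    from gguf import GGUFReader  # pip install gguf

    reader = GGUFReader(path)
    field = reader.fields["tokenizer.ggml.tokens"]
    return [bytes(field.parts[i]).decode("utf-8") for i in field.data]
```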
Oh, I see. I'll take a look at gguf.py and see if I can make a PR that would help sometime soon. Thanks |
Do note that the currently published |
I am working on a proper fix for this: |
Prerequisites
Please answer the following questions for yourself before submitting an issue.
Expected Behavior
The dictionary returned by
_LlamaModel.metadata()
should includetokenizer.ggml.tokens
as a key, so the vocabulary of the model can be accessed from the high-level API.Current Behavior
The dictionary returned by
_LlamaModel.metadata()
does not includetokenizer.ggml.tokens
as a key, so the vocabulary of the model cannot be accessed from the high-level API.Environment and Context
Running latest llama-cpp-python built from source - package version 0.2.76 at the time of writing.
Steps to Reproduce

1. Load any GGUF model with `Llama`.
2. Inspect `Llama.metadata`.
3. Observe that `tokenizer.ggml.tokens` is missing from the dictionary.
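The reproduction boils down to checking for the key in the returned dictionary. A minimal sketch (the model path is a placeholder, and the `__main__` part requires llama-cpp-python plus a local GGUF model):

```python
VOCAB_KEY = "tokenizer.ggml.tokens"

def has_vocab(metadata):
    """Return True if a metadata dict exposes the model vocabulary."""
    return VOCAB_KEY in metadata

if __name__ == "__main__":
    from llama_cpp import Llama  # requires llama-cpp-python

    llm = Llama(model_path="model.gguf")  # placeholder path
    # Expected: True. Observed with 0.2.76: False, because
    # array-valued metadata is not returned by the API.
    print(has_vocab(llm.metadata))
```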