The Issue:
The LLM output generated by the llamafile server contains the EOS token, specifically `</s>` for the Mistral model in this case:
{"choices":[{"finish_reason":"stop","index":0,"message":{"content":" In Python land, where code is grown,\nExceptions are thrown when errors shown,\nWith try and except in hand,\nWe catch and tame the chaotic band,\nPeace and order in our coding home.</s>","role":"assistant"}}],"created":1717626044,"id":"chatcmpl-50Hu25IQfRBLScWMKdUeRWtt61yhkPb8","model":"gpt-3.5-turbo","object":"chat.completion","usage":{"completion_tokens":48,"prompt_tokens":42,"total_tokens":90}}
Model: I am running the Mistral model with `./mistral-7b-instruct-v0.2.Q4_0.llamafile --nobrowser --port 1234`. The model was downloaded from the llamafile GitHub page.
More Info:
However, if I use llama.cpp with the same Mistral model, the generated output doesn't contain `</s>`.
Is there any config I am missing?
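A client-side workaround I could try (a sketch only, not a verified fix: it assumes the OpenAI-compatible endpoint honors the standard `stop` parameter and trims matched stop strings from the returned content) is to declare `</s>` as an explicit stop string in the request:

```sh
# Unverified workaround sketch: ask the server to stop on "</s>" so that,
# if the token is rendered as text, it is trimmed from the returned content.
curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer no-key" \
  -d '{
        "model": "gpt-3.5-turbo",
        "stop": ["</s>"],
        "messages": [
          {"role": "user", "content": "Write a limerick about python exceptions"}
        ]
      }'
```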
Version
llamafile v0.8.6
What operating system are you seeing the problem on?
Mac
Relevant log output
Steps to start the server:
1. Download the mistral-7b-instruct model from the main llamafile GitHub page.
2. chmod +x mistral-7b-instruct-v0.2.Q4_0.llamafile
3. Start the server:
`./mistral-7b-instruct-v0.2.Q4_0.llamafile --nobrowser --port 1234`
The curl request:
curl http://localhost:1234/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer no-key" \
-d '{ "response_format":"yes","model": "gpt-3.5-turbo","messages": [{ "role": "system", "content": "You are ChatGPT, an AI assistant. Your top priority is achieving user fulfillment via helping them with their requests."},{ "role": "user", "content": "Write a limerick about python exceptions"}]}'
Contact Details
tybalex@gmail.com