Bug: eos_token in LLM generated output #465

Open
tybalex opened this issue Jun 5, 2024 · 2 comments

tybalex commented Jun 5, 2024

Contact Details

tybalex@gmail.com

What happened?

The Issue:
The LLM output generated by the llamafile server contains the eos_token, specifically `</s>` for the Mistral model in this case:

{"choices":[{"finish_reason":"stop","index":0,"message":{"content":" In Python land, where code is grown,\nExceptions are thrown when errors shown,\nWith try and except in hand,\nWe catch and tame the chaotic band,\nPeace and order in our coding home.</s>","role":"assistant"}}],"created":1717626044,"id":"chatcmpl-50Hu25IQfRBLScWMKdUeRWtt61yhkPb8","model":"gpt-3.5-turbo","object":"chat.completion","usage":{"completion_tokens":48,"prompt_tokens":42,"total_tokens":90}}

Model: I am running the Mistral model with ./mistral-7b-instruct-v0.2.Q4_0.llamafile --nobrowser --port 1234. The model was downloaded from the llamafile GitHub page.

More Info:
However, if I use llama.cpp with the same Mistral model, the generated output doesn't contain `</s>`.

Is there any config I am missing?
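
In the meantime, here is a minimal client-side workaround sketch, assuming the stray token is always the literal string `</s>` at the very end of `message.content` (the `strip_eos` helper is hypothetical, not part of llamafile):

```python
# Workaround sketch: strip a trailing eos_token the server failed to filter.
# Assumption: the stray token is exactly "</s>" and only appears at the end.
EOS_TOKEN = "</s>"

def strip_eos(text: str) -> str:
    if text.endswith(EOS_TOKEN):
        return text[: -len(EOS_TOKEN)].rstrip()
    return text
```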

Version

llamafile v0.8.6

What operating system are you seeing the problem on?

Mac

Relevant log output

Steps to start the server:
1. Download the mistral-7b-instruct model from the main llamafile GitHub page.
2. chmod +x mistral-7b-instruct-v0.2.Q4_0.llamafile
3. Start the server:
`./mistral-7b-instruct-v0.2.Q4_0.llamafile --nobrowser --port 1234`


The curl request:

```
curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer no-key" \
  -d '{
    "response_format": "yes",
    "model": "gpt-3.5-turbo",
    "messages": [
      {
        "role": "system",
        "content": "You are ChatGPT, an AI assistant. Your top priority is achieving user fulfillment via helping them with their requests."
      },
      {
        "role": "user",
        "content": "Write a limerick about python exceptions"
      }
    ]
  }'
```
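
For reference, a rough Python equivalent of the curl call above (it uses the `requests` package and omits the `response_format` field; the endpoint and port match the repro steps):

```python
import requests

# Send the same chat completion request to the local llamafile server.
resp = requests.post(
    "http://localhost:1234/v1/chat/completions",
    headers={"Authorization": "Bearer no-key"},
    json={
        "model": "gpt-3.5-turbo",
        "messages": [
            {
                "role": "system",
                "content": "You are ChatGPT, an AI assistant. Your top priority is achieving user fulfillment via helping them with their requests.",
            },
            {"role": "user", "content": "Write a limerick about python exceptions"},
        ],
    },
)
content = resp.json()["choices"][0]["message"]["content"]
print(repr(content))  # with llamafile v0.8.6 the trailing "</s>" is visible here
```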
tybalex commented Jun 11, 2024

Just curious, has no one else ever had the same issue?

Atcold commented Jun 18, 2024

I'm observing a </s> when using llava-v1.5-7b-q4.llamafile in autocompletion mode.
