[WIP] Remove interleaved thinking for MT thinking model #1135
base: main
| Original file line number | Diff line number | Diff line change | ||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
|
@@ -403,6 +403,12 @@ def visualize_token_role(tokens: list[int], masks: list[int], tokenizer: PreTrai | |||||||||||||||||||||
| "{% if not has_system %}" | ||||||||||||||||||||||
| "{{ '<|im_start|>system\nYou are Olmo, a helpful AI assistant built by Ai2. Your date cutoff is December 2024, and your model weights are available at https://huggingface.co/allenai.<|im_end|>\n' }}" | ||||||||||||||||||||||
| "{% endif %}" | ||||||||||||||||||||||
| "{% set last_user_index = -1 %}" | ||||||||||||||||||||||
| "{% for message in messages %}" | ||||||||||||||||||||||
| "{% if message['role'] == 'user' %}" | ||||||||||||||||||||||
| "{% set last_user_index = loop.index0 %}" | ||||||||||||||||||||||
| "{% endif %}" | ||||||||||||||||||||||
| "{% endfor %}" | ||||||||||||||||||||||
| "{% for message in messages %}" | ||||||||||||||||||||||
| "{% if message['role'] == 'system' %}" | ||||||||||||||||||||||
| "{{ '<|im_start|>system\n' + message['content'] }}" | ||||||||||||||||||||||
|
|
@@ -418,10 +424,18 @@ def visualize_token_role(tokens: list[int], masks: list[int], tokenizer: PreTrai | |||||||||||||||||||||
| "{{ '<|im_start|>user\n' + message['content'] + '<|im_end|>\n' }}" | ||||||||||||||||||||||
| "{% endif %}" | ||||||||||||||||||||||
| "{% elif message['role'] == 'assistant' %}" | ||||||||||||||||||||||
| "{% set assistant_content = message.get('content', '') %}" | ||||||||||||||||||||||
| "{% set reasoning_content = '' %}" | ||||||||||||||||||||||
| "{% if '</think>' in assistant_content %}" | ||||||||||||||||||||||
| "{% set think_split = assistant_content.split('</think>') %}" | ||||||||||||||||||||||
| "{% set reasoning_content = think_split[0].rstrip('\\n').split('<think>')[-1].lstrip('\\n') %}" | ||||||||||||||||||||||
| "{% set assistant_content = think_split[-1].lstrip('\\n') %}" | ||||||||||||||||||||||
| "{% endif %}" | ||||||||||||||||||||||
|
Comment on lines +429 to +433
Contributor
The current logic for parsing the [...] I suggest a more robust parsing logic that correctly handles content before and after the [...]
Suggested change
|
||||||||||||||||||||||
| "{{ '<|im_start|>assistant\n' }}" | ||||||||||||||||||||||
| "{% if message.get('content', none) is not none %}" | ||||||||||||||||||||||
| "{{ message['content'] }}" | ||||||||||||||||||||||
| "{% if loop.index0 > last_user_index and reasoning_content.strip() %}" | ||||||||||||||||||||||
| "{{ '<think>\\n' + reasoning_content.strip('\\n') + '\\n</think>\\n\\n' }}" | ||||||||||||||||||||||
| "{% endif %}" | ||||||||||||||||||||||
| "{{ assistant_content }}" | ||||||||||||||||||||||
| "{% if message.get('function_calls', none) is not none %}" | ||||||||||||||||||||||
| "{{ '<function_calls>' + message['function_calls'] + '</function_calls>' }}" | ||||||||||||||||||||||
| "{% endif %}" | ||||||||||||||||||||||
|
|
||||||||||||||||||||||
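The string handling on lines +429 to +433 can be mirrored in plain Python to see what it keeps and what it drops. This is only a sketch: `split_thinking` is a hypothetical helper name, not something from the PR, and it assumes the Jinja expressions behave like the Python `str` methods of the same name.

```python
def split_thinking(content: str) -> tuple[str, str]:
    """Mirror the template's split on '</think>' for one assistant message.

    Returns (reasoning_content, assistant_content). Hypothetical helper,
    written only to illustrate the template logic shown in the diff.
    """
    reasoning = ""
    answer = content
    if "</think>" in content:
        parts = content.split("</think>")
        # Text between the last '<think>' of the first segment and the first '</think>'.
        reasoning = parts[0].rstrip("\n").split("<think>")[-1].lstrip("\n")
        # Only the text after the final '</think>' is kept as the answer.
        answer = parts[-1].lstrip("\n")
    return reasoning, answer


if __name__ == "__main__":
    # Well-formed message: reasoning and answer are both recovered.
    print(split_thinking("<think>\nplan\n</think>\n\nHello"))
    # Text before '<think>' is not part of either returned value.
    print(split_thinking("Preamble <think>plan</think> Hello"))
```

Under these assumptions, any text before `<think>`, and anything between two `</think>` markers, appears in neither returned value, which seems to be the kind of case the review comment on these lines is concerned with.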
The default system prompt for `olmo_thinker_no_think` is missing information about function-calling capabilities. Other `olmo` templates include this, and the rest of this template handles functions. This inconsistency could be confusing. For consistency, I suggest updating the default system prompt to mention functions, similar to other `olmo` templates.
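For reference, the intent of the `last_user_index` check in the diff (emit a `<think>` block only for assistant turns that come after the final user message) can be sketched in Python. The function names below are hypothetical, and the sketch covers only the positional part of the condition, not the additional requirement that the reasoning text be non-empty.

```python
def last_user_index(messages: list[dict]) -> int:
    """Index of the final message with role 'user', or -1 if there is none."""
    idx = -1
    for i, message in enumerate(messages):
        if message["role"] == "user":
            idx = i
    return idx


def reasoning_position_ok(messages: list[dict]) -> list[bool]:
    """For each message, True if it is an assistant turn after the last user turn."""
    cutoff = last_user_index(messages)
    return [
        message["role"] == "assistant" and i > cutoff
        for i, message in enumerate(messages)
    ]


if __name__ == "__main__":
    conversation = [
        {"role": "user", "content": "Hi"},
        {"role": "assistant", "content": "<think>a</think>Hello"},
        {"role": "user", "content": "And then?"},
        {"role": "assistant", "content": "<think>b</think>Next"},
    ]
    print(last_user_index(conversation))
    print(reasoning_position_ok(conversation))
```

One point that may be worth verifying against rendered output: in Jinja2, a `{% set %}` inside a `{% for %}` body is scoped to the loop, so templates usually carry such a value out through a `namespace()` object. Whether that affects this template is not confirmed anywhere on this page.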