
Remove the fixed eot_token mechanism for SFT #927

Merged
loadams merged 4 commits into microsoft:master on Oct 30, 2024

Conversation

Xingfu-Yi
Contributor

Background

Not all pretrained LLMs use `<|endoftext|>` as the `eot_token`, so it is inappropriate to hard-code it.
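For illustration, one quick way to see that base models ship different end-of-text tokens is to inspect the tokenizer with `transformers`; the model name below is only an example:

```python
from transformers import AutoTokenizer

# Illustrative only: Llama-2's tokenizer uses "</s>" as its EOS token,
# not "<|endoftext|>", so a hardcoded value would not match this model.
tok = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
print(tok.eos_token)  # "</s>"
```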

Changes

  • Removed the hardcoded eot_token: `args.end_of_conversation_token = "<|endoftext|>"`.
  • Added a new parser argument, `eot_token`, which defaults to `<|endoftext|>`. Users can set it manually to match the pretrained model they use (see the sketch after this list).
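A minimal sketch of what the new argument might look like, assuming a standard argparse setup. Only the `eot_token` flag name and its `<|endoftext|>` default come from the PR description; the surrounding parser code, script structure, and help text are illustrative:

```python
import argparse

def parse_args():
    # Sketch of an SFT argument parser; only the --eot_token flag and its
    # default follow this PR, everything else here is illustrative.
    parser = argparse.ArgumentParser(description="SFT training (sketch)")
    parser.add_argument(
        "--eot_token",
        type=str,
        default="<|endoftext|>",
        help="End-of-conversation token; set it to match your pretrained model's tokenizer.",
    )
    return parser.parse_args()

if __name__ == "__main__":
    args = parse_args()
    # Replaces the previously hardcoded assignment:
    #   args.end_of_conversation_token = "<|endoftext|>"
    args.end_of_conversation_token = args.eot_token
```

With this in place, a model whose tokenizer ends turns with `</s>` could be trained by passing `--eot_token "</s>"` on the command line instead of editing the source.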

@Xingfu-Yi
Contributor Author

Hi @arashb, @duli2012, @awan-10, @eltonzheng,

I hope you're doing well. When you have a moment, could you kindly take a look at this PR? It has already received one approval, but it seems to be stuck and needs further reviews to move forward.

Thank you so much in advance for your time and help.

Best regards,
Yi

@loadams
Contributor

loadams commented Oct 29, 2024


Hi @Xingfu-Yi - we will work on getting this PR merged, sorry for the delay.

@loadams loadams merged commit eefb0ef into microsoft:master Oct 30, 2024
2 checks passed