Implementing exaone3.5 #1480

Open
wants to merge 19 commits into
base: main

Conversation

KareemMusleh
Contributor

This is my first attempt at implementing exaone in unsloth, as requested in this issue.

I didn't want to implement exaone as a separate model class because exaone follows the llama architecture, as discussed in this issue. For now, I am having problems with the from_pretrained function when passing config and state_dict. I've already opened an issue in transformers about it, and I'll try to solve that problem soon.

@KareemMusleh KareemMusleh changed the title from "Implementing exaone3" to "Implementing exaone3.5" Dec 27, 2024
Contributor

qingy1337 commented Dec 28, 2024

@KareemMusleh Thanks for the effort! I was just curious about the general process you go through to implement something like this. Is it different for every model, and if so, how?

@KareemMusleh KareemMusleh marked this pull request as draft December 28, 2024 11:48
@KareemMusleh
Contributor Author

@qingy1337 I was thinking about creating a writeup for this (assuming this gets accepted), as this is my first pull request to a large public repo.

But the gist of it is this:
I started out by googling how other people have integrated exaone into their codebases, and I found three possible approaches:

  1. Write the exaone class from scratch like they did in vLLM
  2. Upload the exaone models to huggingface as llama models, as was done by maywell
  3. Map exaone to llama, which is what I did, as was recommended in this issue

This is different from previous model integrations because exaone has the exact same architecture as llama, so it does not need a separate model class.
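A minimal sketch of what option 3 (mapping exaone to llama) can look like at the checkpoint level: renaming parameter keys so that a llama model class can load them. This is not the code from this PR. The `RENAME_RULES` patterns, the key names in them, and the helpers `rename_key` and `remap_state_dict` are all hypothetical and only illustrate the idea; the real key names must be checked against the actual checkpoint.

```python
import re

# Hypothetical exaone -> llama key patterns, for illustration only.
# Inspect the real checkpoint before relying on any of these names.
RENAME_RULES = [
    (r"^transformer\.wte\.", "model.embed_tokens."),
    (r"^transformer\.ln_f\.", "model.norm."),
    (r"^transformer\.h\.(\d+)\.attn\.attention\.", r"model.layers.\1.self_attn."),
    (r"^transformer\.h\.(\d+)\.ln_1\.", r"model.layers.\1.input_layernorm."),
]


def rename_key(key):
    """Return the llama-style name for one key, or the key unchanged."""
    for pattern, replacement in RENAME_RULES:
        new_key, count = re.subn(pattern, replacement, key)
        if count:
            return new_key
    return key


def remap_state_dict(state_dict):
    """Build a new dict with renamed keys; values are passed through as-is."""
    remapped = {}
    for key, value in state_dict.items():
        new_key = rename_key(key)
        if new_key in remapped:
            # Two source keys mapping to one target would silently drop a tensor.
            raise ValueError(f"duplicate target key: {new_key}")
        remapped[new_key] = value
    return remapped
```

Because only the keys change, the same function works on a dict of tensors or on plain placeholder values, which makes the mapping easy to check before loading any weights.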

@KareemMusleh
Contributor Author

For this PR to be ready, the all attention refactor fix PR should be merged first, because it uses the latest transformers.

@KareemMusleh KareemMusleh marked this pull request as ready for review January 8, 2025 06:29
@KareemMusleh
Contributor Author

Hey @danielhanchen, this seems ready to go!
I've tested it on this notebook and it seems fine.
I don't see a reason to test or integrate exaone3.0, as it seems to be just a worse version of exaone3.5.

@danielhanchen
Contributor

Appreciate it immensely - so sorry for the delay - I'll review this over the weekend - super great work and thanks! :)

3 participants