Models #1

Open
rbroc opened this issue May 17, 2023 · 5 comments

@rbroc
Owner

rbroc commented May 17, 2023

Looking at both foundation and instruction-tuned models. For this project, the latter will probably be the only target, as they are likely to work better.

Available

Maybe for later
Not open-source

  • GPT-4 (pricing: $0.03 / 1k tokens for prompts; $0.06 / 1k tokens for completions) - on hold, because an instruction-tuned version is not available
  • PaLM - on hold
  • BARD

Open-source

  • Cerebras GPT: https://huggingface.co/cerebras/Cerebras-GPT-6.7B
  • Blender for dialogue
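
For budgeting, the GPT-4 prices quoted above can be turned into a rough cost estimate. This is only a sketch: the function name and the workload numbers are made up for illustration, and the prices are the ones listed in this comment, which may have changed since.

```python
# GPT-4 prices as quoted in this issue (USD per 1k tokens); they may be outdated.
PROMPT_USD_PER_1K = 0.03
COMPLETION_USD_PER_1K = 0.06


def estimate_cost_usd(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the cost of a run from total prompt and completion token counts."""
    return (
        prompt_tokens / 1000 * PROMPT_USD_PER_1K
        + completion_tokens / 1000 * COMPLETION_USD_PER_1K
    )


# Hypothetical workload: 1,000 examples, ~500 prompt and ~200 completion tokens each.
n_examples = 1_000
print(f"{estimate_cost_usd(500 * n_examples, 200 * n_examples):.2f} USD")
```

Under those assumed numbers, the prompt side comes to about 15 USD and the completion side to about 12 USD, so completions account for a large share of the bill even though they are much shorter.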
@rbroc
Owner Author

rbroc commented Mar 15, 2024

see #51: at the end of the whole process, we might want to:

  • update this with the models we are actually using
  • reconsider our current choices (e.g., is using LLaMaChat & Mistral Instruct fair? do we want to include more models?) on the basis of an updated picture of the LLM landscape.

This should be done at the end of the project though, not before - too many new models all the time!

@MinaAlmasi
Collaborator

MinaAlmasi commented Oct 14, 2024

Models are sort of "out of date" by now, so we should probably consider new ones. ATM:

Updating LLMs (Mina's scribbles):

Llama3

  • Apparently the 1b llama3 is better than llama2 chat 13b on some tasks?

Maybe stabilityai/stablelm-2-12b-chat? (Since we are currently using stabilityai/beluga7b.)

  • Stable Beluga 7b was a fine-tune of Llama, whereas StableLM seems to be an entirely new model
  • Seems to perform worse than Gemma but better than Llama 2 and Mistral 7b (see link for openLLM leaderboard)

We'll consult Kenneth when we are closer to having a polished pipeline

@rbroc
Owner Author

rbroc commented Oct 23, 2024

Some input for this:

  • We need to have a couple of versions of Llama 2 and Llama 3
  • Some Mistral models
  • Zephyr? https://huggingface.co/HuggingFaceH4/zephyr-7b-beta
  • Some Qwen model

We can define the exact versions once we rerun the whole pipeline. @rdkm89, if you have input on any class of open-source models that should be included, please do chime in.

@rbroc rbroc mentioned this issue Oct 23, 2024
5 tasks
@rdkm89
Collaborator

rdkm89 commented Oct 24, 2024

> Some input for this:
>
> * We need to have a couple of versions of LlaMa2 and Llama3
> * Some Mistral models
> * Zephyr? https://huggingface.co/HuggingFaceH4/zephyr-7b-beta
> * Some Qwen model
>
> We can define the exact versions right once we rerun the whole pipeline. @rdkm89, if you have input on any class of open-source models that should be included please do chime in.

I'm not sure that Llama 2 is relevant anymore, I'd probably go for at least 3.1 but preferably 3.2. Likewise, I think that Zephyr is a bit of a dead end.

I think my vote (right now, anyway) would be Llama 3.2, Mistral, Qwen 2, and Gemma 2.
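
If it helps when rerunning the pipeline, the vote above could be jotted down as a small shortlist structure. This is just a sketch: `Candidate`, `SHORTLIST`, and `undecided` are hypothetical names, and the `checkpoint` field is deliberately left empty because the exact versions have not been decided in this thread.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Candidate:
    family: str                        # model family named in this thread
    checkpoint: Optional[str] = None   # hypothetical field; exact version not yet chosen


# Families from the vote above; checkpoints are intentionally unset.
SHORTLIST = [
    Candidate("Llama 3.2"),
    Candidate("Mistral"),
    Candidate("Qwen 2"),
    Candidate("Gemma 2"),
]


def undecided(candidates):
    """Return the families that still need an exact checkpoint."""
    return [c.family for c in candidates if c.checkpoint is None]


print(undecided(SHORTLIST))
```

Keeping the family and the checkpoint separate makes it easy to see which choices are still open once specific versions start getting filled in.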

@rbroc
Owner Author

rbroc commented Oct 24, 2024

Awesome, let's run with that unless anything mindblowing is released in the meantime.

3 participants