[New Feature] Is Mixtral supported? #879

Open
markusdr opened this issue Jul 8, 2024 · 1 comment

Comments


markusdr commented Jul 8, 2024

Can you confirm whether Mixtral is currently supported, e.g., mistralai/Mixtral-8x7B-Instruct-v0.1? I saw in another issue that Mistral is supported, but I'm not sure about Mixtral-8x7B since it uses a different (mixture-of-experts) architecture.
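
For reference, the architectural difference is visible directly in the model's Hugging Face config. Below is a minimal check using the transformers library (assuming it is installed and the repo is accessible); it is only an illustration, not part of LMFlow:

```python
# Quick sanity check of the Mixtral-8x7B architecture via its HF config.
# Assumes `transformers` is installed and the model repo is accessible.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("mistralai/Mixtral-8x7B-Instruct-v0.1")
print(config.model_type)           # "mixtral" (plain Mistral-7B reports "mistral")
print(config.architectures)        # ["MixtralForCausalLM"]
print(config.num_local_experts)    # 8 experts per MoE layer
print(config.num_experts_per_tok)  # 2 experts routed per token
```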


research4pan (Contributor) commented Jul 9, 2024

Thanks for your interest in LMFlow! We have tested Mixtral-8x7B on servers with 8x A40 (48 GB) GPUs, so dense training of Mixtral-8x7B is currently supported in LMFlow. Sparse training is still under implementation; we will add it to our roadmap and schedule the work soon. Multi-node training (https://github.com/OptimalScale/LMFlow/blob/main/readme/multi_node.md) can be used for larger models such as Mixtral-8x22B, but we haven't yet tested models that large.

Hope this information can be helpful 😄
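
As a rough illustration of the memory footprint dense training implies (this is only a loading sketch with transformers, not LMFlow's actual training entry point; fine-tuning itself should go through LMFlow's finetune scripts), the full model with all experts can be sharded across the visible GPUs like this:

```python
# Minimal sketch: load Mixtral-8x7B densely (all experts resident in memory)
# across several GPUs. Illustration only; use LMFlow's finetune scripts for training.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "mistralai/Mixtral-8x7B-Instruct-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,  # ~47B parameters -> roughly 90+ GB of weights in bf16
    device_map="auto",           # shard layers across all visible GPUs (e.g., 8x A40 48 GB)
)
```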
