Is your feature request related to a problem? Please describe.
The current 72B dense model is not practical for most deployments: at bf16 the weights alone take roughly 144 GB, so it needs several high-end GPUs, and every generated token activates all 72B parameters, which makes inference expensive for long agentic rollouts.
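For a rough sense of the gap, here is a back-of-envelope sketch (assuming bf16 weights and the publicly stated parameter counts; these are approximations, not measured serving numbers):

```python
# Back-of-envelope comparison of serving a 72B dense model vs an MoE
# like Qwen3-Next-80B-A3B. "active" means parameters used per token.
# Assumes bf16 (2 bytes per parameter) and ignores KV cache / activations.

BYTES_PER_PARAM = 2  # bf16

models = {
    "MiroThinker-72B (dense)":  {"total_b": 72, "active_b": 72},
    "Qwen3-Next-80B-A3B (MoE)": {"total_b": 80, "active_b": 3},
}

for name, p in models.items():
    weight_gb = p["total_b"] * 1e9 * BYTES_PER_PARAM / 1e9
    print(f"{name}: ~{weight_gb:.0f} GB weights, "
          f"~{p['active_b']}B params active per token")

# MiroThinker-72B (dense):  ~144 GB weights, ~72B params active per token
# Qwen3-Next-80B-A3B (MoE): ~160 GB weights, ~3B params active per token
#
# Total weight memory is similar, but per-token compute drops by roughly
# 24x, which is what makes the MoE much cheaper to serve.
```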
Describe the solution you'd like
It would be great if the MiroThinker training data and recipe could also be applied to an MoE base model such as Qwen3-Next-80B-A3B, which activates only ~3B parameters per token and would be far cheaper to deploy.
Additional context
Thank you for MiroFlow and MiroThinker, great work all around. This is purely a request/suggestion, not a complaint.