
[OpenVINO] LFM2-MoE support for transformers v4. #1609

Open
popovaan wants to merge 16 commits into huggingface:main from popovaan:lfm2_moe

Conversation

Collaborator

@popovaan popovaan commented Feb 10, 2026

What does this PR do?

Added LFM2-MoE support for transformers v4.

Before submitting

  • [N/A] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • [N/A] Did you make sure to update the documentation with your changes?
  • [N/A as the model is private.] Did you write any new necessary tests?

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

hidden_states_expanded = hidden_states_expanded.view(num_experts, -1, hidden_dim) # (num_experts, num_tokens, hidden_dim)

# Stack expert parameters
w1_stacked = torch.stack([e.w1.weight.T for e in self.experts])
Collaborator

You also transpose the weight here, right? That may not be needed.

Collaborator Author

No, without it the shapes don't match.
I see that in Afmoe these weights are transposed as well, but concatenated beforehand. I can implement it the same way: https://github.com/huggingface/optimum-intel/blob/2c48d6430c265ac259c1b264f3e2c4025cdd7b76/optimum/exporters/openvino/model_patcher.py#L7604C16-L7611C18

Collaborator

Got it

Collaborator

@popovaan, btw, can we re-use existing MoE patching for this model?

Collaborator Author

The closest existing patching to this version of MoE is the Qwen3 MoE block, but it has a slightly different preprocessing of routing_weights (softmax is applied instead of dividing by the sum), so I suppose we can't reuse it as is.
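The difference between the two normalization schemes can be sketched as follows. This is an illustrative comparison only, with made-up logits; the exact top-k ordering and selection details of the real model code are not reproduced here.

```python
import torch

# Hypothetical router logits for one token over 4 experts
router_logits = torch.tensor([[2.0, 1.0, 0.5, -1.0]])
top_k = 2

# Qwen3-MoE-style: softmax over the selected scores
probs = torch.softmax(router_logits, dim=-1)
qwen_weights, _ = torch.topk(probs, top_k, dim=-1)

# Divide-by-sum style: normalize top-k scores by their sum
topk_scores, _ = torch.topk(router_logits, top_k, dim=-1)
sum_weights = topk_scores / topk_scores.sum(dim=-1, keepdim=True)

# The two schemes generally produce different mixing weights,
# so the Qwen3 patch cannot be reused verbatim.
assert not torch.allclose(qwen_weights / qwen_weights.sum(-1, keepdim=True), sum_weights)
```

The divide-by-sum variant weights experts linearly in their scores, while softmax weights them exponentially, so the two only coincide in degenerate cases.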

@popovaan popovaan changed the title Draft LFM2-MoE support. [OpenVINO] LFM2-MoE support for transformers v4. Feb 18, 2026
@popovaan popovaan marked this pull request as ready for review February 18, 2026 11:32
@popovaan popovaan requested a review from rkazants February 18, 2026 11:32
