
Conversation

@NicoGrande
Collaborator

Description

This PR introduces the MaxTextForCausalLM interface, a general wrapper for all MaxText models that makes them discoverable and invocable by vLLM. Additionally, this PR makes changes to model_creation_utils.py to allow NNX models to be initialized with parameters passed in by vLLM.
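To illustrate the second change, here is a minimal, self-contained toy sketch of the idea: a causal-LM wrapper whose parameters can either be initialized internally or injected by the serving engine. All names here are illustrative stand-ins (plain dicts in place of NNX modules and vLLM-provided weights), not the PR's actual API.

```python
# Toy sketch: parameters may be supplied externally (e.g. by vLLM) instead
# of being initialized inside the model. Names are hypothetical.
from typing import Optional


class ToyCausalLMWrapper:
    def __init__(self, vocab_size: int, params: Optional[dict] = None):
        self.vocab_size = vocab_size
        # If the engine passes weights in, use them; otherwise fall back to
        # internal initialization, mirroring the model_creation_utils.py
        # change described above.
        self.params = params if params is not None else self._init_params()

    def _init_params(self) -> dict:
        # Stand-in for real parameter initialization.
        return {"embedding": [0.0] * self.vocab_size}


# Engine-agnostic path: model initializes its own parameters.
internal = ToyCausalLMWrapper(vocab_size=4)
# Serving path: engine-provided parameters are injected.
injected = ToyCausalLMWrapper(vocab_size=4, params={"embedding": [1.0] * 4})
```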

This PR also includes the packaging code that enables the MaxTextForCausalLM architecture to be registered with vLLM-TPU as a plugin.

Tests

MaxTextForCausalLM can be registered with vLLM by running the following command in the same environment as vLLM:

cd src/MaxText/integrations/vllm && pip install .

Checklist

Before submitting this PR, please make sure (put X in square brackets):

  • I have performed a self-review of my code. For an optional AI review, add the gemini-review label.
  • I have necessary comments in my code, particularly in hard-to-understand areas.
  • I have run end-to-end tests and provided workload links above if applicable.
  • I have made or will make corresponding changes to the doc if needed, including adding new documentation pages to the relevant Table of Contents (toctree directive) as explained in our documentation.

@NicoGrande force-pushed the nicogrande/maxtext-for-causal-lm branch from e1fe512 to 7330a5e on November 6, 2025 05:40
adding vllm.yml

updating valid attention kernels
@NicoGrande force-pushed the nicogrande/maxtext-for-causal-lm branch from 7330a5e to 53225de on November 6, 2025 06:04
