
@xwu-intel
Contributor

DeepSeek R1 needs defragmentation enabled together with contiguous PA. This PR adds an environment variable to enable defrag.

Signed-off-by: Xiaochang Wu <xiaochang.wu@intel.com>
Contributor

Copilot AI left a comment


Pull Request Overview

This PR adds support for enabling defragmentation in vLLM for Gaudi without requiring unified attention to be enabled, specifically to support DeepSeek R1. The change introduces a VLLM_DEFRAG environment variable that allows users to override the default behavior.

Key changes:

  • Added VLLM_DEFRAG environment variable to control defragmentation feature independently
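The override described above can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: the function name `defrag_enabled`, the `unified_attn_enabled` parameter, and the set of accepted truthy strings are all assumptions. Only the `VLLM_DEFRAG` variable name comes from the PR.

```python
import os


def defrag_enabled(unified_attn_enabled: bool, env=None) -> bool:
    """Hypothetical helper: decide whether defragmentation is on.

    Assumption: when VLLM_DEFRAG is unset, defrag follows the
    unified-attention setting; when set, it overrides that default.
    """
    env = os.environ if env is None else env
    raw = env.get("VLLM_DEFRAG")
    if raw is None:
        # No override given: keep the default tied to unified attention.
        return unified_attn_enabled
    # Assumed truthy spellings; the real parser may accept different values.
    return raw.strip().lower() in ("1", "true", "yes", "on")
```

Under these assumptions, setting `VLLM_DEFRAG=1` would turn defrag on even when unified attention is off, which is the DeepSeek R1 case the PR targets.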


@github-actions

✅ CI Passed

All checks passed successfully against the following vLLM commit:
4d01b6428448225807e6605d04e37e29fe729b44

@xuechendi
Collaborator

@adobrzyn @michalkuligowski, could you help review this PR?

@xuechendi
Collaborator

@xwu-intel, can you give more context? Will this have any impact on non-MLA attention?
