younesbelkada (Contributor) commented Mar 19, 2025

What does this PR do?

Fixes #503

As suggested in #503 (comment), this PR casts all pointers to tl.int64. This fixes the CUDA illegal memory access error that occurs with large batch sizes and prefill sizes.
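
For context, here is a minimal sketch of why 32-bit offset arithmetic can fail at these sizes. It is plain Python, not the kernel code from this PR: the shapes and the helper names `offset_int32` and `offset_int64` are hypothetical, and it assumes the offset is computed as `index * stride` in a signed 32-bit integer. In a Triton kernel, the analogous change would be widening the index (for example with `.to(tl.int64)`) before multiplying by the stride.

```python
import ctypes


def offset_int32(index: int, stride: int) -> int:
    """Compute index * stride as a signed 32-bit integer, wrapping on overflow."""
    return ctypes.c_int32(index * stride).value


def offset_int64(index: int, stride: int) -> int:
    """Compute index * stride as a signed 64-bit integer."""
    return ctypes.c_int64(index * stride).value


# Hypothetical shapes, chosen only to illustrate the overflow.
batch, seqlen, nheads, headdim = 64, 16384, 64, 64
stride_batch = seqlen * nheads * headdim  # elements per batch entry (2**26)
last_batch = batch - 1

wrapped = offset_int32(last_batch, stride_batch)
correct = offset_int64(last_batch, stride_batch)

print(f"int32 offset: {wrapped}")  # negative: points outside the buffer
print(f"int64 offset: {correct}")  # the intended element offset
```

With these assumed shapes the true offset exceeds 2**31 - 1, so the 32-bit result wraps to a negative value, which is the kind of out-of-bounds address that would surface as an illegal memory access.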

@tridao

younesbelkada changed the title from "fix: fix large-bs issue" to "fix: fix large batch size & prefill size issue" on Mar 19, 2025
younesbelkada and others added 3 commits March 19, 2025 12:22
Co-authored-by: LuJunru <LuJunruuser.noreply.github.com>
Co-authored-by: LuJunru <LuJunru@user.noreply.github.com>
Co-authored-by: LuJunru <LuJunru@users.noreply.github.com>
@The-Obstacle-Is-The-Way

This should be merged as soon as possible.

Development

Successfully merging this pull request may close these issues.

CUDA error when using Mamba2 with long context

2 participants