
Support for CUDA kernels #70

Closed
Andrei-Aksionov opened this issue Mar 25, 2024 · 9 comments
Labels: enhancement (New feature or request), help wanted (Extra attention is needed)

Comments

@Andrei-Aksionov

🚀 Feature

Hi there 👋

From the main readme file I noticed that Thunder accepts custom kernels, but only ones written in Triton.
Is there a plan to support CUDA kernels?

Motivation

I'm only at the beginning of my custom-kernels journey, so I might be misunderstanding something.

From what I saw online, many highly optimized CUDA kernels are already available (since CUDA has been around for quite a while). Plus, there is a high chance that someone with a lot of experience writing CUDA kernels (but not Triton) wants to use Thunder (or even integrate it into an existing project).

I personally would like to write custom CUDA kernels for the LitGPT repo after I finish reading the PMPP book.

@Andrei-Aksionov added the enhancement and help wanted labels Mar 25, 2024
@IvanYashchuk
Collaborator

Hello Andrei,

Thunder can work with any custom kernel, not just the ones written in Triton. Any function that accepts and returns PyTorch tensors can be registered to work with Thunder. Here's a tutorial on connecting CUDA kernels with the PyTorch interface: https://pytorch.org/tutorials/advanced/cpp_extension.html
Once registered in PyTorch, these CUDA extensions can be registered in Thunder with OperatorExecutor.register_implementation.

We have one example executor using cross_entropy CUDA kernel from the Apex project:
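To illustrate the idea behind that registration flow, here is a toy, dependency-free sketch of the pattern: an executor maps an operator name to a custom implementation plus a "checker" predicate that decides when the implementation applies. Note this is a hypothetical stand-in for illustration only, not Thunder's actual OperatorExecutor API; see the Apex example above for real usage.

```python
# Toy sketch of the executor-registration pattern described above.
# NOT Thunder's actual API: this minimal OperatorExecutor stand-in only
# shows the idea of mapping an operator name to a custom implementation
# guarded by a "checker" predicate.

class OperatorExecutor:
    def __init__(self, name):
        self.name = name
        self._impls = {}  # op name -> (checker, fn)

    def register_implementation(self, op_name, fn, checker=lambda *a: True):
        """Register `fn` as the implementation of `op_name`.

        `checker` decides, per call, whether this implementation applies
        (in a real system this is where you would test dtype/device/shape).
        """
        self._impls[op_name] = (checker, fn)

    def call(self, op_name, fallback, *args):
        checker, fn = self._impls.get(op_name, (None, None))
        if fn is not None and checker(*args):
            return fn(*args)        # custom "kernel"
        return fallback(*args)      # reference implementation


# Usage: a "custom kernel" for elementwise add, applied only when the
# checker accepts the inputs, with a plain-Python fallback otherwise.
ex = OperatorExecutor("my_cuda_ex")

def fast_add(a, b):
    return [x + y for x, y in zip(a, b)]

ex.register_implementation("add", fast_add,
                           checker=lambda a, b: len(a) == len(b))

print(ex.call("add", lambda a, b: None, [1.0, 2.0], [3.0, 4.0]))  # → [4.0, 6.0]
```

In Thunder the same shape applies, except the registered function works on PyTorch tensors and the checker inspects the traced operation's inputs.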

@Andrei-Aksionov
Author

Hey Ivan,

Thanks for the answer.

Any function that accepts and returns PyTorch tensors can be registered to work with Thunder.

Sounds promising.

Maybe the readme file should reflect this too; what do you think?

...

- TransformerEngine
- PyTorch eager
- custom kernels, including those written with OpenAI Triton and Nvidia CUDA

In fact, any function that accepts and returns PyTorch tensors can be registered to work with Thunder, which makes it compatible with any custom kernel.

...

?

@lantiga
Collaborator

lantiga commented Mar 25, 2024

@IvanYashchuk I was wondering whether it would make sense to have a PyCUDA option as well. This would let us be more decoupled from PyTorch's extension mechanism (and the C++ ABI).

@IvanYashchuk
Collaborator

PyTorch supports the CUDA Array Interface, so any project that implements this interface can accept and write to PyTorch tensors, including PyCUDA (https://documen.tician.de/pycuda/tutorial.html#interoperability-with-other-libraries-using-the-cuda-array-interface), Numba (https://numba.readthedocs.io/en/stable/cuda/kernels.html), CuPy (https://docs.cupy.dev/en/stable/user_guide/kernel.html), and others.
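For context, the protocol behind this interop is just a dict exposed under the `__cuda_array_interface__` attribute. Here is a sketch of both sides of the exchange; the device pointer below is fake (0) since this runs without a GPU, whereas on a real GPU tensor the dict (e.g. from `tensor.__cuda_array_interface__`) carries an actual device pointer that consumers like Numba or CuPy wrap zero-copy.

```python
# Sketch of the CUDA Array Interface (version 3) that enables zero-copy
# sharing of device memory between libraries. The pointer is fake here;
# only the shape of the protocol dict is being demonstrated.

class FakeDeviceArray:
    """Minimal producer side of the CUDA Array Interface."""

    def __init__(self, shape, typestr, ptr):
        self._shape = shape
        self._typestr = typestr  # NumPy-style type string, e.g. '<f4' = little-endian float32
        self._ptr = ptr          # would be a real device pointer on a GPU

    @property
    def __cuda_array_interface__(self):
        return {
            "shape": self._shape,        # tuple of ints
            "typestr": self._typestr,    # element dtype
            "data": (self._ptr, False),  # (device pointer, read_only flag)
            "strides": None,             # None means C-contiguous
            "version": 3,
        }


def describe(obj):
    """Consumer side: read the interface dict as Numba/CuPy would."""
    iface = obj.__cuda_array_interface__
    ptr, read_only = iface["data"]
    return iface["shape"], iface["typestr"], ptr, read_only


arr = FakeDeviceArray((2, 3), "<f4", ptr=0)
print(describe(arr))  # → ((2, 3), '<f4', 0, False)
```

Because the protocol is attribute-based, a library never needs to link against PyTorch to consume its tensors; it only reads this dict.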

@lantiga
Collaborator

lantiga commented Mar 25, 2024

That's right! I think a tutorial that shows how to add a CUDA kernel using the CUDA Array Interface, without necessarily having to build a PyTorch extension, would be great /cc @t-vi

@t-vi
Collaborator

t-vi commented Mar 25, 2024

Yeah, if anyone has suggestions for a great CUDA kernel, I'll take them, or I'll ask the people on CUDA MODE...

@t-vi t-vi self-assigned this Apr 1, 2024
@t-vi
Collaborator

t-vi commented Apr 1, 2024

I made the demo for this week's CUDA MODE lecture with cuda-python, and it seemed to work well enough that I'll make it into a Thunder example.

@Andrei-Aksionov
Author

Are you talking about the Flash Attention lecture (I haven't seen it yet)?
If so, I think it would be a cool (and somewhat flashy) example.

@t-vi
Collaborator

t-vi commented Jun 25, 2024

We have https://github.com/Lightning-AI/lightning-thunder/blob/main/notebooks/extend_thunder_with_cuda_python.ipynb now, so I'm closing this.

@t-vi t-vi closed this as completed Jun 25, 2024