CUDAGraphs in Thunder #981

Open
t-vi opened this issue Aug 16, 2024 · 3 comments
Labels
cudagraphs, design (This is a largish feature / design), enhancement (New feature or request), thunderfx (for things that could be applicable to the dynamo+thunder frontend)

Comments

@t-vi
Collaborator

t-vi commented Aug 16, 2024

PR #977 proposes to make CUDAGraphs a Transform, with the cudagraphs being created in the post_optimization phase of creating the executable. In particular, this lets people subclass or copy the CUDAGraphs transform to build their own bespoke variant.

We do have some ideas (and this is just a list off the top of my head to get things started), but this issue is to collect discussion and ideas about what you want CUDAGraphs to look like.

static input detection

Currently we auto-mark parameters as static, but we might be able to do more, e.g. buffers.
If we knew more about what happens between fusions, we might know whether we can re-use input buffers.
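
To make the distinction concrete, here is a minimal, framework-free sketch of what static input detection decides: inputs assumed to keep a stable memory location (e.g. parameters) can be used in place on replay, while everything else has to be copied into graph-owned buffers first. `FakeTensor`, `static_input_mask`, and `plan_replay` are hypothetical names for illustration only, not Thunder APIs.

```python
from dataclasses import dataclass


@dataclass
class FakeTensor:
    """Stand-in for a tensor: a name plus flags describing its origin."""
    name: str
    is_param: bool = False   # assumption: parameters are auto-marked static
    is_buffer: bool = False  # assumption: buffers could be marked static too


def static_input_mask(inputs, include_buffers=False):
    """Return True for inputs assumed to keep a stable address across calls."""
    return [t.is_param or (include_buffers and t.is_buffer) for t in inputs]


def plan_replay(inputs, mask):
    """Split inputs into those used in place and those copied before replay."""
    in_place = [t.name for t, static in zip(inputs, mask) if static]
    copied = [t.name for t, static in zip(inputs, mask) if not static]
    return in_place, copied


inputs = [
    FakeTensor("weight", is_param=True),
    FakeTensor("running_mean", is_buffer=True),
    FakeTensor("x"),
]

params_only = plan_replay(inputs, static_input_mask(inputs))
with_buffers = plan_replay(inputs, static_input_mask(inputs, include_buffers=True))
print(params_only)
print(with_buffers)
```

Marking buffers as static removes one copy per replay in this toy example, which is the kind of saving the idea above is after.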

caching

In particular we would not want to use a global cache, but a per-transform one, to let things go out of scope.
Also we want to enable users to potentially override the caching (we found that it can be interesting to re-use cudagraph fusions in case we have some non-graphable bit in a repeated block of a model).
A step here is PR #1001
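
A rough sketch of what a per-transform, overridable cache could look like, assuming the cache key is built from a fusion identifier plus argument descriptors. `GraphCache`, `SharedBlockCache`, `make_key`, and `signature_of` are hypothetical names, not what PR #1001 implements. Sharing one captured graph between repeated blocks would also require that block-specific tensors are passed as copied (non-static) inputs; the sketch does not model that.

```python
class GraphCache:
    """Cache owned by one transform instance, so it is freed with the transform."""

    def __init__(self):
        self._entries = {}
        self.captures = 0

    def make_key(self, fusion_name, arg_descriptors):
        # Default policy: one captured graph per fusion and argument layout.
        return (fusion_name, tuple(arg_descriptors))

    def get_or_capture(self, fusion_name, arg_descriptors, capture_fn):
        key = self.make_key(fusion_name, arg_descriptors)
        if key not in self._entries:
            self._entries[key] = capture_fn()
            self.captures += 1
        return self._entries[key]


class SharedBlockCache(GraphCache):
    """Override: fusions with the same structural signature share one graph."""

    def __init__(self, signature_of):
        super().__init__()
        self._signature_of = signature_of  # hypothetical: fusion name -> signature

    def make_key(self, fusion_name, arg_descriptors):
        return (self._signature_of[fusion_name], tuple(arg_descriptors))


# Two fusions coming from two instances of the same repeated block.
signature_of = {"block0_fusion": "mlp", "block1_fusion": "mlp"}
descr = [("float32", (4, 8), (8, 1))]  # (dtype, shape, stride) of one argument

default_cache = GraphCache()
shared_cache = SharedBlockCache(signature_of)
for name in signature_of:
    default_cache.get_or_capture(name, descr, capture_fn=lambda: object())
    shared_cache.get_or_capture(name, descr, capture_fn=lambda: object())

print(default_cache.captures, shared_cache.captures)
```

Because each cache lives on its owning object rather than in a module-level dict, dropping the transform drops the captured graphs with it.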

operator selection for what is cuda-graphable

I think it would be cool if symbols in general had a tag indicating whether they are suitable for CUDAGraph inclusion.
After #977 we have the advantage that if people have particular ideas about this, they can just subclass and override can_fuse.
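
As an illustration of how tagging and a `can_fuse` override could interact, here is a toy partitioner that does not use Thunder at all. Only the method name `can_fuse` comes from the discussion above; `Sym`, `GRAPH_SAFE`, `GraphPartitioner`, and the name-based communication check are assumptions made up for the sketch.

```python
from dataclasses import dataclass

GRAPH_SAFE = "cudagraph_safe"  # hypothetical tag name


@dataclass(frozen=True)
class Sym:
    """Toy stand-in for a bound symbol in a trace."""
    name: str
    tags: frozenset = frozenset()


class GraphPartitioner:
    def can_fuse(self, sym):
        # Default policy: trust the tag on the symbol.
        return GRAPH_SAFE in sym.tags

    def partition(self, trace):
        """Group consecutive fusible symbols; everything else runs eagerly."""
        regions, current = [], []
        for sym in trace:
            if self.can_fuse(sym):
                current.append(sym.name)
            else:
                if current:
                    regions.append(("graph", current))
                    current = []
                regions.append(("eager", [sym.name]))
        if current:
            regions.append(("graph", current))
        return regions


class NoCommsPartitioner(GraphPartitioner):
    """Bespoke policy: additionally keep communication ops out of graphs."""

    def can_fuse(self, sym):
        return super().can_fuse(sym) and not sym.name.startswith("all_")


safe = frozenset({GRAPH_SAFE})
trace = [Sym("matmul", safe), Sym("all_gather", safe), Sym("gelu", safe), Sym("print")]

default_regions = GraphPartitioner().partition(trace)
custom_regions = NoCommsPartitioner().partition(trace)
print(default_regions)
print(custom_regions)
```

The subclass splits the single graph region around `all_gather`, which is the kind of per-user policy that overriding `can_fuse` is meant to allow.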

check composability

Test e.g. FSDP + CUDAGraphs.
I'm guessing the communication prims don't mix well with cudagraphs? But that might be handled by operator tagging...

(not immediate) future: memory information about ops

CUDAGraphs could benefit enormously from more information about the memory effects of operators (so similar to meta info but about what allocations a function makes, where the output is allocated etc.), but that is in the future.

Obviously other people (@nikitaved @IvanYashchuk @tfogal and more) will be much more knowledgeable, but here is a start.

t-vi added the enhancement (New feature or request), cudagraphs, and design (This is a largish feature / design) labels Aug 16, 2024
@mruberry
Collaborator

triage review:

  • let's set aside to talk about this more
  • how do checks in the prologue extend to handle strides?

@t-vi
Collaborator Author

t-vi commented Aug 20, 2024

We will have a discussion on this tomorrow (Aug 21) in the Thunder Tech sessions.
Please ping me if you want an invite.

> how do checks in the prologue extend to handle strides?

For now, the strides are checked in the CUDAGraph caching itself (per the arg descriptor). In fact, they are "over-checked" in the sense that we also check them for those args that are copied and so do not matter...
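
A small sketch of the "over-checking" point, under the assumption that the cache key is a tuple of per-argument descriptors. `Arg`, `strict_descriptor`, and `relaxed_descriptor` are hypothetical names, and whether the data pointer is part of the real key is also an assumption here.

```python
from typing import NamedTuple, Tuple


class Arg(NamedTuple):
    shape: Tuple[int, ...]
    stride: Tuple[int, ...]
    dtype: str
    data_ptr: int


def strict_descriptor(arg, is_static):
    """Strides are part of the key for every argument, static or copied."""
    ptr = arg.data_ptr if is_static else None
    return (arg.shape, arg.stride, arg.dtype, ptr)


def relaxed_descriptor(arg, is_static):
    """Alternative: ignore strides for args that get copied into a graph-owned buffer."""
    if is_static:
        return (arg.shape, arg.stride, arg.dtype, arg.data_ptr)
    return (arg.shape, None, arg.dtype, None)


# Same logical shape, different memory layout (e.g. a transposed view).
contiguous = Arg((2, 3), (3, 1), "float32", 0x1000)
transposed = Arg((2, 3), (1, 2), "float32", 0x2000)

strict_hit = strict_descriptor(contiguous, False) == strict_descriptor(transposed, False)
relaxed_hit = relaxed_descriptor(contiguous, False) == relaxed_descriptor(transposed, False)
static_hit = relaxed_descriptor(contiguous, True) == relaxed_descriptor(transposed, True)
print(strict_hit, relaxed_hit, static_hit)
```

With the strict key, a copied input that merely changes layout triggers a new capture; the relaxed key treats it as a cache hit while still distinguishing static inputs by stride and address.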

tfogal added the thunderfx (for things that could be applicable to the dynamo+thunder frontend) label Aug 23, 2024
@t-vi
Collaborator Author

t-vi commented Aug 26, 2024

Outcomes of the design discussion: (https://docs.google.com/presentation/d/1JhmJqikAneYDv4S-g9rcAOS9xy35FztydE-PzxFhi3w/edit?usp=sharing)

  • tag proxies as static memory location
  • tag operators as safe or not
  • bubble up constraints like strides? (maybe the dynamic shape work will show us the way) or update automagically (recently in cudagraphs)


3 participants