
Attach cached cudagraph callable to the transform #1001

Merged · 8 commits · Aug 20, 2024
Conversation

@t-vi (Collaborator) commented Aug 20, 2024

This introduces a CUDAGraphRunner class that bundles building and caching cudagraphs, and attaches it to the CUDAGraphTransform introduced in #977.

(Design issue #981 )

From my POV this is ready for review, but it builds on #977, so I'm labeling it draft.
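The build-and-cache pattern such a runner bundles can be shown in a minimal, torch-free sketch (class and method names here are illustrative stand-ins, not Thunder's actual API):

```python
from typing import Any, Callable


class GraphRunnerSketch:
    """Caches a 'captured' callable per region name, building it on first use."""

    def __init__(self) -> None:
        self.cache: dict[str, Callable[..., Any]] = {}

    def build(self, name: str, fn: Callable[..., Any]) -> Callable[..., Any]:
        # Stand-in for capturing a CUDA graph: wrap fn once and reuse the wrapper.
        def replay(*args: Any) -> Any:
            return fn(*args)

        return replay

    def call(self, name: str, fn: Callable[..., Any], *args: Any) -> Any:
        if name not in self.cache:
            self.cache[name] = self.build(name, fn)  # build on cache miss
        return self.cache[name](*args)  # replay the cached callable afterwards


runner = GraphRunnerSketch()
out = runner.call("region_0", lambda a, b: a + b, 2, 3)
```

The real runner additionally handles static input buffers and graph replay on the GPU; the sketch only shows the per-region caching.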

@t-vi (Collaborator, Author) commented Aug 20, 2024

@nikitaved @IvanYashchuk

@t-vi t-vi mentioned this pull request Aug 20, 2024
@IvanYashchuk (Collaborator) left a comment

The code is organized a bit differently now. Looking forward to where it will take us!

thunder/transforms/cudagraph.py (outdated; resolved)
Comment on lines +160 to +164
self.python_callables[x_fn_name] = (
    self.make_python_callable_from_symbols(fn_name, bsyms, inputs, outputs),
    static_inputs_mask,
)
self.trace_symbols[x_fn_name] = (bsyms, inputs, outputs)
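The two caches in the quoted lines can be mirrored in a small, hypothetical stand-in (the names and tuple shapes follow the snippet; the values below are dummies):

```python
# Hypothetical stand-ins for the two dictionaries kept on the transform.
python_callables = {}
trace_symbols = {}


def register_region(name, callable_, static_inputs_mask, bsyms, inputs, outputs):
    # Mirrors the quoted code: the callable plus its static-input mask go in
    # one dict, the trace-level symbols in the other, keyed by the same name.
    python_callables[name] = (callable_, static_inputs_mask)
    trace_symbols[name] = (bsyms, inputs, outputs)


register_region("CUDAGraph0_fwd", lambda x: x, (True, False), ["bsym"], ["in"], ["out"])
```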
Collaborator:

How can I access these dictionaries in the regular user script? How should developers get to know that this information is saved and available for inspection?

@t-vi (Collaborator, Author):

Currently, you have to look at the source. I'm not yet sure this is the exact information we want to keep; we should refine it.

@t-vi (Collaborator, Author):

In addition to everything else, the x_fn_name is a bit awkward; the trick is that the transform is applied to both forward and backward. If we are OK with cudagraph region numbering not starting from 1 in the backward, I would just generate the name in the runner and return it.
That would make it easy to link from the trace to the cache name here. Is that what you have in mind?

Comment on lines +52 to +54
def build_cuda_graph(
    self, fn: Callable, args: list[Any], static_args_mask: tuple[bool, ...]
) -> tuple[torch.cuda.CUDAGraph, Sequence[torch.Tensor | Any], Sequence[torch.Tensor | Any]]:
Collaborator:

What is the benefit of having this as a method and not as a free function?

@t-vi (Collaborator, Author) commented Aug 20, 2024:

My thinking was that the ability to override the buffer allocation (get_static_buffer) could be useful. We could have it as a callback instead, but then it might just as well be a method.
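The method-over-free-function argument, in a small hypothetical sketch: a subclass can override the buffer-allocation hook without touching the build logic (names here are illustrative; the real `get_static_buffer` allocates CUDA tensors).

```python
class RunnerBase:
    """Minimal stand-in for a runner whose buffer allocation is an overridable method."""

    def get_static_buffer(self, value):
        # Default hook: copy the value into a fresh "static" slot (a list here).
        return [value]

    def build(self, args):
        # Build logic stays fixed; only the allocation hook varies by subclass.
        return [self.get_static_buffer(a) for a in args]


class PooledRunner(RunnerBase):
    def __init__(self):
        self.pool = []

    def get_static_buffer(self, value):
        # Override: keep every allocated buffer in a pool for reuse/inspection.
        buf = [value]
        self.pool.append(buf)
        return buf


r = PooledRunner()
bufs = r.build([1, 2])
```

With a free function, customizing the allocation would require threading a callback through every call site; as a method, subclassing suffices.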

@IvanYashchuk IvanYashchuk changed the title attach cudagraph cache to the transform Attach cached cudagraph callable to the transform Aug 20, 2024
Base automatically changed from tom/cudagraphs_transform to main August 20, 2024 11:42
@lantiga (Collaborator) left a comment

Looks great

    static_inputs_mask = (True,) * self.num_static_inputs + (False,) * (len(args) - self.num_static_inputs)
else:
    static_inputs_mask = tuple(isinstance(arg, torch.nn.Parameter) for arg in args)
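The mask logic quoted above, restated as a torch-free sketch (the `is_param` predicate is a stand-in for the `isinstance(arg, torch.nn.Parameter)` check):

```python
def compute_static_inputs_mask(args, num_static_inputs=None, is_param=lambda a: False):
    """If num_static_inputs is given, the leading entries are static;
    otherwise, mark exactly the parameter-like arguments as static."""
    if num_static_inputs is not None:
        return (True,) * num_static_inputs + (False,) * (len(args) - num_static_inputs)
    return tuple(is_param(a) for a in args)


# With an explicit count: the first two args (e.g. weight and bias) are static.
mask = compute_static_inputs_mask(["w", "b", "x"], num_static_inputs=2)
```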
def make_python_callable_from_symbols(
Collaborator:

This looks like something we could have in a transform building block library in the future

@t-vi (Collaborator, Author):

Indeed! The question is whether we should build the region before...

@t-vi t-vi marked this pull request as ready for review August 20, 2024 11:52
@t-vi t-vi requested a review from mruberry as a code owner August 20, 2024 11:52
@t-vi (Collaborator, Author) commented Aug 20, 2024

So I'm merging this now; I expect to do another PR or two in the next few days around the developer experience of this, plus any further feedback.
(One other thing I want to do is perhaps pass in the static-argument mask, which we might obtain from analysing the graph(s): parameters and buffers from the module are static, etc.)

@t-vi t-vi merged commit 7c425fe into main Aug 20, 2024
40 checks passed
@t-vi t-vi deleted the tom/cudagraphs-cache branch August 20, 2024 12:23
3 participants