Functionalize in-place ops #584
Conversation
Awesome stuff. The fundamental questions I have about this are
Could you give me some examples of silent errors? I'm not quite following what they would be.
"The transition" of what? The better coverage by supporting in-place ops?
Essentially things that rely on aliases (
So I'm imagining that people will use this as soon as we have it and won't want us to regress on the support for their model.
Force-pushed from 0165e23 to fe589dc.
Force-pushed from 6d9a996 to c735f1f.
I think I'm a little confused. This allows for operations like
? What happens to the
In general, I think we need a design review of how to handle in-place operations before merging PRs related to them.
One thing we are missing here is that, in the functionalization pass, we should identify input sources.
I.e., if an in-place update is applied to a non-intermediate buffer (one not created in the trace), we shouldn't functionalize it yet.
If you try to run a batch_norm with your example, you'll notice that num_batches_tracked's update gets functionalized. That's a silently wrong result. We should instead throw a loud error for that case.
I think @crcrpar is only handling in-place on intermediates that can be functionalized away, which already helps (e.g., in-place activations in ResNet).
The example I was referring to in my comment:

import thunder
import torch

val = 5

def foo(flag):
    return flag

mod = torch.nn.modules.BatchNorm2d(4)
# mod.track_running_stats = None
mod.cuda()
jfoo = thunder.jit(mod)
a = torch.randn(2, 4, 5, 5, device="cuda")
print(mod.running_mean)
print(mod.num_batches_tracked)
print(jfoo(a))  # tensor([1.])
print(mod.running_mean)
print(mod.num_batches_tracked)
orig_trace = thunder.last_traces(jfoo)[0]
traces = thunder.last_traces(jfoo)
print(f"===\n{traces=}")

I'm not asking you to necessarily support it in this PR, but we should error out instead of producing a silently wrong result.
8889120 accessed it
Force-pushed from c17b4d8 to f6dc67b.
Force-pushed from aeadbb5 to d9a1275.
If it's in the user script, it's not traced today and it's not traced in this PR. The only way to introduce
According to #584 (comment)
This PR does two things at once to demonstrate the usability of in-place tracing and generating functional code:
- In-place operations like abs_, add_, etc. are added to thunder.torch so that Thunder's Python Interpreter can recognize the corresponding PyTorch operations and put them into the initial Thunder trace.
- In-place operations on intermediates are transformed into their out-of-place variants. This is an operator-level transform; there's no interaction between ops. In-place on views is handled in a separate pull request (Partially support in-place ops and tensor aliases #597).
We need part 1. An alternative to part 2 could be freezing the order of in-place operations and letting the PyTorch executor execute them. There could be other good alternatives, and part 2 can easily be disabled if needed in the future.
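As a minimal sketch of what the operator-level transform (part 2) does, here is an illustration in plain PyTorch rather than actual thunder traces; `f_inplace` and `f_functional` are hypothetical names for the shape of the code before and after functionalization:

```python
import torch

def f_inplace(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    t = a + b  # intermediate created inside the traced function
    t.exp_()   # in-place op on the intermediate
    return t * 2

def f_functional(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    t = a + b
    t0 = torch.exp(t)  # out-of-place variant; the trailing copy_ back into t is dropped
    return t0 * 2      # later uses of t are redirected to t0

a, b = torch.randn(3), torch.randn(3)
torch.testing.assert_close(f_inplace(a.clone(), b), f_functional(a, b))
```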
All of the changes to this file look good. There's no other obvious way we can support reading in-place PyTorch operations from user code with Thunder's Python Interpreter.
Independent of how the in-place operations are treated later, we need to get them into the initial trace first.
Noticed that we might want to add some distributed calls such as torch.distributed.all_reduce, torch.distributed.all_gather_into_tensor, and torch.distributed.reduce_scatter_tensor later.
thunder/core/jit_ext.py (outdated)
num_orig_bsyms = len(trace.bound_symbols)
# note(crcrpar): The path looks neat but it does not work for a trace
Where is this called? Is it still needed now that functionalization is moved into thunder.jit to be applied after the interpreter?
no, great catch. reverted the changes in this file
"""Functionalize in-place ops in ``computation_trace``. | ||
|
||
In thunder, an in-place is an out-of-place or functional op followed by :func:`~thunder.core.prims.copy_`. | ||
This function replaces such in-place ops with out-of-place ops. |
"... only if the in-place argument is intermediate to the trace", right?
Oh, I see later: "functionalization is not applied, if any of an in-place op's arguments are computation_trace.args or computation_trace.kwargs."
Should we error / warn in that case?
It seems that BatchNorm's num_batches_tracked tensor update is expressed as ltorch.add_(num_batches_tracked, 1), and the tensor is an arg, so this makes sense to me. Also, if one or more of the args & kwargs are updated in an in-place manner, then I guess there's some intention behind it, so I'm not inclined to ban such cases.
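To spell out why in-place updates on trace args/kwargs are left alone rather than functionalized, here is a minimal sketch in plain PyTorch (the function names are illustrative, not thunder code): rewriting the in-place update on a caller-owned buffer into an out-of-place op would silently drop the side effect the caller relies on.

```python
import torch

def step_inplace(num_batches_tracked: torch.Tensor) -> None:
    num_batches_tracked.add_(1)      # side effect visible to the caller

def step_functionalized(num_batches_tracked: torch.Tensor) -> torch.Tensor:
    return num_batches_tracked + 1   # caller's buffer is left untouched

buf = torch.zeros((), dtype=torch.long)
step_inplace(buf)
assert buf.item() == 1   # the in-place version updates the buffer
step_functionalized(buf)
assert buf.item() == 1   # the functionalized version would silently skip the update
```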
Force-pushed from d9a1275 to 4238e1c.
thunder/core/transform_common.py (outdated)
new_bsyms.append(new_bsym)
continue
functional_sym_name = new_bsym.sym.id.split(".")[-1][:-1]
check(
nitpick: should this be a check, or should we rather just skip?
If is_functionalizable checks for an existing mapping, we shouldn't need this check here.
thunder/core/transform_common.py (outdated)
swap_map[variableify(copy_return)] = copy_from
new_bsym.subsymbols = new_bsym.subsymbols[:-1]
new_bsym = new_bsym.from_bsym_swap_proxies(swap_map)
functional_sym: Symbol = getattr(thunder.torch, functional_sym_name)
The inconsistency here looks bad.
It's fine to choose to only functionalize torch.xxx_ to torch.xxx. But the is_functionalizable thing should have the same logic.
def is_functionalizable(bsym: BoundSymbol) -> bool:
    """Has `OpTags.IN_PLACE` and its args are NOT ``computation_trace.args`` nor ``computation_trace.kwargs``."""
    return (
I don't think we care about having the IN_PLACE tag or not here, since the replacement logic below doesn't take it into consideration.
I feel the logic here should just check for torch.xxx_ and see if there is a torch.xxx.
If we want to move forward with the in_place tag here, maybe we should maintain a map from in_place to out_of_place functions, instead of relying on the trailing underscore.
now here's _inplace_to_out_of_place
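For illustration only, such a map might look like the following; this is a hypothetical sketch keyed on plain PyTorch callables (names and contents assumed, not thunder's actual `_inplace_to_out_of_place` table), so the lookup doesn't rely on a trailing underscore:

```python
import torch

# Hypothetical mapping from in-place ops to their out-of-place counterparts.
_inplace_to_out_of_place_sketch = {
    torch.Tensor.abs_: torch.abs,
    torch.Tensor.add_: torch.add,
    torch.Tensor.exp_: torch.exp,
}

def out_of_place_for(inplace_op):
    """Return the functional counterpart, or None if we don't know one."""
    return _inplace_to_out_of_place_sketch.get(inplace_op)

assert out_of_place_for(torch.Tensor.exp_) is torch.exp
assert out_of_place_for(torch.Tensor.copy_) is None
```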
bsym.sym.tags
and prims.OpTags.IN_PLACE in bsym.sym.tags
and bsym.subsymbols
and bsym.subsymbols[-1].sym.id == prims.PrimIDs.COPY_
We certainly should drop the subsymbols check. This is irrelevant to how this PR is handling functionalization.
This is irrelevant to how this PR is handling functionalization.
Why is it irrelevant? Currently, in-place bsyms have the out-of-place op and copy as their subsymbols, so I think it's fair to check that the last sub bound symbol is a copy.
How can we tell the appropriate output tensor proxy if a bsym doesn't have a copy_ as its last sub-bsym, while avoiding a lot of new tensor proxy names in a trace?
Oops, sorry, I read your implementation wrong earlier... I thought we were doing a blind torch.xxx_ to torch.xxx replacement, but that's not the case. You are actually only looking at the last subsymbol and replacing that one entry.
That feels a bit restricted... But a first step is still better than nothing, and I'll stop nitpicking on that.
That feels a bit restricted... But a first step is still better than nothing, and I'll stop nitpicking on that.
How would it be a bit restricted compared to a blind torch.foo_ to torch.foo replacement?
No, that's not what I meant.
By "That feels a bit restricted", I'm referring to the alternative that we functionalize directly at the subsymbol prims.copy_ level. But again, we don't have to do that in this PR.
I now think lightning-thunder/thunder/torch/__init__.py lines 1458 to 1462 in 8309fc0
The
Force-pushed from f12bf41 to d6fe101.
My earlier concern has been addressed. Stamping.
utils.check(
    dtypes.is_float_dtype(a.dtype),
    lambda: f"hardswish only supports floating point dtypes, got {a.dtype}",
    exception_type=ValueError,
)
return a * relu6(a + 3) / 6
out = a * relu6(a + 3) / 6
if inplace:
Note to myself: this needs to be made a static constraint.
Linking PR #613.
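For context, the idiom the diff above follows can be sketched in plain PyTorch as below; the body of the `if inplace:` branch is an assumption here (in thunder it would be expressed via `prims.copy_(out, a)`), shown only to illustrate the pattern for ops that take an `inplace` flag.

```python
import torch

def hardswish_like(a: torch.Tensor, inplace: bool = False) -> torch.Tensor:
    out = a * torch.nn.functional.relu6(a + 3) / 6
    if inplace:
        # assumed branch body: write the result back into `a`
        # (thunder would express this as prims.copy_(out, a))
        return a.copy_(out)
    return out

x = torch.randn(4)
y = hardswish_like(x.clone(), inplace=True)
torch.testing.assert_close(y, torch.nn.functional.hardswish(x))
```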
Force-pushed from 300256f to bd36039.
def sample_generator_wrapper(sample_generator, is_silu: bool = False):

    def f(*args, **kwargs):
        for sample in sample_generator(*args, **kwargs):
            if not is_silu:
                yield SampleInput(*(list(sample.args) + [True]), **sample.kwargs)
            else:
                # silu treats `inplace` as a kwarg
                # ref: https://github.com/Lightning-AI/lightning-thunder/commit/335d84c89
                new_kwargs = {"inplace": True}
                if sample.kwargs:
                    new_kwargs.update(sample.kwargs)
                yield SampleInput(*sample.args, **new_kwargs)

    return f
rel: #615
Signed-off-by: Masaki Kozuki <mkozuki@nvidia.com>
Force-pushed from 4fb5b9a to d4402b0.
def is_functionalizable(bsym: BoundSymbol) -> bool:
    """Has `OpTags.IN_PLACE` and its args are NOT ``computation_trace.args`` nor ``computation_trace.kwargs``."""
Nit: is IN_PLACE actually used here? EDIT: yes, implicitly, through being added to the map.
Is it also true that the trace args/kwargs are being checked implicitly somewhere outside?
intermediate_trace = from_trace(computation_trace)
intermediate_trace.bound_symbols = bsyms[:]
intermediate_trace.set_provenance(TraceProvenance("Intermediate trace of `functionalize_inplace_ops`"))
del bsyms
Nit: can't we just do intermediate_trace.bound_symbols = bsyms?
yes. I didn't want to reuse bsyms
copy_bsym = bsym.subsymbols[-1]
copy_return = copy_bsym.flat_proxy_outs[0]
copy_from = copy_bsym.flat_proxy_args[0]
copy_to = copy_bsym.flat_proxy_args[1]
if copy_to in trace_args_set:
    new_bsyms.append(new_bsym)
else:
    swap_map[variableify(copy_return)] = copy_from
    new_bsym.subsymbols = new_bsym.subsymbols[:-1]
    new_bsym = new_bsym.from_bsym_swap_proxies(swap_map)
This logic looks similar to what step 1 is doing. Couldn't they be merged? It seems like the whole thing could be done in a single pass?
if optional_inplace_arg_index > -1:
    # presumably forces the functional path by overriding the op's `inplace` flag to False
    flat_args[optional_inplace_arg_index] = False
This probably needs a comment.
    _call_ctx=new_bsym._call_ctx,
)
new_bsyms.append(new_functional_bsym)
bsym_inplace_to_functional[new_bsym] = new_functional_bsym
bsym_inplace_to_functional is never read from?
You're right, but I once tried to register this as an attribute of provenance at L473.
@@ -503,6 +509,9 @@ def get_computation_and_inputs(*args, **kwargs):
prologue_traces = [prologue_trc]
computation_traces = [computation_trc]
if not compile_options.get("skip_inplace_functionalization", False):
Longer term, I wonder if we should have a set of default transformations and this be one of them, but for now it is OK.
@@ -72,7 +72,7 @@ def resolve_method(id: Any, *args, **kwargs) -> None | Callable:
# ctx.get_method throws an AttributeError when the context does not have the requested attribute, except
# for the prims language context, which always throws a ValueError
method: Callable = ctx.get_method(id, *args, **kwargs)
except (AttributeError, ValueError) as e:
except (AttributeError, ValueError):
Great catch!
"""Functionalize in-place ops in ``computation_trace``. | ||
|
||
In thunder, an in-place is an out-of-place or functional op followed by :func:`~thunder.core.prims.copy_`. | ||
This function replaces such in-place ops with out-of-place ops. |
Should we error / warn in that case?
bsyms.append(new_bsym)

intermediate_trace = from_trace(computation_trace)
intermediate_trace.bound_symbols = bsyms[:]
Nit: do we need to copy if we del below?
@@ -72,7 +72,7 @@ def resolve_method(id: Any, *args, **kwargs) -> None | Callable:
# ctx.get_method throws an AttributeError when the context does not have the requested attribute, except
# for the prims language context, which always throws a ValueError
method: Callable = ctx.get_method(id, *args, **kwargs)
except (AttributeError, ValueError) as e:
except (AttributeError, ValueError):
Great catch
@@ -503,6 +509,9 @@ def get_computation_and_inputs(*args, **kwargs):
prologue_traces = [prologue_trc]
computation_traces = [computation_trc]
if not compile_options.get("skip_inplace_functionalization", False):
    computation_traces.extend(functionalize_inplace_ops(computation_trace=computation_trc))
computation_trc = computation_traces[-1]
Longer term, I wonder if this should be a "default transform", but maybe it is important that this goes first and so it is tricky with the timing.
In thunder, an in-place op is an out-of-place or functional op followed by :func:`~thunder.core.prims.copy_`.
This function replaces such in-place ops with out-of-place ops.
Note that functionalization is not applied if any of an in-place op's arguments are
``computation_trace.args`` or ``computation_trace.kwargs``.
I wonder what we should do in these cases, though, warn, error?
bsyms.append(new_bsym)

intermediate_trace = from_trace(computation_trace)
intermediate_trace.bound_symbols = bsyms[:]
Nit: I don't think we strictly need the copy here.
@@ -0,0 +1,124 @@
from __future__ import annotations
Great to have tests for it!
Supergood! We'll need to make things safer, but this is an awesome start.
Thank you @IvanYashchuk @jjsjann123 @nikitaved @mruberry for your reviews and comments.
@torchsymbol(torch.Tensor.abs_, is_method=True, tags=(prims.OpTags.IN_PLACE,))
def abs_(a: NumberLike | TensorLike, /) -> Number | TensorLike:
    return prims.copy_(abs(a), a)
nit: maybe adding a decorator for an out-of-place op to register an in-place counterpart could be cleaner if realizable? There we could also populate the map if needed.
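If that route were taken, such a decorator might look roughly like the sketch below; this is hypothetical (the decorator name, the registries, and the use of Tensor.copy_ in place of prims.copy_ are all assumptions, not thunder's API), just to show how registering the in-place counterpart and populating the map could happen in one place.

```python
from typing import Callable

import torch

# Hypothetical registries; not part of thunder.
_inplace_registry: dict[str, Callable] = {}
_inplace_to_out_of_place: dict[str, Callable] = {}

def with_inplace(inplace_name: str):
    """Register an in-place counterpart for the decorated out-of-place op."""
    def decorator(out_of_place: Callable) -> Callable:
        def inplace_variant(a: torch.Tensor, *args, **kwargs) -> torch.Tensor:
            # conceptually prims.copy_(out_of_place(a, ...), a); emulated with Tensor.copy_
            return a.copy_(out_of_place(a, *args, **kwargs))
        _inplace_registry[inplace_name] = inplace_variant
        _inplace_to_out_of_place[inplace_name] = out_of_place
        return out_of_place
    return decorator

@with_inplace("abs_")
def my_abs(a: torch.Tensor) -> torch.Tensor:
    return torch.abs(a)

t = torch.tensor([-1.0, 2.0])
_inplace_registry["abs_"](t)
assert torch.equal(t, torch.tensor([1.0, 2.0]))
```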
for op in opinfos:
    if not (op.op in _functional_to_inplace or op.op in _functional_to_functional_with_inplace_arg):
        continue
Nit: maybe in some future we would like to have a flag on out-of-place OpInfos to test their in-place variants?
sample_input_generator=(
    op.sample_input_generator if op.name != "masked_fill" else inplace_masked_fill_sample_generator
),
Nit: seems this motivates the comment from above. We could set test_in_place=False for masked_fill and create a separate OpInfo for the in-place variant.
@ops(_inplace_opinfos, supported_dtypes=(dtypes.float32,))
def test_functionalization(op: OpInfo, device: str, dtype: dtypes.dtype, executor, _):
    import thunder
Seems like we might be missing some test with a sequence of in-place ops? Just to be sure that the bsym replacement logic is sound across multiple symbols, not just a single one. Or is it not for this PR?
Seems like we might be missing some test with a sequence of in-place ops?
Yes, this PR doesn't have such tests.
Just to be sure that the bsym replacement logic is sound across multiple symbols, not just a single one. Or is it not for this PR?
Locally I've been using the following snippet, so I hope the functionalization works. I just didn't have a clear picture of how to design tests with a sequence of in-place ops.
import torch
import thunder

def f(a: torch.Tensor, b: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
    a += b
    c = torch.exp(a)
    d = torch.tanh(b)
    c += d
    d.div_(a)
    e = c + d
    f = torch.nn.functional.relu(e, inplace=True)
    g = a + b
    return f, c, d, torch.relu_(g)

def main():
    a, b = [torch.randn((2, 2), device="cuda", requires_grad=False) for _ in range(2)]
    a_, b_ = a.clone().detach(), b.clone().detach()
    jit_f = thunder.jit(f, executors=[thunder.executors.get_torch_executor()])
    c, d, e, g = jit_f(a, b)
    c_, d_, e_, g_ = f(a_, b_)
    traces = thunder.last_traces(jit_f)
    print(traces[-1])
    torch.testing.assert_close(d, d_)
    torch.testing.assert_close(c, c_)
    torch.testing.assert_close(e, e_)
    torch.testing.assert_close(g, g_)

if __name__ == "__main__":
    main()
We could add some handpicked tests for sure. We can always throw some line duplications for numerically more stable ops :)
Glorious.
What does this PR do?
Express in-place ops using their out-of-place counterparts followed by prims.copy_, e.g. a.add_(b) as prims.copy_(prims.add(a, b), a), then let the added transform functionalize the trace by removing the trailing copy and updating the signature.
Let's say we have t.exp_() in a script and t is used afterwards. Thunder translates it into t_out = ltorch.exp_(t). This bound symbol has two sub bound symbols: t0 = ltorch.exp(t) and t_out = prims.copy_(t0, t). The functionalization removes the copy, replaces ltorch.exp_(t) with t0 = ltorch.exp(t), and replaces uses of t after exp_ with t0.
The covered ops are ones that either (a) have an in-place variant, such as torch.exp and torch.add, or (b) have inplace as one of their args, such as torch.nn.functional.relu.
So this would not cover the entire #145 broadly, nor take aliases into consideration.
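As a hedged end-to-end illustration of the behavior described above, the following sketch uses only APIs that appear in this thread (thunder.jit and thunder.last_traces); the exact contents of the printed trace may differ from what is claimed in the comment.

```python
import torch
import thunder

def g(t: torch.Tensor) -> torch.Tensor:
    t0 = t * 2   # t0 is an intermediate, so its in-place update can be functionalized
    t0.exp_()
    return t0 + 1

jg = thunder.jit(g)
x = torch.randn(4)
torch.testing.assert_close(jg(x), (x * 2).exp() + 1)
# The final trace should contain the out-of-place exp rather than exp_ followed by copy_.
print(thunder.last_traces(jg)[-1])
```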