Some Additional Information

Torch provides many useful JIT passes which are helpful in rewriting in-place ops, but I have not found one that deals with the following case:

```python
def forward(self, y):
    x = torch.zeros((3, 2)).float().cuda()
    idx = torch.tensor([0, 2]).int().cuda()
    x[idx.long(), :] = y[idx.long(), :]
    return x
```

The issue with the above formulation is that TorchScript produces the following IR for the forward pass:

```
%x.1 : Tensor = aten::cuda(%9)
%11 : Tensor = aten::slice(%y.1, %7, %4, %4, %7)
%13 : Tensor = aten::slice(%x.1, %7, %4, %4, %7)
%26 : Tensor = aten::index(%11, %2)
%14 : Tensor = aten::index_put_(%13, %2, %26, %6)
return (%x.1)
```

In the above scenario, returning …
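For reference, IR of this kind can be reproduced by scripting the function and printing its graph (a usage sketch; the standalone-function wrapper is an assumption):

```python
import torch

@torch.jit.script
def forward(y: torch.Tensor) -> torch.Tensor:
    x = torch.zeros((3, 2)).float().cuda()
    idx = torch.tensor([0, 2]).int().cuda()
    x[idx.long(), :] = y[idx.long(), :]
    return x

# Prints the aten::slice / aten::index / aten::index_put_ sequence shown above
print(forward.graph)
```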
Context
When using Torch-TensorRT to build models where the forward function populates a tensor on-the-fly via inplace operations, the results can sometimes differ from their TorchScript counterparts. As demonstrated in Issue #1453, creating a tensor in the forward function, populating its entries, and then returning it can have unexpected results. Below is a minimal sample of such a function.
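A minimal sketch of the pattern in question, assuming direct row assignment (which TorchScript lowers to aten::select plus aten::copy_); the exact sample from Issue #1453 may differ:

```python
import torch

class Populate(torch.nn.Module):
    def forward(self, y: torch.Tensor) -> torch.Tensor:
        # Tensor created inside forward, then populated in-place row by row
        x = torch.zeros((3, 2), device=y.device)
        x[0] = y[0]  # each assignment lowers to aten::select + aten::copy_
        x[1] = y[1]
        return x
```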
The above sample generates IR code in which the problematic tensor is %x.1.
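A schematic of that IR, reconstructed from the description in this post (value names and operator arguments are illustrative, not verbatim):

```
%x.1 : Tensor = aten::cuda(%5)             # the zeros tensor created in forward
%9  : Tensor = aten::select(%y.1, %0, %1)  # read y[0]
%10 : Tensor = aten::select(%x.1, %0, %1)  # view into row 0 of x
%11 : Tensor = aten::copy_(%10, %9, %2)    # in-place write through the view
return (%x.1)
```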
The tensor %x.1 is initialized as a compile-time constant tensor in Torch-TRT, and thus all "in-place" operations applied to it do not change its value. Specifically, the outputs of the aten::select and aten::copy_ operations are immediately discarded, since the original tensor is unchanged in TensorRT execution, whereas in TorchScript execution these operations do modify the tensor.

Temporary Solutions
1. One way to circumvent this issue is to pass the tensor x, which is currently initialized in the forward code, as an argument to the forward function. If the tensor is initialized outside of the function and passed in at runtime, the inplace operations work as expected in Torch-TRT, and the results agree (see the first sketch after this list).

2. Another fix is to avoid the inplace operations altogether by making copies of the tensor being assigned, via torch.Tensor.index_select and the accompanying aten::index_select operators. By making a copy of the constant tensor with the desired values inserted, one can avoid the issue that these inplace operations present on constant tensors (see the second sketch after this list).
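A sketch of the first workaround, assuming the (3, 2) example above; the buffer becomes a runtime input rather than a compile-time constant:

```python
import torch

class PopulatePreallocated(torch.nn.Module):
    def forward(self, y: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
        idx = torch.tensor([0, 2], device=y.device)
        x[idx, :] = y[idx, :]  # x is passed in at runtime, so the write persists
        return x

# The caller allocates the buffer on each call, e.g.:
# out = model(y, torch.zeros((3, 2), device=y.device))
```

And a sketch of the second workaround, building the result out-of-place with index_select; the row indices are specific to this example and assume y has shape (3, 2):

```python
import torch

class PopulateOutOfPlace(torch.nn.Module):
    def forward(self, y: torch.Tensor) -> torch.Tensor:
        zeros = torch.zeros((3, 2), device=y.device)
        combined = torch.cat([y, zeros], dim=0)  # rows 0-2 are y, rows 3-5 are zero
        # Select y[0], a zero row (index 4), then y[2]: this produces a fresh
        # tensor, so no in-place writes ever touch a constant.
        row_idx = torch.tensor([0, 4, 2], device=y.device)
        return combined.index_select(0, row_idx)
```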
Discussion

In the above snippet, one solution seems to be simply returning the last-modified tensor instead of the inplace version (returning %11 instead of %x.1). While this can be done via a lowering pass, it will not work in the general case, as values can sometimes be assigned to a slice, in which case the resulting tensor output is only a fraction of the initial tensor intended to be modified. For example, an operation such as x[:, 0] = y[:, 0] will lead to an aten::slice followed by an aten::index_put_, for which the output of the latter might not have the same shape as x.

Another option could be to investigate more general graph modifications to deal with inplace operations. For example, if an inplace operation is detected on a constant tensor during the lowering phase, the graph could be modified to make a modifiable copy of the constant tensor, or potentially to eliminate the inplace calls in favor of copy-inducing calls. The inplace-call elimination is made more challenging by the use of aten::slice to write to a view of the initial tensor.
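As a tensor-level illustration of such a copy-inducing rewrite (a sketch of the idea only, not the lowering pass itself): aten::index_put_ has an out-of-place counterpart, aten::index_put, which returns a fresh tensor and leaves its input untouched:

```python
import torch

y = torch.arange(6.0).reshape(3, 2)
x_const = torch.zeros(3, 2)  # stands in for the frozen constant tensor
idx = torch.tensor([0, 2])

# Inplace form: mutates its input, which has no effect on a frozen constant
x_inplace = x_const.clone()
x_inplace.index_put_((idx,), y[idx])

# Copy-inducing form: returns a new tensor; x_const itself is never modified
x_functional = x_const.index_put((idx,), y[idx])

assert torch.equal(x_inplace, x_functional)
```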
Implementation Phases

Prototype - L
Needs further discussion on the feasibility of the proposed solutions and the cost/benefit of supporting inplace operations on constant Tensors.
Feedback and Suggestions
Feedback on the feasibility of the above resolution options and suggestions for alternative ideas to resolve this issue are welcome!