prims.div: equalize NVFuser prims with PyTorch for integer Tensors #821

Conversation
Force-pushed from c68844e to 8e10ba4
Force-pushed from 8e10ba4 to a4b5bf1
This is part one: the fix for integer types. We also have issues with floating-point types that need to be fixed to re-enable all the consistency tests.
Thank you @nikitaved, awesome stuff!
    return thunder.prims.div(a, b)

def truediv(a, b):
    return a // b
This is floor division, not true division
Good catch! Made a typo - will fix that. Thank you!
C division has always been the true division for me :)
for f in (thunder.jit(div), thunder.jit(truediv)):
    rout = f(x.cpu(), y.cpu()).to(device)
    jout = f(x, y)
    assert rout.equal(jout)
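For context, a self-contained version of this check might look like the sketch below; the tensor values, dtype, and device are assumptions for illustration, not the PR's actual test fixtures.

```python
import torch
import thunder

device = "cuda"
# Hypothetical inputs, chosen to include sign combinations where floor and
# truncation division would disagree.
x = torch.tensor([-9, 9, -10, 10], dtype=torch.int64, device=device)
y = torch.tensor([5, -5, 5, -5], dtype=torch.int64, device=device)

def div(a, b):
    return thunder.prims.div(a, b)

def truediv(a, b):  # note: despite the name, // is floor division (see the discussion)
    return a // b

for f in (thunder.jit(div), thunder.jit(truediv)):
    rout = f(x.cpu(), y.cpu()).to(device)  # reference result, presumably via the PyTorch executor on CPU
    jout = f(x, y)                         # result via the nvFuser executor on CUDA
    assert rout.equal(jout)
```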
I don't think these results should be equal (I think we're exposing a bug in the nvFuser implementation of prims.div). prims.div should implement truncation division, which is not the same as Python's floor division. See, for example, the PyTorch executor's implementation of prims.div:
def _div_prim_impl(a: Number | torch.Tensor, b: Number | torch.Tensor) -> torch.Tensor:
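For illustration, here is a minimal, self-contained sketch of that behavior. The real implementation checks exactness via thunder's dtypes.is_exact_dtype/to_dtype helpers (quoted, with a suspected typo, later in this thread); the function name and dtype check below are stand-ins.

```python
import torch

def div_prim_sketch(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    # Exact (integer/bool) inputs: C-style truncation division, rounding towards zero.
    if not (a.dtype.is_floating_point or a.dtype.is_complex
            or b.dtype.is_floating_point or b.dtype.is_complex):
        return torch.div(a, b, rounding_mode="trunc")
    # Floating-point/complex inputs: true division.
    return a / b
```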
div calls to prims.div, not ltorch.div, so they should be equivalent.
Yes, but doesn't truediv call floor division?
Floor division has a decomp that is more than just prims.div, and this is what is being tested here.
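As an illustration of such a decomposition (a standard way to express floor division in terms of truncation division; thunder's actual decomp may differ):

```python
import torch

def floor_div_via_trunc(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    q = torch.div(a, b, rounding_mode="trunc")  # truncation division, rounds towards zero
    rem = a - q * b
    # Adjust down by one where the exact quotient is negative and non-integral,
    # which is exactly where floor and truncation division disagree.
    needs_fixup = (rem != 0) & ((a < 0) ^ (b < 0))
    return q - needs_fixup.to(q.dtype)
```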
No, but it's a common misconception that they are. If the result is negative then they round differently. Floor division rounds down, truncation division rounds towards zero.
In Python,

    -9 // 5

is -2. In C++,

    #include <iostream>
    int main() {
        int a = (-9 / 5);
        std::cout << a;
    }

prints -1. This is because -9 / 5 is -1.8, and floor division takes the floor of -1.8, which is -2, while truncation division truncates the fractional part of -1.8, giving -1.
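The same difference is visible directly in PyTorch, which exposes both rounding modes on torch.div:

```python
import torch

a, b = torch.tensor(-9), torch.tensor(5)
print(torch.div(a, b, rounding_mode="floor"))  # tensor(-2): Python-style floor division
print(torch.div(a, b, rounding_mode="trunc"))  # tensor(-1): C-style truncation division
```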
I am confused -- I am not comparing these two operations head to head; I just make sure that the CPU version is consistent with the nvFuser version for both the jitted prims.div and //.
You're correct, of course, and I just misread the code. You are comparing that the CPU and CUDA versions are consistent with each other. The fact that truncation division and floor division are distinct for integers is a separate issue, and doesn't impact this code. My mistake.
Aha! @IvanYashchuk told me that nvFuser's div is performing truncation division for integers now. I mistakenly thought it wasn't. That's why this test passes.
While we are looking at torchex, I think the div prim impl has a typo (twice a instead of a and b) in the if condition:

    if dtypes.is_exact_dtype(to_dtype(a.dtype)) and dtypes.is_exact_dtype(to_dtype(a.dtype)):
> While we are looking at torchex, I think the div prim impl has a typo (twice a instead of a and b) in the if condition:
> if dtypes.is_exact_dtype(to_dtype(a.dtype)) and dtypes.is_exact_dtype(to_dtype(a.dtype)):
I think you're right!
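If so, the fix would presumably be to check b's dtype in the second clause, along these lines (a suggested change, not the merged code):

```python
if dtypes.is_exact_dtype(to_dtype(a.dtype)) and dtypes.is_exact_dtype(to_dtype(b.dtype)):
```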
Title changed: "prims.div: equalize NVFuser prims with PyTorch" → "prims.div: equalize NVFuser prims with PyTorch for integer Tensors"
The NVFuser implementation of prims.div used reciprocal. That caused upcasting for integer inputs and, ultimately, rendered meta checks incorrect.

Fixes #808
Fixes #818
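For context on why going through reciprocal breaks integer inputs, here is a small PyTorch illustration of the same effect (demonstration only; nvFuser's internals are not shown):

```python
import torch

a = torch.tensor([-9, 10], dtype=torch.int64)
b = torch.tensor([5, 5], dtype=torch.int64)

via_reciprocal = a * torch.reciprocal(b)            # integer inputs get upcast to floating point
via_trunc = torch.div(a, b, rounding_mode="trunc")  # stays integral, rounds towards zero

print(via_reciprocal.dtype, via_trunc.dtype)  # torch.float32 torch.int64
```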