
cat: support inputs with mixed dtypes #819

Merged: 4 commits into Lightning-AI:main from cat-upcast, Jul 23, 2024

Conversation

kshitij12345 (Collaborator)

Fixes #812

TODO - Look into test_vjp_correctness failure
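
For context (not from this PR's diff): PyTorch eager already type-promotes mixed-dtype inputs to torch.cat, so the fix presumably computes the common dtype and upcasts the inputs before concatenating. A minimal sketch of the promotion for the bfloat16/float16 case from the linked issue:

import torch

a = torch.ones(2, dtype=torch.bfloat16)
b = torch.ones(2, dtype=torch.float16)

# Neither bfloat16 nor float16 can represent the other, so both promote to float32.
print(torch.promote_types(a.dtype, b.dtype))  # torch.float32
print(torch.cat([a, b], dim=0).dtype)         # torch.float32 in eager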

kshitij12345 changed the title from "cat: support inputs with mixed dtypes" to "[WIP] cat: support inputs with mixed dtypes" on Jul 22, 2024
kshitij12345 (Collaborator, Author)

test_vjp_correctness failure: check_vjp verifies the output against a numerically computed Jacobian. However, with mixed-dtype inputs (float and double), the numerical-differentiation output is slightly less accurate for the float part of the inputs, leading to a mismatch. I think this is expected, and we should probably just increase the tolerance for the test (or, if there is a way, increase the tolerance only for this sample; a sketch of a looser comparator follows below).

cc: @IvanYashchuk (original author of test)
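
Not from the PR, just a hedged sketch of what loosening the comparison could look like, assuming it is done at the comparator level (the test suite's actual tolerance mechanism may differ):

from functools import partial
import torch

# float32 finite differences carry error on the order of 1e-4,
# so the default 1e-7 tolerances used for this comparison are too tight.
loose_comp = partial(torch.testing.assert_close, atol=1e-4, rtol=1e-4)

# The mismatch observed in the repro below passes under the looser tolerances.
loose_comp(torch.tensor(1.999905726350911, dtype=torch.float64), torch.tensor(2.0, dtype=torch.float64))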

Smaller Repro

from functools import partial
import torch
import thunder
from thunder.tests.make_tensor import make_tensor, make_tensor_like
from thunder.core.pytree import tree_map, tree_flatten
from thunder.tests.test_grad import numerical_jvp, _dot, vjp as thunder_vjp, Sequence, flatten_func, _make_differentiable_wrapper


# Copied from `test_grad` with minor changes to support torch eager
# and to create `u` and `v` as ones tensors (for easier reasoning).
def check_vjp(f, *primals, comp, eager=False):
    """Check that the vector-Jacobian product of a function is correct.

    Args:
        f (callable): The function to differentiate.
        *primals (torch.Tensor): The input tensors.
        executor (str): The executor to use. Defaults to "torch".
        atol (float): Absolute tolerance. Defaults to None.
        rtol (float): Relative tolerance. Defaults to None.

    Raises:
        AssertionError: If the vector-Jacobian product is not correct.
    """
    # Let f be a function from vectors of size n to vectors of size m.
    # Its Jacobian is a matrix J of size m x n.
    # J^*, the conjugate transpose (adjoint) of J, is a matrix of size n x m.
    # For any vector v of size m, J^* v is a vector of size n.
    # For any vector u of size n, J u is a vector of size m.
    # The adjoint property says the dot product of J u with v equals the dot product of u with J^* v.
    # This function checks that equality:
    # 〈J u, v〉 == 〈u, J^* v〉
    # Since u and v can be arbitrary, the original test takes u = rand_like(primals) and v = rand_like(f(primals));
    # here they are ones tensors for easier reasoning.
    # We compute J u using numerical_jvp, and J* v using Thunder's vjp. That way we check correctness of Thunder's vjp.
    # Using finite differences we can compute J u, but we can't compute J* v, without computing full J, which is expensive.

    make = partial(torch.ones_like)  # original: partial(make_tensor_like, low=0, high=1)

    u = tree_map(make, primals)
    if eager:
        outs_p, J_u = numerical_jvp(f)(primals, u)
    else:
        outs_p, J_u = numerical_jvp(thunder.compile(f, disable_torch_autograd_support=True, disable_preprocessing=True))(primals, u)

    multiple_results = isinstance(outs_p, Sequence)

    v = tree_map(make, outs_p)
    if eager:
        _, J_star_v = torch.autograd.functional.vjp(f, primals, v)
    else:
        _, J_star_v = thunder.compile(thunder_vjp(f), disable_torch_autograd_support=True, disable_preprocessing=True)(primals, v)

    if not multiple_results:
        v = (v,)
        J_u = (J_u,)

    J_u_v = _dot(J_u, v)
    u_J_star_v = _dot(u, J_star_v)
    if J_u_v.isnan().any():
        # TODO: find a better way to handle NaNs in finite differences
        return  # skip this sample
    comp(J_u_v, u_J_star_v)


# Check thunder - torch.cat
f = thunder.torch.cat
primals = ((torch.ones(1, requires_grad=True), torch.ones(1, dtype=torch.double, requires_grad=True)),)
kwargs = {"dim": 0}
flat_op, flat_args, spec = flatten_func(f, primals, kwargs)

filtered_op, filtered_args = _make_differentiable_wrapper(flat_op, flat_args)
# AssertionError: Scalars are not close!

# Expected 2.0 but got 1.999905726350911.
# Absolute difference: 9.427364908898284e-05 (up to 1e-07 allowed)
# Relative difference: 4.713682454449142e-05 (up to 1e-07 allowed)
check_vjp(filtered_op, *filtered_args, comp=torch.testing.assert_close)

# Check eager - torch.cat
f = torch.cat
flat_op, flat_args, spec = flatten_func(f, primals, kwargs)
filtered_op, filtered_args = _make_differentiable_wrapper(flat_op, flat_args)

# AssertionError: Scalars are not close!

# Expected 2.0 but got 1.999905726350911.
# Absolute difference: 9.427364908898284e-05 (up to 1e-07 allowed)
# Relative difference: 4.713682454449142e-05 (up to 1e-07 allowed)
check_vjp(filtered_op, *filtered_args, comp=torch.testing.assert_close, eager=True)
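
For reference, a minimal sketch of where the ~1e-4 discrepancy comes from: rounding error of about eps_float32 (~1.2e-7) in the function values is amplified by 1 / (2 * step) in the central-difference quotient, landing around 1e-4 for typical step sizes, while float64 stays far below the 1e-7 tolerance. The identity function is a fair proxy because cat's Jacobian acts as an identity map on each input slice; the step size below is illustrative and may not match what numerical_jvp actually uses.

import torch

def central_diff(f, x, step):
    # Central finite-difference approximation of df/dx at x.
    return (f(x + step) - f(x - step)) / (2 * step)

identity = lambda x: x  # true derivative is exactly 1.0
step = 1e-4             # illustrative step size

d32 = central_diff(identity, torch.tensor(1.0, dtype=torch.float32), step)
d64 = central_diff(identity, torch.tensor(1.0, dtype=torch.float64), step)

print(abs(d32.item() - 1.0))  # on the order of 1e-4: float32 rounding dominates
print(abs(d64.item() - 1.0))  # roughly 1e-12: well within the default tolerance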

kshitij12345 changed the title from "[WIP] cat: support inputs with mixed dtypes" to "cat: support inputs with mixed dtypes" on Jul 23, 2024
kshitij12345 marked this pull request as ready for review on July 23, 2024, 12:03
t-vi (Collaborator) left a comment


Looks great.

t-vi (Collaborator) commented Jul 23, 2024

Thank you @kshitij12345

t-vi merged commit fda3fce into Lightning-AI:main on Jul 23, 2024
39 checks passed
github-actions bot deleted the cat-upcast branch on October 23, 2024, 00:46

Successfully merging this pull request may close these issues: Dtype mismatch in cat: bfloat16 and float16