dsikka
Collaborator

@dsikka dsikka commented Jul 28, 2025

  • Add compression info for nvfp4 to support decompression with CompressedLinear
  • Update to support batch_size > 1 when running QDQ for tensor_group / group dynamic activations. This is done by updating the reshape command to be generic, so that all dimensions are maintained apart from the last / group_dim, which has to be reshaped for QDQ
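
To illustrate the second point, here is a minimal sketch of group-wise dynamic QDQ that keeps every leading dimension and reshapes only the last one. It is not the code from this PR: the function name `fake_quantize_groups`, the symmetric int8-style scheme, and the `1e-8` clamp are assumptions chosen for illustration.

```python
import torch


def fake_quantize_groups(x: torch.Tensor, group_size: int, num_bits: int = 8) -> torch.Tensor:
    """Symmetric per-group quantize-dequantize along the last dimension (illustrative only)."""
    assert x.shape[-1] % group_size == 0, "last dim must be divisible by group_size"
    q_max = 2 ** (num_bits - 1) - 1

    # Keep every leading dimension as-is; split only the last one into groups.
    grouped = x.unflatten(-1, (x.shape[-1] // group_size, group_size))

    # One dynamic scale per group, computed from the current activations.
    scale = grouped.abs().amax(dim=-1, keepdim=True).clamp(min=1e-8) / q_max

    q = torch.clamp(torch.round(grouped / scale), -q_max, q_max)

    # Dequantize and merge the group dimension back into the last dimension.
    return (q * scale).flatten(-2, -1)


x = torch.randn(2, 3, 64)  # (batch, seq_len, hidden), i.e. batch_size > 1
out = fake_quantize_groups(x, group_size=16)
```

Because the reshape only touches the last dimension, the same function accepts 2D inputs such as `(tokens, hidden)` and 3D inputs such as `(batch, seq_len, hidden)` without special cases.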

@dsikka dsikka force-pushed the fix_nvfp4_decomp branch from dc36cfa to 30ad305 Compare July 30, 2025 14:13
@dsikka dsikka marked this pull request as ready for review July 30, 2025 15:00
kylesayrs
kylesayrs previously approved these changes Jul 30, 2025
Contributor

@kylesayrs kylesayrs left a comment

If this doesn't support 4d activations, you should add asserts making sure that the ndims matches expectations

Contributor

@brian-dellabetta brian-dellabetta left a comment

We should also get in another dynamic quant fix

rahul-tuli
rahul-tuli previously approved these changes Jul 31, 2025
Member

@rahul-tuli rahul-tuli left a comment

LGTM!

Contributor

@kylesayrs kylesayrs left a comment

I need to understand this first, sorry

@kylesayrs
Contributor

kylesayrs commented Jul 31, 2025

In the future it might be nice to take a step back and make a decision about when tensors need to be reshaped during the qdq process.

Maybe rather than reshaping in all of compute_dynamic_scales_and_zp, _process_quantization, and dequantize, we can have this logic exist just once in _process_quantization.

The implementation below is what it might look like for both activations and weights. Some of this is wrong (it's late for me) but this function ensures the last dim is the granularity you want to quantize by.

import functools

def reshape_for_groups(func):
    @functools.wraps(func)
    def wrapper(x, args, *rest, **kwargs):
        assert x.ndim >= 2

        # reshape so the last dim is the granularity to quantize by
        if args.strategy == "token":
            pass
        elif args.strategy == "channel":
            x = x.unsqueeze(-1)
        elif args.strategy in ("group", "tensor_group"):
            num_groups = x.size(-1) // args.group_size
            x = x.unflatten(-1, (num_groups, args.group_size))
        elif args.strategy == "block":
            block_height, block_width = args.block_structure
            x = x.unfold(-2, block_height, block_height)  # [..., num_vert, cols, block_height]
            x = x.unfold(-2, block_width, block_width)  # [..., num_vert, num_horiz, block_height, block_width]
            x = x.flatten(-2, -1)  # [..., num_vert, num_horiz, block_height * block_width]

        x = func(x, args, *rest, **kwargs)

        # undo the reshape so the caller gets the original layout back
        if args.strategy == "token":
            pass
        elif args.strategy == "channel":
            x = x.squeeze(-1)
        elif args.strategy in ("group", "tensor_group"):
            x = x.flatten(-2, -1)
        elif args.strategy == "block":
            x = x.unflatten(-1, (block_height, block_width))
            x = x.transpose(-3, -2)  # [..., num_vert, block_height, num_horiz, block_width]
            x = x.flatten(-4, -3).flatten(-2, -1)  # [..., rows, cols]

        return x

    return wrapper

@reshape_for_groups
def _process_quantization(x, args, ...):
   if do_quantize:
        ...
   if do_dequantize:
        ...
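
The block branch is the least obvious part of the sketch above, so here is a small standalone check of the tiling and its inverse. It assumes the last two dimensions divide evenly by the block shape; the helper names `to_blocks` and `from_blocks` are hypothetical and not part of the library.

```python
import torch


def to_blocks(x: torch.Tensor, block_height: int, block_width: int) -> torch.Tensor:
    # [..., rows, cols] -> [..., num_vert, cols, block_height]
    x = x.unfold(-2, block_height, block_height)
    # -> [..., num_vert, num_horiz, block_height, block_width]
    x = x.unfold(-2, block_width, block_width)
    # -> [..., num_vert, num_horiz, block_height * block_width]
    return x.flatten(-2, -1)


def from_blocks(x: torch.Tensor, block_height: int, block_width: int) -> torch.Tensor:
    # [..., num_vert, num_horiz, block_height, block_width]
    x = x.unflatten(-1, (block_height, block_width))
    # [..., num_vert, block_height, num_horiz, block_width]
    x = x.transpose(-3, -2)
    # Merge (num_vert, block_height) into rows and (num_horiz, block_width) into cols.
    return x.flatten(-4, -3).flatten(-2, -1)


weight = torch.arange(24.0).reshape(4, 6)
blocks = to_blocks(weight, 2, 3)      # shape (2, 2, 6): one row of 6 values per 2x3 block
restored = from_blocks(blocks, 2, 3)  # shape (4, 6), same values as `weight`
```

With the last dimension holding one whole block, a per-block scale can be computed with a single reduction over `dim=-1`, the same way as for groups.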

Contributor

@kylesayrs kylesayrs left a comment

Thank you!

@dsikka dsikka merged commit 46d84d8 into main Jul 31, 2025
1 check passed
@dsikka dsikka deleted the fix_nvfp4_decomp branch July 31, 2025 19:53
kylesayrs pushed a commit that referenced this pull request Aug 1, 2025
… Compression Params (#407)

* add compression param; update qdq for batch greater than 1

* make generic

* fix tests

* remove incorrect line change; make generic

* update
dsikka added a commit that referenced this pull request Aug 11, 2025
* add utilities

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* add tests

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* add additional tests

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* add utils and tests

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* Implement transform factories

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* add permutations

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* add delete_offload_module

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* key inverses by weight

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* fix tests

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* standardize random hadamard

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* prepend input hooks

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* apply sqrt division first

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* use divided hadamards

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* fix typo

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* add random option

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* use random seeds, rename matrix multiply

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* add deterministic generation to random matrix

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* fix perm math

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* update docstrings

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* update docstrings

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* cleanup

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* cleanup 2

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* make seed optional

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* remove iterable check and missing return value

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* Remove unrelated changes

* simplify code

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* implement apply, use in tests

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* use hadamards database file

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* try manifest

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* try setup, update hadamards list

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* fix setup

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* add docstrings, cleanup

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* fix setup, thank you @dbarbuzzi

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* remove numpy, add tests

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* solidify dtype, add gpu tests

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* fix docstring

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* add device option

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* construct on execution device, cache on offload device

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* save construction device changes for later

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* construct on execution device, cache on offload device

* cite nja sloane

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* remove dreg

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* put on device via safe_open

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* nits and docstrings

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* update docstring

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* Merge

* merge with construct: construct in float32

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* construct with same dtype, constructing on fp32 found no difference

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* remove unnecessary imports

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* bugfixes (#375)

Signed-off-by: Brian Dellabetta <bdellabe@redhat.com>

* use factory_kwargs

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* add frozen dict to deps

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* fix style

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* merge

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* use delete_offload_module

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* add docstrign

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* use parametrize

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* populate _dynamic_tied_weights_keys

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* ensure serializable

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* remove extra space

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* apply style

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* merge dregs

* skip offloading tests until transformers changes land

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* use set

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* [Quantization][Decompression] Fix QDQ for dynamic quant; Update NVFP4 Compression Params (#407)

* add compression param; update qdq for batch greater than 1

* make generic

* fix tests

* remove incorrect line change; make generic

* update

* serialize

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* fix typo, comment

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

---------

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
Signed-off-by: Brian Dellabetta <bdellabe@redhat.com>
Co-authored-by: Brian Dellabetta <brian-dellabetta@users.noreply.github.com>
Co-authored-by: Dipika Sikka <dipikasikka1@gmail.com>
brian-dellabetta added a commit that referenced this pull request Aug 12, 2025
* add utilities

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* add tests

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* add additional tests

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* add utils and tests

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* Implement transform factories

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* add permutations

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* add delete_offload_module

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* key inverses by weight

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* fix tests

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* standardize random hadamard

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* prepend input hooks

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* apply sqrt division first

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* use divided hadamards

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* fix typo

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* add random option

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* use random seeds, rename matrix multiply

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* add deterministic generation to random matrix

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* fix perm math

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* update docstrings

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* update docstrings

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* cleanup

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* cleanup 2

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* make seed optional

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* remove iterable check and missing return value

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* Remove unrelated changes

* simplify code

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* implement apply, use in tests

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* use hadamards database file

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* try manifest

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* try setup, update hadamards list

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* fix setup

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* add docstrings, cleanup

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* fix setup, thank you @dbarbuzzi

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* remove numpy, add tests

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* solidify dtype, add gpu tests

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* fix docstring

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* add device option

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* construct on execution device, cache on offload device

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* save construction device changes for later

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* construct on execution device, cache on offload device

* cite nja sloane

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* remove dreg

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* put on device via safe_open

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* nits and docstrings

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* update docstring

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* Merge

* merge with construct: construct in float32

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* construct with same dtype, constructing on fp32 found no difference

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* remove unnecessary imports

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* bugfixes (#375)

Signed-off-by: Brian Dellabetta <bdellabe@redhat.com>

* use factory_kwargs

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* add frozen dict to deps

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* fix style

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* merge

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* use delete_offload_module

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* add docstrign

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* use parametrize

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* populate _dynamic_tied_weights_keys

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* ensure serializable

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* remove extra space

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* apply style

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* merge dregs

* skip offloading tests until transformers changes land

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* use set

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* [Quantization][Decompression] Fix QDQ for dynamic quant; Update NVFP4 Compression Params (#407)

* add compression param; update qdq for batch greater than 1

* make generic

* fix tests

* remove incorrect line change; make generic

* update

* serialize

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* fix typo, comment

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

* include format

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>

---------

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
Signed-off-by: Brian Dellabetta <bdellabe@redhat.com>
Co-authored-by: Kyle Sayers <kylesayrs@gmail.com>
Co-authored-by: Brian Dellabetta <brian-dellabetta@users.noreply.github.com>
Etelis added a commit to Etelis/compressed-tensors that referenced this pull request Sep 11, 2025
… Compression Params (neuralmagic#407)
Etelis added a commit to Etelis/compressed-tensors that referenced this pull request Sep 11, 2025
Etelis added a commit to Etelis/compressed-tensors that referenced this pull request Sep 11, 2025