[Quantizer] PerTensorAffineQuantizer operations #2828
Conversation
I really appreciate your initiative in working on a new quantizer class for nntrainer!
I'd also like to share a small idea:
It is possible that, in the future, a quantization scheme such as the following may be required:
```c
// Types.hpp from mllm
...
typedef struct {
  float d;                  // delta
  int8_t qs[QK_K];          // quants
  int16_t bsums[QK_K / 16]; // sum of quants in groups of 16
} block_q8_K;
```
This is because, at the computational-kernel level, having such additional information helps reduce the cost a lot!
Although this is not urgent to consider, I wanted to let you know about it to avoid potential rework in the future.
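(To illustrate the point, here is a hedged sketch of how a precomputed per-group sum of quants can save work in a quantized dot product. This is not code from mllm or nntrainer; `GroupQuant`, `QK`, and `dot_group` are hypothetical names, and the group size is an assumption.)

```cpp
#include <cstdint>

constexpr int QK = 16; // group size (assumption; mllm groups by QK_K / 16)

struct GroupQuant {
  float d;       // scale (delta): x ≈ d * qs[i]
  int8_t qs[QK]; // quantized activations
  int16_t bsum;  // precomputed sum of qs, filled once at quantization time
};

// Weight group quantized affinely: w ≈ d * wq + m.
// dot(x, w) ≈ x.d * (d * Σ xq_i * wq_i  +  m * Σ xq_i)
// The second term reuses x.bsum instead of re-summing the activations
// on every call, which is where the kernel-level savings come from.
float dot_group(const GroupQuant &x, const int8_t wq[QK], float d, float m) {
  int32_t acc = 0;
  for (int i = 0; i < QK; ++i)
    acc += int32_t(x.qs[i]) * int32_t(wq[i]);
  return x.d * (d * float(acc) + m * float(x.bsum));
}
```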
Force-pushed from 16764fc to 0b49cd0
Thank you for sharing the information! There are various quantization parameters for different quantization schemes.
Force-pushed from d104aac to 400105e
Yes, I mean that even with the current quantization algorithm (uniform quantization, etc.), it might be better for qParam to hold such additional information. Of course, there may be cases where this is not necessary, but at least for matrix multiplication operations, having that information can make the algorithm implementation more effective. In a nutshell, even current algorithms sometimes require additional information like that and sometimes they don't, so separate modifications might be needed in the future. I'm just letting you know to prevent additional fixes later 😅
This PR adds initial PerTensorAffineQuantizer operation implementations. This change allows users to quantize and dequantize tensors. Note that the current implementation is naive and has limited features; an optimized version will be introduced in a later PR.

**Self-evaluation:**
1. Build test: [X] Passed [ ] Failed [ ] Skipped
2. Run test: [X] Passed [ ] Failed [ ] Skipped

Signed-off-by: Donghyeon Jeong <dhyeon.jeong@samsung.com>
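(For readers unfamiliar with the scheme, below is a minimal sketch of per-tensor affine quantization. It is illustrative only and not the PR's actual implementation; `AffineQParams`, `compute_qparams`, and the int8 range are assumptions.)

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

struct AffineQParams {
  float scale;
  int32_t zero_point;
};

// Derive a single scale/zero-point pair from the tensor's min/max range.
AffineQParams compute_qparams(const std::vector<float> &x, int32_t qmin = -128,
                              int32_t qmax = 127) {
  if (x.empty())
    return {1.0f, 0};
  auto [lo, hi] = std::minmax_element(x.begin(), x.end());
  float scale = (*hi - *lo) / float(qmax - qmin);
  if (scale == 0.0f)
    scale = 1.0f; // avoid division by zero for constant tensors
  // Choose zero_point so that the tensor max maps to qmax.
  int32_t zp = qmax - int32_t(std::lround(*hi / scale));
  return {scale, std::clamp(zp, qmin, qmax)};
}

// q = clip(round(x / scale) + zero_point, qmin, qmax)
std::vector<int8_t> quantize(const std::vector<float> &x,
                             const AffineQParams &p) {
  std::vector<int8_t> q(x.size());
  for (size_t i = 0; i < x.size(); ++i) {
    int32_t v = int32_t(std::lround(x[i] / p.scale)) + p.zero_point;
    q[i] = int8_t(std::clamp(v, -128, 127));
  }
  return q;
}

// x' = scale * (q - zero_point); the lossy inverse of quantize().
std::vector<float> dequantize(const std::vector<int8_t> &q,
                              const AffineQParams &p) {
  std::vector<float> x(q.size());
  for (size_t i = 0; i < q.size(); ++i)
    x[i] = p.scale * float(q[i] - p.zero_point);
  return x;
}
```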
Force-pushed from 400105e to 9332b41
Great Work!!!
LGTM!
```cpp
* @param val value to clip
* @param lower lower bound
* @param upper upper bound
* @return T cliped data
```
clipped?
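(For reference, a minimal helper consistent with the doc comment above might look like the following; this is a sketch, not the actual code under review, and is equivalent to `std::clamp`.)

```cpp
#include <algorithm>

// Clip val into the inclusive range [lower, upper].
template <typename T>
T clip(T val, T lower, T upper) {
  return std::max(lower, std::min(val, upper));
}
```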
Nice work. LGTM!
LGTM
LGTM.