-
Notifications
You must be signed in to change notification settings - Fork 540
[Core] Fix inconsistent logic in C++ tensor class #2330
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: Tim Moon <tmoon@nvidia.com>
Still seeing correctness issues. Signed-off-by: Tim Moon <tmoon@nvidia.com>
Signed-off-by: Tim Moon <tmoon@nvidia.com>
Signed-off-by: Tim Moon <tmoon@nvidia.com>
Signed-off-by: Tim Moon <tmoon@nvidia.com>
|
/te-ci |
for more information, see https://pre-commit.ci
Signed-off-by: Tim Moon <tmoon@nvidia.com>
Signed-off-by: Tim Moon <tmoon@nvidia.com>
for more information, see https://pre-commit.ci
|
/te-ci L1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Greptile Overview
Greptile Summary
Refactored C++ Tensor class to fix inconsistent initialization and data checking logic by using shape=[0] for uninitialized tensors instead of empty shapes.
Key Changes:
- Updated
SimpleTensorandTensordefault constructors to initialize withshape=[0]instead ofshape=[], fixing semantic confusion between 0-D tensors (which have 1 entry) and uninitialized tensors - Refactored
has_data()andhas_columnwise_data()to consistently check both pointer validity and non-zero element count, making them reliable indicators of "is it safe to touch the data pointer?" - Updated
shape()anddim()methods to use explicitis_shape_trivial()checks instead of relying onhas_data()/has_columnwise_data(), fixing cases where data is initialized but pointer shouldn't be accessed (e.g., zero-dim tensors) - Propagated changes throughout normalization, swizzle, transpose, and PyTorch integration code to use
numel() != 0checks instead ofshape.empty() - Added defensive null checks in GEMM code to prevent out-of-bounds access with empty tensors
- Fixed workspace size queries to return
{1}instead of{0}to distinguish from uninitialized state
Issues Found:
- Two error messages in
swizzle.cuincorrectly referenceinput->scale_inv.shapeinstead ofinput->columnwise_scale_inv.shape(lines 378 and 392)
Confidence Score: 4/5
- This PR is safe to merge after fixing the two error message bugs in swizzle.cu
- The refactoring improves code correctness by fixing semantic confusion between 0-D and uninitialized tensors. The changes are well-tested and systematic. Score is 4 instead of 5 due to: (1) two copy-paste errors in error messages that would display incorrect variable names during debugging, and (2) the extensive nature of the refactoring across 11 files increases risk of edge cases
- Pay close attention to transformer_engine/common/swizzle/swizzle.cu which has two error message bugs on lines 378 and 392
Important Files Changed
File Analysis
| Filename | Score | Overview |
|---|---|---|
| transformer_engine/common/common.h | 4/5 | Updated Tensor class to use shape=[0] for uninitialized tensors instead of shape=[]; refactored has_data() and has_columnwise_data() to check both pointer and numel(); improved shape() method logic with consistent is_shape_trivial checks |
| transformer_engine/common/include/transformer_engine/transformer_engine.h | 5/5 | Updated TensorWrapper to return shape=[0] for null tensors instead of shape=[]; added emptyShape constant; improved constructor logic to set appropriate shapes for null pointers |
| transformer_engine/common/transformer_engine.cpp | 5/5 | Updated CheckScaleTensorShape to distinguish between FP8 and high-precision tensors; changed nvte_get_tensor_param to return shape=[0] for null tensors |
| transformer_engine/common/swizzle/swizzle.cu | 3/5 | Refactored swizzle_scaling_factors to check pointer and numel() instead of has_data(); improved logic with explicit has_rowwise_scale_inv/has_columnwise_scale_inv checks. Found two error messages that incorrectly reference scale_inv.shape instead of columnwise_scale_inv.shape |
Sequence Diagram
sequenceDiagram
participant App as Application Code
participant Tensor as Tensor Class
participant SimpleTensor as SimpleTensor
participant Wrapper as TensorWrapper
Note over App,Wrapper: Uninitialized Tensor Creation
App->>Tensor: Tensor()
Tensor->>SimpleTensor: SimpleTensor()
SimpleTensor->>SimpleTensor: Set shape=[0], dptr=nullptr
SimpleTensor-->>Tensor: Uninitialized SimpleTensor
Note over Tensor: All scale tensors initialized with shape=[0]
Tensor-->>App: Uninitialized Tensor
Note over App,Wrapper: Checking Tensor State
App->>Tensor: has_data()
Tensor->>Tensor: Check: dptr != nullptr && numel() != 0
Tensor-->>App: false (uninitialized)
Note over App,Wrapper: After Data Assignment
App->>Tensor: Set data.dptr, data.shape
App->>Tensor: has_data()
Tensor->>SimpleTensor: numel()
SimpleTensor->>SimpleTensor: Calculate product of shape dims
SimpleTensor-->>Tensor: numel > 0
Tensor->>Tensor: Check: dptr != nullptr && numel() != 0
Tensor-->>App: true (has valid data)
Note over App,Wrapper: TensorWrapper Integration
App->>Wrapper: TensorWrapper(dptr, shape, dtype)
Wrapper->>Wrapper: Check if amax_dptr == nullptr
alt amax_dptr is null
Wrapper->>Wrapper: Use emptyShape {0}
else amax_dptr is valid
Wrapper->>Wrapper: Use defaultShape {1}
end
Wrapper-->>App: Configured TensorWrapper
11 files reviewed, 2 comments
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com> Signed-off-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
for more information, see https://pre-commit.ci
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Greptile Overview
Greptile Summary
This PR fixes inconsistent logic in the C++ Tensor class by standardizing how uninitialized and empty tensors are handled.
Key Changes:
- Uninitialized tensors now constructed with
shape=[0]instead ofshape=[](0-D tensors should have 1 entry) Tensor::has_data()andTensor::has_columnwise_data()now consistently check both pointer validity AND non-zero element count- Swizzle functions updated to explicitly check pointer and size instead of relying on
has_data()/has_columnwise_data(), which allows proper handling of edge cases where tensors have valid pointers but zero dimensions
Impact:
- Fixes potential segfaults when accessing data pointers of empty tensors
- Ensures consistent behavior across the codebase when checking if tensor data can be safely accessed
- Previous comment noted error messages referenced wrong variable (
scale_inv.shapeinstead ofcolumnwise_scale_inv.shape), which was already corrected in lines 378 and 393
Confidence Score: 5/5
- This PR is safe to merge - it's a well-structured refactoring that fixes edge case handling
- The changes are defensive and fix potential bugs with empty tensor handling. The refactoring replaces implicit checks with explicit pointer and size validation, making the code more robust and easier to reason about. The logic is consistent throughout and follows a clear pattern.
- No files require special attention
Important Files Changed
File Analysis
| Filename | Score | Overview |
|---|---|---|
| transformer_engine/common/swizzle/swizzle.cu | 5/5 | Replaces inappropriate uses of has_data()/has_columnwise_data() with explicit pointer and size checks to handle empty tensors correctly |
Sequence Diagram
sequenceDiagram
participant Caller
participant swizzle_scaling_factors
participant Tensor
participant CUDA_Kernel
Caller->>swizzle_scaling_factors: input, output, stream
swizzle_scaling_factors->>swizzle_scaling_factors: Check scaling mode
swizzle_scaling_factors->>Tensor: CheckInputTensor(input)
swizzle_scaling_factors->>Tensor: CheckInputTensor(output)
Note over swizzle_scaling_factors: Check if scaling factors exist
swizzle_scaling_factors->>Tensor: scale_inv.dptr != nullptr?
swizzle_scaling_factors->>Tensor: scale_inv.numel() != 0?
swizzle_scaling_factors->>Tensor: columnwise_scale_inv.dptr != nullptr?
swizzle_scaling_factors->>Tensor: columnwise_scale_inv.numel() != 0?
alt No scaling factors
swizzle_scaling_factors-->>Caller: return early
end
Note over swizzle_scaling_factors: Deduce tensor dimensions from shape
swizzle_scaling_factors->>Tensor: Get scale_inv.shape or columnwise_scale_inv.shape
Note over swizzle_scaling_factors: Validate dimensions
swizzle_scaling_factors->>swizzle_scaling_factors: Check m % SF_TILE_DIM_M == 0
swizzle_scaling_factors->>swizzle_scaling_factors: Check k % SF_TILE_DIM_K == 0
Note over swizzle_scaling_factors: Choose swizzle implementation
alt Rowwise swizzle needed
swizzle_scaling_factors->>CUDA_Kernel: swizzle_row_scaling_kernel
CUDA_Kernel-->>swizzle_scaling_factors: Complete
end
alt Columnwise swizzle needed
swizzle_scaling_factors->>CUDA_Kernel: swizzle_col_scaling_kernel
CUDA_Kernel-->>swizzle_scaling_factors: Complete
end
swizzle_scaling_factors-->>Caller: return
1 file reviewed, no comments
Description
This PR fixes some hacky logic in the C++
Tensorclass:shape=[0]. Previously we constructed them as 0-D tensors, which should have one entry (see [PyTorch][MOE] Support NVFP4 Grouped Linear #2215 (comment)).Tensor::has_dataandTensor::has_columnwise_datahave a consistent meaning: "is it safe to touch the data pointer?".Tensor::has_data/Tensor::has_columnwise_datainappropriately (Tensor::shape, swizzling). These are cases where the data is initialized but the pointer is not safe to touch, e.g. when the tensor has a zero dim. Note that I've only fixed places that were causing problems, and I haven't fully gone through the entire code base.Type of change
Changes
Tensor::has_dataandTensor::has_columnwise_data.Tensor::has_data/Tensor::has_columnwise_datafromTensor::shapeand swizzling.Checklist: