[LLVMGPU] Support linalg.pack through LLVMGPUTileAndFuse #20312
Signed-off-by: Max Dawkins <max.dawkins@gmail.com>
Operation *op) {
static bool elementHasPowerOfTwoBitwidth(Value operand) {
  Type elementType = getElementTypeOrSelf(operand.getType());
  return isa<IntegerType, FloatType>(elementType) &&
nit: you can use elementType.isIntOrFloat()
@@ -2549,10 +2500,6 @@ static LogicalResult setRootConfig(IREE::GPU::TargetAttr target,
        LDBG("Winograd Config");
        return setWinogradOpConfig(target, entryPointFn, winogradOp);
      })
      .Case<linalg::PackOp>([&](auto packOp) {
Where is the entry point for setting the config on the pack op now? I'm having a hard time finding it in the surrounding code.
It becomes part of the Default case now, and is handled by IREE::GPU::setTileAndFuseLoweringConfig. The implementation lives in ConfigUtils.cpp.
I see, I missed this. I don't understand why we have a flag on the default case, so that the default is not the default anymore.
iree/compiler/src/iree/compiler/Codegen/LLVMGPU/KernelConfig.cpp
Lines 2571 to 2578 in ce2585a
if (!clLLVMGPUVectorizePipeline) {
  if (succeeded(IREE::GPU::setTileAndFuseLoweringConfig(
          target, entryPointFn, computeOp))) {
    LDBG("Tile and fuse default config");
    return success();
  }
}
return setRootDefaultConfig(target, entryPointFn, computeOp);
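The control flow in the snippet above can be sketched in isolation as follows. This is only an illustration of the try-then-fall-back shape: `Op`, `Strategy`, `trySetTileAndFuseConfig`, and `selectDefaultStrategy` are hypothetical stand-ins, not IREE APIs, and the boolean parameter stands in for the `clLLVMGPUVectorizePipeline` flag.

```cpp
// Hypothetical stand-ins for illustration only; none of these are IREE types.
enum class Strategy { TileAndFuse, RootDefault };

struct Op {
  // Assumption: whether the tile-and-fuse config selection would succeed.
  bool tileAndFuseSupported;
};

// Mimics a config setter that reports success or failure.
bool trySetTileAndFuseConfig(const Op &op) { return op.tileAndFuseSupported; }

// Tile-and-fuse is tried first unless the flag is set; anything it rejects
// falls through to the root default config.
Strategy selectDefaultStrategy(const Op &op, bool forceVectorizePipeline) {
  if (!forceVectorizePipeline && trySetTileAndFuseConfig(op)) {
    return Strategy::TileAndFuse;
  }
  return Strategy::RootDefault;
}
```

With this shape, the flag only ever disables the first attempt; the fallback path stays reachable either way.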
// We do not expect to find multiple pack/unpack ops in the same dispatch
// region, so we can simply return the multiples for the given `packOp`.
Is it true? I thought that there are cases like unpack -> elementwise/reduction -> pack?
The assumption is not about dispatch formation. It is about root op selection. If we have a dispatch like unpack -> elementwise -> pack, then the elementwise op would become the root op, and we will not set a config on the pack/unpack ops.
I think I let the assumptions about the lowering config selection leak into the TileInferenceUtils, so it is quite confusing. I'll try to figure out a better way to handle this.
I see what you mean. So it is more that we do not expect to find multiple pack/unpack ops when the root op is a pack op. What if we have an unpack->pack dispatch? Which op is the root op?
@@ -560,16 +590,41 @@ LogicalResult setTileAndFuseLoweringConfig(IREE::GPU::TargetAttr target,
    return failure();
  }

  SmallVector<unsigned int> partitionableLoops;
  linalgOp.getParallelDims(partitionableLoops);
  // SmallVector<unsigned int> partitionableLoops;
nit: delete the comment
unsigned minBitwidth;
unsigned representativeBitWidth;
bool vectorizable;
It is better to initialize them in the struct definition. Otherwise, reading them may lead to undefined behavior. E.g., vectorizable is not initialized in the pack distribution config below.
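A minimal sketch of that suggestion, using default member initializers. The struct name `DistributionInfo`, the helper `makePackDistributionInfo`, and the chosen default values are assumptions for illustration; only the three field names come from the diff.

```cpp
// Hypothetical struct name and defaults; only the field names are from the diff.
struct DistributionInfo {
  unsigned minBitwidth = 0;
  unsigned representativeBitWidth = 0;
  bool vectorizable = false;
};

// A call site that only fills in the fields it knows about.
DistributionInfo makePackDistributionInfo(unsigned bitwidth) {
  DistributionInfo info;
  info.minBitwidth = bitwidth;
  info.representativeBitWidth = bitwidth;
  // `vectorizable` is not assigned here and keeps its in-class default, so a
  // later read is well defined instead of reading an indeterminate value.
  return info;
}
```

Without the initializers, `DistributionInfo info;` would leave all three fields indeterminate, and reading `vectorizable` before assigning it would be undefined behavior.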
bool projPerm =
    llvm::all_of(linalgOp.getIndexingMapsArray(),
                 [](AffineMap map) { return map.isProjectedPermutation(); });
bool powTwo = llvm::all_of(op->getOperands(), elementHasPowerOfTwoBitwidth);
optional nit: I'd name them isProjPerm and isPowerOfTwo.
Add lowering config selection logic for linalg.pack through LLVMGPUTileAndFuse, and remove the old LLVMGPUPackUnPack pipeline. Pack lowering configs are set similarly to other linalg ops going through TileAndFuse, because they are effectively just a linalg.transpose with a padded input.
fixes #20212
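To illustrate why a pack behaves like a transpose with a padded input, here is a rough standalone sketch of packing a 2-D row-major matrix into tiles. It assumes the simplest case (both dimensions tiled, no outer dimension permutation); `pack2D` and its parameters are hypothetical and not part of MLIR or IREE.

```cpp
#include <cstddef>
#include <vector>

// Packs a row-major rows x cols matrix into tiles of tileRows x tileCols.
// The result is laid out as [outerRow][outerCol][innerRow][innerCol],
// flattened row-major. Positions past the source bounds hold padValue.
std::vector<int> pack2D(const std::vector<int> &src, std::size_t rows,
                        std::size_t cols, std::size_t tileRows,
                        std::size_t tileCols, int padValue) {
  // Round up so that partial tiles at the edges are still materialized.
  std::size_t outerRows = (rows + tileRows - 1) / tileRows;
  std::size_t outerCols = (cols + tileCols - 1) / tileCols;
  std::vector<int> dst(outerRows * outerCols * tileRows * tileCols, padValue);
  for (std::size_t r = 0; r < rows; ++r) {
    for (std::size_t c = 0; c < cols; ++c) {
      std::size_t outerR = r / tileRows, innerR = r % tileRows;
      std::size_t outerC = c / tileCols, innerC = c % tileCols;
      // Pure index remapping: every source element is copied exactly once.
      std::size_t dstIndex =
          ((outerR * outerCols + outerC) * tileRows + innerR) * tileCols +
          innerC;
      dst[dstIndex] = src[r * cols + c];
    }
  }
  return dst;
}
```

Each output element is either a copy of one input element or the padding value, which is why tiling it the same way as other data-movement linalg ops is plausible.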