[Codegen] Keep lowering config when decomposing linalg.pack #20311
This does not look right to me, because the iteration space of a pack op is defined over the non-packed domain, i.e., the number of iteration dimensions is the same as the rank of the source. It might be true on the GPU path, but it is definitely not true on all the CPU targets. I've been working hard to get rid of the pass on CPU codegen and the transpose op is a compute op to me, so it might be fine. Please also update
Actually, at first glance this looked right to me. But I will wait for the discussion to resolve.
I see, thanks for pointing it out. I didn't realize the iteration space rank matched that of the source. I do still think the logic holds though, because the iteration domain bounds are still taken from the result shape of the pack, which matches the transpose bounds. Afaik, the TilingInterface will automatically truncate or fill with zeros to the correct iteration domain. Any tiling configuration that was valid for the pack should also be valid for the transpose (but not the other way around), so I think it is okay to pass the lowering config to the transpose op.

It also allows us to tile the inner tiles of the transpose after decomposition without having to pick a new config, so it is helpful to be able to keep the config. We actually do something similar already with IGEMM, where we set the lowering config based on the matmul that is expected to come out of the ConvToIGEMM transformation.

It is possibly a little brittle to be setting lowering configs with the expectation that they should apply to the decomposed op, but we control pack and unpack decomposition fairly well (only happening in this one pass), so I think it is okay. The alternative solution would be to set an iree_gpu.derived_thread_config on the transpose after/during decomposition, but that honestly feels more fragile to me.

EDIT: I think this is where tile sizes get resized to match the iteration space rank:

EDIT 2: My second PR: #20312 takes advantage of setting tile sizes on the inner tiles, as an example.
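As a rough illustration of the "truncate or fill with zeros" behavior described in the comment above, here is a minimal Python sketch. The helper name `resize_tile_sizes` is hypothetical and is not taken from IREE or MLIR; it only models the idea that a tile-size list is cut or zero-extended to the rank of the iteration domain, with a tile size of 0 meaning "do not tile this dimension".

```python
def resize_tile_sizes(tile_sizes, iteration_rank):
    """Hypothetical helper: fit a tile-size list to an iteration-domain rank.

    Extra entries are dropped; missing entries are filled with 0,
    which by convention means "leave this dimension untiled".
    """
    sizes = list(tile_sizes[:iteration_rank])
    return sizes + [0] * (iteration_rank - len(sizes))


# A config written for a rank-4 op, applied to a rank-2 iteration domain.
print(resize_tile_sizes([1, 1, 8, 16], 2))
# A config written for a rank-2 domain, applied to a rank-4 op.
print(resize_tile_sizes([1, 1], 4))
```

Under this model, a config that was valid for the lower-rank pack iteration domain stays valid on the higher-rank transpose, since the added dimensions are simply left untiled.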
It's taking me a while to get used to this renaming lol, I got so used to tensor.pack :p
It also allows us to tile the inner tiles of the transpose after decomposition without having to pick a new config, so it is helpful to be able to keep the config.
Good point about the tile sizes for inner tiles. I can see the value because we are swizzling inner tiles on GPU.
The alternative solution would be to set an iree_gpu.derived_thread_config on the transpose after/during decomposition, but that honestly feels more fragile to me.
I don't like the iree_gpu.derived_thread_config approach because it hides the strategy. I can't see the tile sizes in the IR, whereas it should be simple in this case.
compiler/src/iree/compiler/Codegen/Common/test/decompose_pack_unpack_ops.mlir
Unblocking, will come back to it later.
@Max191 please update the PR description because the iteration space statement is not correct. I think our new understanding is that it allows us to describe the tiling spec of the inner tiles for further lowering.
Signed-off-by: Max Dawkins <max.dawkins@gmail.com>
When decomposing linalg.pack ops into tensor.pad->tensor.expand_shape->linalg.transpose, apply the lowering_config of the linalg.pack onto the linalg.transpose. This allows further tiling within the inner tiles of the pack after decomposition by setting lowering configs that have tile sizes for the inner tile dimensions of the operation.
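To make the shapes in this decomposition concrete, the following is a shape-level Python sketch, not the pass's actual implementation. It assumes an identity outer_dims_perm and no dynamic dimensions, and the function name `decompose_pack_shapes` is hypothetical. It shows that the final transpose iterates over the outer dims plus the inner tile dims, which is why a lowering config with entries for the inner tile dimensions can apply to it.

```python
def decompose_pack_shapes(src_shape, inner_dims_pos, inner_tiles):
    """Shape-level sketch of pad -> expand_shape -> transpose.

    Assumes an identity outer_dims_perm and static shapes.
    Returns (padded, expanded, perm, packed).
    """
    tile_of = dict(zip(inner_dims_pos, inner_tiles))

    # tensor.pad: round each tiled dim up to a multiple of its tile size.
    padded = [
        -(-d // tile_of[i]) * tile_of[i] if i in tile_of else d
        for i, d in enumerate(src_shape)
    ]

    # tensor.expand_shape: split each tiled dim d into (d // t, t).
    expanded = []
    outer_idx = []  # position of each outer dim in the expanded shape
    inner_idx = {}  # source dim -> position of its inner tile dim
    for i, d in enumerate(padded):
        outer_idx.append(len(expanded))
        if i in tile_of:
            expanded.append(d // tile_of[i])
            inner_idx[i] = len(expanded)
            expanded.append(tile_of[i])
        else:
            expanded.append(d)

    # linalg.transpose: outer dims first, then inner tiles in
    # inner_dims_pos order.
    perm = outer_idx + [inner_idx[i] for i in inner_dims_pos]
    packed = [expanded[p] for p in perm]
    return padded, expanded, perm, packed


padded, expanded, perm, packed = decompose_pack_shapes([30, 50], [0, 1], [8, 16])
print(padded, expanded, perm, packed)
```

For a 30x50 source packed with 8x16 inner tiles, the transpose result has rank 4, so tile sizes can be given for the two inner tile dimensions as well as the two outer ones.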