[Mosaic GPU] Call mbarrier.try_wait only once mbarrier.test_wait fails
The llvm.expect intrinsic puts the loop at the end of the program, allowing the whole barrier to be compiled to a test_wait + predicated branch that is immediately followed by the continuation. This seems to make the happy path a little faster, which can help reduce the barrier overhead for compute-bound kernels.

PiperOrigin-RevId: 645007019
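The control flow described above can be sketched on the host in plain C++. This is only an illustration under stated assumptions, not the Mosaic GPU lowering itself: `FakeBarrier`, `failing_polls`, and `wait_on` are hypothetical names, the barrier is simulated rather than a real mbarrier, and `__builtin_expect` (a GCC/Clang builtin that Clang lowers to `llvm.expect`) stands in for the intrinsic the commit emits directly.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for a hardware mbarrier (assumption, not a real API).
struct FakeBarrier {
  std::atomic<bool> phase_done{false};
  int test_calls = 0;     // how many times test_wait was issued
  int try_calls = 0;      // how many times try_wait was issued
  int failing_polls = 0;  // simulated try_wait polls that fail before completion

  // Models mbarrier.test_wait: a non-blocking check of the phase.
  bool test_wait() {
    ++test_calls;
    return phase_done.load(std::memory_order_acquire);
  }

  // Models mbarrier.try_wait: a potentially blocking check that can still
  // return false, so callers have to retry it in a loop.
  bool try_wait() {
    ++try_calls;
    if (phase_done.load(std::memory_order_acquire)) return true;
    if (failing_polls > 0) {
      --failing_polls;
      return false;
    }
    phase_done.store(true, std::memory_order_release);
    return true;
  }
};

// Happy path: one test_wait, then fall straight through to the continuation.
// The retry loop is marked unlikely so the compiler can move it out of line.
inline void wait_on(FakeBarrier& b) {
  if (__builtin_expect(!b.test_wait(), 0)) {
    while (!b.try_wait()) {
      // Keep retrying until the barrier phase completes.
    }
  }
}
```

When the barrier has already completed, `wait_on` issues a single `test_wait` and never touches `try_wait`; the loop only runs on the slow path, which is the case the branch hint tells the compiler to treat as cold.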
Showing 1 changed file with 43 additions and 5 deletions.