Level Zero Device #483
Closed
common_gpu
Add a new level_zero device (WIP)
- copy device_cuda in device_level_zero and rename things
- module_init and module_fini for level_zero
Need to factorize a little bit more.
Factorizing (need to do it in base)
Port above new common
- Add multiple CMake logic files and commands
- jdf2c.c now generates dpcpp output files when needed
- make DEV_DPCPP be an alias to DEV_LEVEL_ZERO
- Command Lists for I/O (streams of id 0 and 1) are still immediate
- Command Lists for computations (streams of id >= 2) are now normal lists connected to a queue; that queue exists both as a compute level-zero queue and as a DPC++ queue
- Missing compilation logic to compile generated dpc++ code and link it with the target binary
Risk: it is unclear whether the user can still push orders / events in the command list after it is closed, and it is necessary to close it to force the orders to be pushed on the queue. I might need to create a new command list after each close, and attach the closed command list to the event for garbage collection.
Adapt findlevel-zero.cmake to support systems where pkg-config is broken
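The command-list policy and the risk noted above can be sketched as a small model. This is only an illustration under stated assumptions: it does not call the Level Zero API, every name in it (stream_list_kind, mock_list_t, mock_stream_submit, ...) is hypothetical, and it assumes the pessimistic case where a closed list refuses new commands. It shows streams 0 and 1 mapped to immediate lists, streams >= 2 mapped to regular lists, and a submit step that closes the current list, attaches it to an event for later release, and installs a fresh list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model of the policy described in the commit message.
 * Nothing here is a Level Zero call. */
typedef enum { LIST_IMMEDIATE, LIST_REGULAR } list_kind_t;

/* Streams 0 and 1 carry I/O and keep immediate command lists;
 * streams >= 2 carry computations and use regular lists bound to a queue. */
list_kind_t stream_list_kind(int stream_id)
{
    assert(stream_id >= 0);
    return (stream_id < 2) ? LIST_IMMEDIATE : LIST_REGULAR;
}

typedef struct mock_list_s {
    int closed;      /* set once the list has been closed for submission */
    int nb_commands; /* number of commands appended so far */
} mock_list_t;

typedef struct mock_event_s {
    mock_list_t *owned_list; /* closed list to release when the event completes */
} mock_event_t;

typedef struct mock_stream_s {
    mock_list_t *current; /* list that currently accepts new commands */
} mock_stream_t;

/* Allocate an empty, open list (NULL on allocation failure). */
mock_list_t *mock_list_new(void)
{
    return calloc(1, sizeof(mock_list_t));
}

/* Assumption: appending to a closed list is refused.
 * Returns 0 on success, -1 if the list is NULL or closed. */
int mock_list_append(mock_list_t *l)
{
    if (l == NULL || l->closed) return -1;
    l->nb_commands++;
    return 0;
}

/* Close the current list so its commands can reach the queue, hand it to
 * the event for garbage collection, and install a fresh open list.
 * Returns 0 on success, -1 if the fresh list cannot be allocated. */
int mock_stream_submit(mock_stream_t *s, mock_event_t *ev)
{
    mock_list_t *fresh = mock_list_new();
    if (fresh == NULL) return -1;
    s->current->closed = 1;
    ev->owned_list = s->current;
    s->current = fresh;
    return 0;
}

/* To be called once the event is known to be complete:
 * releases the closed list attached to it. */
void mock_event_complete(mock_event_t *ev)
{
    free(ev->owned_list);
    ev->owned_list = NULL;
}
```

With this shape, the stream always holds an open list, so new orders never target a closed one, and the lifetime of each closed list is tied to the event that tracks its completion. Whether the real runtime requires this, or allows a closed list to be reset and reused instead, is exactly the open question raised in the risk note.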
…el Zero update use_cuda / use_cuda_index have been renamed to follow proper naming scheme; do the same for level_zero
…ry allocation request in wrapper.
…eated immediate (and they cannot be immediate if we want to get their Command Queue, which is necessary for the DPC++ interface)
Typo and multiple CMake fixes to make CMake link with DPCPP generated files
Buffer interface is not required. We can use the USM oneMKL interface; it seems to work OK. Need to check for performance.
We apparently cannot mix immediate and non-immediate command lists, or at least doing so makes the passing of command queues unreliable.
There is an exception in data.c in how we handle GPU copies; it must be ported to Level Zero too.
The Level Zero runtime has an atexit procedure to delete command queues, and this seems to conflict with our own actions to delete the command queues...
NULL is not a valid MPI datatype when compiling with a clone of MPICH. The value doesn't matter in this case, just cast
Some fixes in device level_zero
Temp fix for termination detection -- tag size must be made portable. TODO!
Fix the subsystem test. Need to backport fixes in the MCA device
Fully functional sketch for level zero
…, because command lists (or work) submitted to the command queues by SYCL (typically oneMKL) can complete in parallel with events belonging to other command lists.
…avoid polluting their namespace; cleanup some unused variables
…e LevelZero library as at compile time in PaRSECConfig.cmake
Superseded by #486
This branch implements a Level Zero device to support Intel (and other oneAPI) GPUs.