-
Notifications
You must be signed in to change notification settings - Fork 20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HIP version of asgard #400
Conversation
@ckendrick What changes do you need to the build so they it'll run with |
I tried following your directions on fusionmi50, but go the following error. Do I need to load a compiler first?
|
Try running
If not, then I can share the configuration I am using to put in |
Looks like packages aren't being shared within our common spack setup 😕
|
I think running |
|
Sorry, I forgot to mention that |
I also needed to run |
|
This seems to be a path issue, probably due to the abnormal directory structure from splitting rocm/hip into spack packages (instead of having everything installed to /opt/rocm). Setting the following environment variables should fix the rocm device lib path message. You might have to also purge the build directory and re-run cmake.
|
Can you run |
34c65d6
to
2de03a3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couple thoughts while reviewing these changes
CMakeLists.txt
Outdated
message(STATUS "HIP Libraries: ${HIP_LIBRARIES}") | ||
|
||
if(ASGARD_PLATFORM_NVCC) | ||
find_package(CUDA 9.0 REQUIRED) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we eliminate the old find_package(CUDA)
?
CMakeLists.txt
Outdated
include_directories(SYSTEM ${HIP_INCLUDE_DIRS}) | ||
# assume this include path since HIP_INCLUDE_DIRS is not being set on nvidia platform | ||
include_directories(SYSTEM "${HIP_PATH}/include") | ||
include_directories(${HIPBLAS_INCLUDE_DIRS}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use target_include_directories
?
|
||
# set source file language properties | ||
if(ASGARD_PLATFORM_AMD) | ||
#set_source_files_properties( src/device/kronmult_cuda.cpp PROPERTIES LANGUAGE HIP ) # should work after cmake 3.21 release? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use this now that we require CMake 3.21?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I believe this should work now but I am not able to test it at the moment since the AMD machine is still down.
P tol_factor = 1e-17; | ||
if constexpr (resrc == resource::device) | ||
{ | ||
tol_factor = 1e-7; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
😮
CMakeLists.txt
Outdated
if(ASGARD_PLATFORM_AMD) | ||
target_link_libraries(tensors PRIVATE hip::device) | ||
elseif(ASGARD_PLATFORM_NVCC) | ||
target_link_libraries(tensors PRIVATE ${CUDA_LIBRARIES}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does HIP not take care of linking against CUDA?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't been able to get it to work automatically, but I might be missing something. enable_language(HIP)
seems to cause issues on Nvidia. The closest I've gotten is on the kronmult PR, but that was setting the language to CUDA for each target which may not be the best solution.
The new changes I made is using hip_add_library
and hip_add_executable
(which may be worse than before?) but those still seem to be missing linking in the CUDA libraries.
CMakeLists.txt
Outdated
if(ASGARD_PLATFORM_AMD) | ||
target_link_libraries(lib_dispatch PRIVATE hip::device) | ||
elseif(ASGARD_PLATFORM_NVCC) | ||
target_link_libraries(lib_dispatch PRIVATE ${CUDA_LIBRARIES}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does HIP not take care of linking against CUDA?
@@ -78,7 +81,7 @@ option (ASGARD_PROFILE_PERF "enable profiling support for using linux perf" "") | |||
option (ASGARD_PROFILE_VALGRIND "enable profiling support for using valgrind" "") | |||
option (ASGARD_GRAPHVIZ_PATH "optional location of bin/ containing dot executable" "") | |||
option (ASGARD_IO_HIGHFIVE "Use the HighFive HDF5 header library for I/O" OFF) | |||
option (ASGARD_USE_CUDA "Optional CUDA support for asgard" OFF) | |||
option (ASGARD_USE_HIP "Optional HIP support for asgard" OFF) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using CUDA through the HIP API is not a great idea at the moment. The biggest issue comes from the use of math-libraries such as cuBlas and rocBlas, where they don't fully mirror or port the capabilities (especially true for sparse calls). Lesser problems (but problems never the less) come from availability and support across platforms, Nvidia based systems do not have universal support for HIP, also optimizations and performance.
HIP and CUDA can sit side by side in the code and have only one flipped on/off. All we need (usually) is to change the abstraction of memory allocation and data movement, as well as the kernels which seldom require any change, i.e., we can use the same kernels, just compile them differently.
@ckendrick @mkstoyanov I recommend closing this as we're unlikely to continue using the earlier kronmult implementation. |
Proposed changes
Note: Merge kronmult PR before this
This replaces CUDA calls in Asgard with HIP equivalents and adjusts CMake for building with HIP.
The HIP version of the underlying Kronmult library needs to be used to fully utilize HIP for AMD platforms. CMake 3.21 should allow for a lot of CMake simplifications.
Building on
fusionmi50
:Building on
fusiont5
:Building on
fusiont6
:What type(s) of changes does this code introduce?
Put an
x
in the boxes that apply.Does this introduce a breaking change?
What systems has this change been tested on?
fusiont5
fusiont6
fusionmi50
Ubuntu20.04 with Nvidia GPUs (CUDA 10.2 - 11.4)
Checklist
Put an x in the boxes that apply. You can also fill these out after creating
the PR. If you're unsure about any of them, don't hesitate to ask. This is
simply a reminder of what we are going to look for before merging your code.