Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HIP version of asgard #400

Closed
wants to merge 36 commits into from
Closed

Conversation

ckendrick
Copy link
Collaborator

@ckendrick ckendrick commented Jul 23, 2021

Proposed changes

Note: Merge kronmult PR before this

This replaces CUDA calls in Asgard with HIP equivalents and adjusts CMake for building with HIP.

The HIP version of the underlying Kronmult library needs to be used to fully utilize HIP for AMD platforms. CMake 3.21 should allow for a lot of CMake simplifications.

Building on fusionmi50:

spack load hipblas@develop

cmake -DCMAKE_CXX_COMPILER=clang++ -DASGARD_USE_HIP=ON -DGPU_ARCH=906 -DBUILD_REPO_KRONMULT=https://github.com/ckendrick/kronmult.git -DBUILD_TAG_KRONMULT=9a8d70f -Dhip-lang_DIR=${HIP_PATH}/lib/cmake/hip-lang/ -DCMAKE_CXX_FLAGS=-I/usr/include/openblas/ ../

Building on fusiont5:

spack load cuda@10.2.89%gcc@7.4.0
export CUDA_PATH=${CUDA_HOME}
export HIP_PATH=/opt/rocm-4.2.0/hip/
export HIP_PLATFORM=nvidia
export HIP_COMPILER=nvcc
export HIP_RUNTIME=cuda

cmake -DCMAKE_CUDA_COMPILER=${CUDA_PATH}/bin/nvcc -DASGARD_USE_HIP=ON ../

Building on fusiont6:

spack load cuda@11.3.0
export CUDA_PATH=${CUDA_HOME}

cmake -DASGARD_USE_HIP=ON ../

What type(s) of changes does this code introduce?

Put an x in the boxes that apply.

  • Bugfix
  • New feature
  • Code style update (formatting, renaming)
  • Refactoring (no functional changes, no api changes)
  • Build related changes
  • Documentation content changes
  • Other (please describe):

Does this introduce a breaking change?

  • Yes
  • No

What systems has this change been tested on?

fusiont5
fusiont6
fusionmi50
Ubuntu20.04 with Nvidia GPUs (CUDA 10.2 - 11.4)

Checklist

Put an x in the boxes that apply. You can also fill these out after creating
the PR. If you're unsure about any of them, don't hesitate to ask. This is
simply a reminder of what we are going to look for before merging your code.

  • this PR is up to date with current the current state of 'develop'
  • code added or changed in the PR has been clang-formatted
  • this PR adds tests to cover any new code, or to catch a bug that is being fixed
  • documentation has been added (if appropriate)

@quantumsteve
Copy link
Collaborator

@ckendrick What changes do you need to the build so they it'll run with -DASGARD_USE_HIP=ON?

@quantumsteve
Copy link
Collaborator

I tried following your directions on fusionmi50, but go the following error. Do I need to load a compiler first?

[svh@fusionmi50 ~]$ source /opt/spack/share/spack/setup-env.sh
[svh@fusionmi50 ~]$ spack load --first hip@4.2.0
==> Error: No compilers for operating system centos7 satisfy spec gcc@7.3.1

@ckendrick
Copy link
Collaborator Author

I tried following your directions on fusionmi50, but go the following error. Do I need to load a compiler first?

[svh@fusionmi50 ~]$ source /opt/spack/share/spack/setup-env.sh
[svh@fusionmi50 ~]$ spack load --first hip@4.2.0
==> Error: No compilers for operating system centos7 satisfy spec gcc@7.3.1

Try running spack compiler find which should automatically find the compilers installed. Usually you just have to do this once with a new spack setup. If it works, then you should have the following after running spack compiler list:

==> Available compilers
-- clang centos7-x86_64 -----------------------------------------
clang@12.0.0

-- gcc centos7-x86_64 -------------------------------------------
gcc@7.3.1  gcc@4.8.5

If not, then I can share the configuration I am using to put in ~/.spack/linux/compilers.yaml

@quantumsteve
Copy link
Collaborator

Looks like packages aren't being shared within our common spack setup 😕

[svh@fusionmi50 ~]$ source /opt/spack/share/spack/setup-env.sh 
[svh@fusionmi50 ~]$  spack compiler list
==> Available compilers
-- gcc centos7-x86_64 -------------------------------------------
gcc@4.8.5

@ckendrick
Copy link
Collaborator Author

I think running source /opt/rh/devtoolset-7/enable before setting up the spack environment should get the newer version of gcc

@quantumsteve
Copy link
Collaborator

[svh@fusionmi50 ~]$ source /opt/rh/devtoolset-7/enable
[svh@fusionmi50 ~]$ source /opt/spack/share/spack/setup-env.sh 
[svh@fusionmi50 ~]$ spack compiler list
==> Available compilers
-- gcc centos7-x86_64 -------------------------------------------
gcc@4.8.5

@ckendrick
Copy link
Collaborator Author

Sorry, I forgot to mention that spack compiler find should be re-run as well.

@quantumsteve
Copy link
Collaborator

I also needed to run spack load cmake@3.21.0

@quantumsteve
Copy link
Collaborator

The following tests FAILED:
	  1 - adapt-test (Subprocess aborted)
	  2 - basis-test (Subprocess aborted)
	  3 - batch-test (Subprocess aborted)
	  4 - boundary_conditions-test (Subprocess aborted)
	  5 - coefficients-test (Subprocess aborted)
	  6 - distribution-test (Subprocess aborted)
	  7 - elements-test (Subprocess aborted)
	  8 - fast_math-test (Subprocess aborted)
	  9 - kronmult-test (Subprocess aborted)
	 10 - lib_dispatch-test (Subprocess aborted)
	 11 - matlab_utilities-test (Subprocess aborted)
	 12 - pde-test (Subprocess aborted)
	 13 - permutations-test (Subprocess aborted)
	 15 - quadrature-test (Subprocess aborted)
	 16 - solver-test (Subprocess aborted)
	 17 - tensors-test (Subprocess aborted)
	 18 - time_advance-test (Subprocess aborted)
	 20 - transformations-test (Subprocess aborted)
	 21 - kronmult_cuda-test (Subprocess aborted)
[svh@fusionmi50 build]$ ctest -R adapt-test --verbose
UpdateCTestConfiguration  from :/home/svh/asgard/build/DartConfiguration.tcl
UpdateCTestConfiguration  from :/home/svh/asgard/build/DartConfiguration.tcl
Test project /home/svh/asgard/build
Constructing a list of tests
Done constructing a list of tests
Updating test list for fixtures
Added 0 tests to meet fixture requirements
Checking test dependency graph...
Checking test dependency graph end
test 1
    Start 1: adapt-test

1: Test command: /home/svh/asgard/build/adapt-tests
1: Test timeout computed to be: 10000000
1: adapt-tests: /home/svh/asgard/src/lib_dispatch.cpp:39: device_handler::device_handler(): Assertion `success == HIPBLAS_STATUS_SUCCESS' failed.
1/1 Test #1: adapt-test .......................Subprocess aborted***Exception:   0.17 sec

0% tests passed, 1 tests failed out of 1

Total Test time (real) =   0.17 sec

The following tests FAILED:
	  1 - adapt-test (Subprocess aborted)
Errors while running CTest
Output from these tests are in: /home/svh/asgard/build/Testing/Temporary/LastTest.log

@ckendrick
Copy link
Collaborator Author

rocm-device-libs should get loaded automatically when loading hip, you can use spack find --loaded to verify it is listed.

This seems to be a path issue, probably due to the abnormal directory structure from splitting rocm/hip into spack packages (instead of having everything installed to /opt/rocm). Setting the following environment variables should fix the rocm device lib path message. You might have to also purge the build directory and re-run cmake.

export HIP_DEVICE_LIB_PATH=/opt/spack/opt/spack/linux-centos7-zen/gcc-7.3.1/rocm-device-libs-4.2.0-elwhgtyne5wgof6m6mwrlconzda6epvi/amdgcn/bitcode
export DEVICE_LIB_PATH=${HIP_DEVICE_LIB_PATH}

@ckendrick
Copy link
Collaborator Author

Can you run rocminfo and hipconfig without any error? You may need to be added to the video group.

@ckendrick ckendrick force-pushed the feature/hip branch 2 times, most recently from 34c65d6 to 2de03a3 Compare October 12, 2021 16:15
Copy link
Collaborator

@quantumsteve quantumsteve left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple thoughts while reviewing these changes

CMakeLists.txt Outdated
message(STATUS "HIP Libraries: ${HIP_LIBRARIES}")

if(ASGARD_PLATFORM_NVCC)
find_package(CUDA 9.0 REQUIRED)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we eliminate the old find_package(CUDA)?

CMakeLists.txt Outdated
include_directories(SYSTEM ${HIP_INCLUDE_DIRS})
# assume this include path since HIP_INCLUDE_DIRS is not being set on nvidia platform
include_directories(SYSTEM "${HIP_PATH}/include")
include_directories(${HIPBLAS_INCLUDE_DIRS})
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use target_include_directories?


# set source file language properties
if(ASGARD_PLATFORM_AMD)
#set_source_files_properties( src/device/kronmult_cuda.cpp PROPERTIES LANGUAGE HIP ) # should work after cmake 3.21 release?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use this now that we require CMake 3.21?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I believe this should work now but I am not able to test it at the moment since the AMD machine is still down.

P tol_factor = 1e-17;
if constexpr (resrc == resource::device)
{
tol_factor = 1e-7;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😮

CMakeLists.txt Outdated
if(ASGARD_PLATFORM_AMD)
target_link_libraries(tensors PRIVATE hip::device)
elseif(ASGARD_PLATFORM_NVCC)
target_link_libraries(tensors PRIVATE ${CUDA_LIBRARIES})
Copy link
Collaborator

@quantumsteve quantumsteve Nov 18, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does HIP not take care of linking against CUDA?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't been able to get it to work automatically, but I might be missing something. enable_language(HIP) seems to cause issues on Nvidia. The closest I've gotten is on the kronmult PR, but that was setting the language to CUDA for each target which may not be the best solution.
The new changes I made is using hip_add_library and hip_add_executable (which may be worse than before?) but those still seem to be missing linking in the CUDA libraries.

CMakeLists.txt Outdated
if(ASGARD_PLATFORM_AMD)
target_link_libraries(lib_dispatch PRIVATE hip::device)
elseif(ASGARD_PLATFORM_NVCC)
target_link_libraries(lib_dispatch PRIVATE ${CUDA_LIBRARIES})
Copy link
Collaborator

@quantumsteve quantumsteve Nov 18, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does HIP not take care of linking against CUDA?

@@ -78,7 +81,7 @@ option (ASGARD_PROFILE_PERF "enable profiling support for using linux perf" "")
option (ASGARD_PROFILE_VALGRIND "enable profiling support for using valgrind" "")
option (ASGARD_GRAPHVIZ_PATH "optional location of bin/ containing dot executable" "")
option (ASGARD_IO_HIGHFIVE "Use the HighFive HDF5 header library for I/O" OFF)
option (ASGARD_USE_CUDA "Optional CUDA support for asgard" OFF)
option (ASGARD_USE_HIP "Optional HIP support for asgard" OFF)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using CUDA through the HIP API is not a great idea at the moment. The biggest issue comes from the use of math-libraries such as cuBlas and rocBlas, where they don't fully mirror or port the capabilities (especially true for sparse calls). Lesser problems (but problems never the less) come from availability and support across platforms, Nvidia based systems do not have universal support for HIP, also optimizations and performance.

HIP and CUDA can sit side by side in the code and have only one flipped on/off. All we need (usually) is to change the abstraction of memory allocation and data movement, as well as the kernels which seldom require any change, i.e., we can use the same kernels, just compile them differently.

@quantumsteve quantumsteve marked this pull request as draft January 6, 2023 15:06
@quantumsteve
Copy link
Collaborator

@ckendrick @mkstoyanov I recommend closing this as we're unlikely to continue using the earlier kronmult implementation.

@mkstoyanov mkstoyanov closed this Apr 19, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants