HIP version of asgard #400

ckendrick · 2021-07-23T04:23:24Z

Proposed changes

Note: Merge kronmult PR before this

This replaces CUDA calls in Asgard with HIP equivalents and adjusts CMake for building with HIP.

The HIP version of the underlying Kronmult library needs to be used to fully utilize HIP for AMD platforms. CMake 3.21 should allow for a lot of CMake simplifications.

Building on fusionmi50:

spack load hipblas@develop

cmake -DCMAKE_CXX_COMPILER=clang++ -DASGARD_USE_HIP=ON -DGPU_ARCH=906 -DBUILD_REPO_KRONMULT=https://github.com/ckendrick/kronmult.git -DBUILD_TAG_KRONMULT=9a8d70f -Dhip-lang_DIR=${HIP_PATH}/lib/cmake/hip-lang/ -DCMAKE_CXX_FLAGS=-I/usr/include/openblas/ ../

Building on fusiont5:

spack load cuda@10.2.89%gcc@7.4.0
export CUDA_PATH=${CUDA_HOME}
export HIP_PATH=/opt/rocm-4.2.0/hip/
export HIP_PLATFORM=nvidia
export HIP_COMPILER=nvcc
export HIP_RUNTIME=cuda

cmake -DCMAKE_CUDA_COMPILER=${CUDA_PATH}/bin/nvcc -DASGARD_USE_HIP=ON ../

Building on fusiont6:

spack load cuda@11.3.0
export CUDA_PATH=${CUDA_HOME}

cmake -DASGARD_USE_HIP=ON ../

What type(s) of changes does this code introduce?

Put an x in the boxes that apply.

Does this introduce a breaking change?

Yes
No

What systems has this change been tested on?

fusiont5
fusiont6
fusionmi50
Ubuntu20.04 with Nvidia GPUs (CUDA 10.2 - 11.4)

Checklist

Put an x in the boxes that apply. You can also fill these out after creating
the PR. If you're unsure about any of them, don't hesitate to ask. This is
simply a reminder of what we are going to look for before merging your code.

this PR is up to date with the current state of 'develop'
code added or changed in the PR has been clang-formatted
this PR adds tests to cover any new code, or to catch a bug that is being fixed
documentation has been added (if appropriate)

quantumsteve · 2021-08-31T21:25:02Z

@ckendrick What changes does the build need so that it'll run with -DASGARD_USE_HIP=ON?

quantumsteve · 2021-08-31T21:50:33Z

I tried following your directions on fusionmi50, but got the following error. Do I need to load a compiler first?

[svh@fusionmi50 ~]$ source /opt/spack/share/spack/setup-env.sh
[svh@fusionmi50 ~]$ spack load --first hip@4.2.0
==> Error: No compilers for operating system centos7 satisfy spec gcc@7.3.1

ckendrick · 2021-08-31T22:18:49Z

I tried following your directions on fusionmi50, but got the following error. Do I need to load a compiler first?
[svh@fusionmi50 ~]$ source /opt/spack/share/spack/setup-env.sh
[svh@fusionmi50 ~]$ spack load --first hip@4.2.0
==> Error: No compilers for operating system centos7 satisfy spec gcc@7.3.1

Try running spack compiler find, which should automatically detect the installed compilers. Usually you only have to do this once with a new spack setup. If it works, running spack compiler list should then show the following:

==> Available compilers
-- clang centos7-x86_64 -----------------------------------------
clang@12.0.0

-- gcc centos7-x86_64 -------------------------------------------
gcc@7.3.1  gcc@4.8.5

If not, I can share the configuration I am using, which goes in ~/.spack/linux/compilers.yaml.

quantumsteve · 2021-09-01T14:08:49Z

Looks like packages aren't being shared within our common spack setup 😕

[svh@fusionmi50 ~]$ source /opt/spack/share/spack/setup-env.sh
[svh@fusionmi50 ~]$ spack compiler list
==> Available compilers
-- gcc centos7-x86_64 -------------------------------------------
gcc@4.8.5

ckendrick · 2021-09-01T14:23:00Z

I think running source /opt/rh/devtoolset-7/enable before setting up the spack environment should make the newer version of gcc available.

quantumsteve · 2021-09-01T14:33:15Z

[svh@fusionmi50 ~]$ source /opt/rh/devtoolset-7/enable
[svh@fusionmi50 ~]$ source /opt/spack/share/spack/setup-env.sh 
[svh@fusionmi50 ~]$ spack compiler list
==> Available compilers
-- gcc centos7-x86_64 -------------------------------------------
gcc@4.8.5

ckendrick · 2021-09-01T14:36:31Z

Sorry, I forgot to mention that spack compiler find should be re-run as well.

quantumsteve · 2021-09-01T14:52:54Z

I also needed to run spack load cmake@3.21.0

quantumsteve · 2021-09-01T15:20:10Z

The following tests FAILED:
	  1 - adapt-test (Subprocess aborted)
	  2 - basis-test (Subprocess aborted)
	  3 - batch-test (Subprocess aborted)
	  4 - boundary_conditions-test (Subprocess aborted)
	  5 - coefficients-test (Subprocess aborted)
	  6 - distribution-test (Subprocess aborted)
	  7 - elements-test (Subprocess aborted)
	  8 - fast_math-test (Subprocess aborted)
	  9 - kronmult-test (Subprocess aborted)
	 10 - lib_dispatch-test (Subprocess aborted)
	 11 - matlab_utilities-test (Subprocess aborted)
	 12 - pde-test (Subprocess aborted)
	 13 - permutations-test (Subprocess aborted)
	 15 - quadrature-test (Subprocess aborted)
	 16 - solver-test (Subprocess aborted)
	 17 - tensors-test (Subprocess aborted)
	 18 - time_advance-test (Subprocess aborted)
	 20 - transformations-test (Subprocess aborted)
	 21 - kronmult_cuda-test (Subprocess aborted)

[svh@fusionmi50 build]$ ctest -R adapt-test --verbose
UpdateCTestConfiguration  from :/home/svh/asgard/build/DartConfiguration.tcl
UpdateCTestConfiguration  from :/home/svh/asgard/build/DartConfiguration.tcl
Test project /home/svh/asgard/build
Constructing a list of tests
Done constructing a list of tests
Updating test list for fixtures
Added 0 tests to meet fixture requirements
Checking test dependency graph...
Checking test dependency graph end
test 1
    Start 1: adapt-test

1: Test command: /home/svh/asgard/build/adapt-tests
1: Test timeout computed to be: 10000000
1: adapt-tests: /home/svh/asgard/src/lib_dispatch.cpp:39: device_handler::device_handler(): Assertion `success == HIPBLAS_STATUS_SUCCESS' failed.
1/1 Test #1: adapt-test .......................Subprocess aborted***Exception:   0.17 sec

0% tests passed, 1 tests failed out of 1

Total Test time (real) =   0.17 sec

The following tests FAILED:
	  1 - adapt-test (Subprocess aborted)
Errors while running CTest
Output from these tests are in: /home/svh/asgard/build/Testing/Temporary/LastTest.log

ckendrick · 2021-09-01T15:31:52Z

rocm-device-libs should get loaded automatically when loading hip; you can use spack find --loaded to verify that it is listed.

This seems to be a path issue, probably due to the abnormal directory structure from splitting rocm/hip into spack packages (instead of having everything installed to /opt/rocm). Setting the following environment variables should fix the rocm device lib path message. You might have to also purge the build directory and re-run cmake.

export HIP_DEVICE_LIB_PATH=/opt/spack/opt/spack/linux-centos7-zen/gcc-7.3.1/rocm-device-libs-4.2.0-elwhgtyne5wgof6m6mwrlconzda6epvi/amdgcn/bitcode
export DEVICE_LIB_PATH=${HIP_DEVICE_LIB_PATH}

ckendrick · 2021-09-01T15:48:29Z

Can you run rocminfo and hipconfig without any error? You may need to be added to the video group.

quantumsteve

Couple thoughts while reviewing these changes

quantumsteve · 2021-11-17T21:49:56Z

CMakeLists.txt

+  message(STATUS "HIP Libraries: ${HIP_LIBRARIES}")
+
+  if(ASGARD_PLATFORM_NVCC)
+    find_package(CUDA 9.0 REQUIRED)


Can we eliminate the old find_package(CUDA)?

quantumsteve · 2021-11-17T21:51:34Z

CMakeLists.txt

+  include_directories(SYSTEM ${HIP_INCLUDE_DIRS})
+  # assume this include path since HIP_INCLUDE_DIRS is not being set on nvidia platform
+  include_directories(SYSTEM "${HIP_PATH}/include")
+  include_directories(${HIPBLAS_INCLUDE_DIRS})


Can we use target_include_directories?

quantumsteve · 2021-11-17T21:52:31Z

CMakeLists.txt

+
+  # set source file language properties
+  if(ASGARD_PLATFORM_AMD)
+    #set_source_files_properties( src/device/kronmult_cuda.cpp PROPERTIES LANGUAGE HIP ) # should work after cmake 3.21 release?


Can we use this now that we require CMake 3.21?

ckendrick · 2021-11-23

Yes, I believe this should work now, but I am not able to test it at the moment since the AMD machine is still down.

quantumsteve · 2021-11-18T17:47:42Z

src/batch_tests.cpp

+  P tol_factor = 1e-17;
+  if constexpr (resrc == resource::device)
+  {
+    tol_factor = 1e-7;
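
For readers unfamiliar with the pattern in this hunk: the tolerance is picked at compile time from the resource template parameter, so device runs get a looser bound than host runs. Below is a minimal, self-contained sketch of that idea. The resource enum is a stand-in for the project's own, and pick_tolerance, rmse, and within_tolerance are hypothetical helpers that assume the factor is used directly as a bound on the RMS error; the comparison helpers in the actual test suite may apply it differently.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Stand-in for the project's resource enum (assumption for this sketch).
enum class resource { host, device };

// Hypothetical helper: choose the tolerance at compile time,
// tight for host results and looser for device results.
template<typename P, resource resrc>
constexpr P pick_tolerance()
{
  if constexpr (resrc == resource::device)
    return static_cast<P>(1e-7);
  else
    return static_cast<P>(1e-17);
}

// Hypothetical helper: root-mean-square error between two
// equally sized vectors.
template<typename P>
P rmse(std::vector<P> const &a, std::vector<P> const &b)
{
  assert(a.size() == b.size());
  if (a.empty())
    return P{0};
  P sum = 0;
  for (std::size_t i = 0; i < a.size(); ++i)
  {
    P const diff = a[i] - b[i];
    sum += diff * diff;
  }
  return std::sqrt(sum / static_cast<P>(a.size()));
}

// Hypothetical helper: true when the test vector matches the gold
// vector to within the tolerance chosen for the given resource.
template<typename P, resource resrc>
bool within_tolerance(std::vector<P> const &gold, std::vector<P> const &test)
{
  return gold.size() == test.size() &&
         rmse(gold, test) <= pick_tolerance<P, resrc>();
}
```

With this shape, a result that differs from the gold data by roughly 1e-9 in one entry would be accepted as a device result and rejected as a host result, which is the asymmetry the hunk introduces.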


quantumsteve · 2021-11-18T17:49:07Z

CMakeLists.txt

+  if(ASGARD_PLATFORM_AMD)
+    target_link_libraries(tensors PRIVATE hip::device)
+  elseif(ASGARD_PLATFORM_NVCC)
+    target_link_libraries(tensors PRIVATE ${CUDA_LIBRARIES})


Does HIP not take care of linking against CUDA?

ckendrick · 2021-11-23

I haven't been able to get it to work automatically, but I might be missing something. enable_language(HIP) seems to cause issues on Nvidia. The closest I've gotten is on the kronmult PR, but that set the language to CUDA for each target, which may not be the best solution.
The new changes I made use hip_add_library and hip_add_executable (which may be worse than before?), but those still don't seem to link in the CUDA libraries.

quantumsteve · 2021-11-18T17:49:13Z

CMakeLists.txt

+  if(ASGARD_PLATFORM_AMD)
+    target_link_libraries(lib_dispatch PRIVATE hip::device)
+  elseif(ASGARD_PLATFORM_NVCC)
+    target_link_libraries(lib_dispatch PRIVATE ${CUDA_LIBRARIES})


Does HIP not take care of linking against CUDA?

mkstoyanov · 2022-09-27T16:57:15Z

CMakeLists.txt

@@ -78,7 +81,7 @@ option (ASGARD_PROFILE_PERF "enable profiling support for using linux perf" "")
 option (ASGARD_PROFILE_VALGRIND "enable profiling support for using valgrind" "")
 option (ASGARD_GRAPHVIZ_PATH "optional location of bin/ containing dot executable" "")
 option (ASGARD_IO_HIGHFIVE "Use the HighFive HDF5 header library for I/O" OFF)
-option (ASGARD_USE_CUDA "Optional CUDA support for asgard" OFF)
+option (ASGARD_USE_HIP "Optional HIP support for asgard" OFF)


Using CUDA through the HIP API is not a great idea at the moment. The biggest issue comes from the math libraries such as cuBLAS and rocBLAS, which don't fully mirror or port each other's capabilities (especially true for sparse calls). Lesser problems (but problems nevertheless) are availability and support across platforms, since Nvidia-based systems do not have universal support for HIP, as well as optimizations and performance.

HIP and CUDA can sit side by side in the code and have only one flipped on/off. All we need (usually) is to change the abstraction of memory allocation and data movement, as well as the kernels which seldom require any change, i.e., we can use the same kernels, just compile them differently.
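
A minimal sketch of what such an abstraction could look like, assuming a compile-time switch. The macro names SKETCH_USE_HIP and SKETCH_USE_CUDA and every function in the sketch namespace are hypothetical and not existing ASGarD code; only the hipMalloc/hipMemcpy/hipFree and cudaMalloc/cudaMemcpy/cudaFree runtime calls are real. With neither macro defined, the wrappers fall back to plain host memory so they can be exercised without a GPU.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <vector>

// Hypothetical build switches: define at most one of them.
#if defined(SKETCH_USE_HIP)
#include <hip/hip_runtime.h>
#elif defined(SKETCH_USE_CUDA)
#include <cuda_runtime.h>
#endif

namespace sketch
{
// Allocate num_bytes on the device; returns nullptr on failure.
inline void *device_alloc(std::size_t num_bytes)
{
  void *ptr = nullptr;
#if defined(SKETCH_USE_HIP)
  if (hipMalloc(&ptr, num_bytes) != hipSuccess)
    return nullptr;
#elif defined(SKETCH_USE_CUDA)
  if (cudaMalloc(&ptr, num_bytes) != cudaSuccess)
    return nullptr;
#else
  ptr = std::malloc(num_bytes); // host fallback
#endif
  return ptr;
}

inline void device_free(void *ptr)
{
#if defined(SKETCH_USE_HIP)
  (void)hipFree(ptr);
#elif defined(SKETCH_USE_CUDA)
  (void)cudaFree(ptr);
#else
  std::free(ptr);
#endif
}

// Copy host -> device; returns true on success.
inline bool copy_to_device(void *dst, void const *src, std::size_t num_bytes)
{
#if defined(SKETCH_USE_HIP)
  return hipMemcpy(dst, src, num_bytes, hipMemcpyHostToDevice) == hipSuccess;
#elif defined(SKETCH_USE_CUDA)
  return cudaMemcpy(dst, src, num_bytes, cudaMemcpyHostToDevice) == cudaSuccess;
#else
  std::memcpy(dst, src, num_bytes);
  return true;
#endif
}

// Copy device -> host; returns true on success.
inline bool copy_to_host(void *dst, void const *src, std::size_t num_bytes)
{
#if defined(SKETCH_USE_HIP)
  return hipMemcpy(dst, src, num_bytes, hipMemcpyDeviceToHost) == hipSuccess;
#elif defined(SKETCH_USE_CUDA)
  return cudaMemcpy(dst, src, num_bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
#else
  std::memcpy(dst, src, num_bytes);
  return true;
#endif
}

// Example caller: send a vector to the device and read it back.
inline std::vector<double> round_trip(std::vector<double> const &host_in)
{
  std::vector<double> host_out(host_in.size());
  std::size_t const num_bytes = host_in.size() * sizeof(double);
  if (num_bytes == 0)
    return host_out; // nothing to copy for an empty input
  void *dev = device_alloc(num_bytes);
  assert(dev != nullptr);
  bool ok = copy_to_device(dev, host_in.data(), num_bytes);
  ok = copy_to_host(host_out.data(), dev, num_bytes) && ok;
  device_free(dev);
  assert(ok);
  (void)ok; // silence unused-variable warnings when NDEBUG is set
  return host_out;
}
} // namespace sketch
```

Calling code such as round_trip only touches the wrapper functions, so choosing a backend becomes a build option rather than a source change, and each backend can still call its native math library directly.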

quantumsteve · 2023-04-19T13:48:30Z

@ckendrick @mkstoyanov I recommend closing this as we're unlikely to continue using the earlier kronmult implementation.

ckendrick force-pushed the feature/hip branch from bb47571 to 0d3e6c1 Compare August 17, 2021 16:19

ckendrick added 18 commits October 12, 2021 11:41

Initial hipify from perl script

79ab7a9

WIP HIP CMake configuration

f70d02b

Hipifying some missed parts

0e65742

Updating formatting for hipified files

3daf645

Minor cmake adjustments for hip

e2491dc

Add initial hip platform configuration

4b94cc8

Update cmake cuda options and add hipblas

e816103

CMake adjustments and initial platform options

0d98648

Add in nvidia platform update for hip 4.2

0a78c9e

Update tensor linking and missed cublas calls

b397160

Update hip compiler option variables, hip_add_library for kronmult

15a9d35

Update nvidia arch flag for kronmult_cuda

b0a81a1

Set hip_clang_path in cmake to find for amd platform

0889a35

Set the hip_clang_include_path needed for spack installs

3d2f466

Update hip build for amd platform

119cbb2

Add hcc platform def for backwards compatability

b9002c6

Add temporary workaround for kronmult hip interface

d08af1e

Temporarily suppress clang compiler warnings

3dc539d

ckendrick added 11 commits October 12, 2021 11:41

Fix shared flags for hip platform, amd gpu target archs

ef16b56

Move kronmult source properties before add_lib

a745fbe

Add check for empty matrix to avoid hipmemcpy2d error

e591cd9

Fix hardcoded amd clang version and update target flags

770a040

Modify include dirs to hide deprecated cuda messages

c318c89

Update device test tol for batch tests

cf7c04a

Fix clang formatting issues

d5f34ca

Re-enable kronmult for amd platforms

8c69e57

Rename hip platform from hcc to amd

49f45dc

Adjust lib dispatch device test tol for amd

8cd9e45

Clean up cmake, add hipblas version check for amd

2de03a3

ckendrick force-pushed the feature/hip branch 2 times, most recently from 34c65d6 to 2de03a3 Compare October 12, 2021 16:15

ckendrick added 6 commits October 12, 2021 13:48

Pass gpu arch to kronmult, set amd flags only on amd

c5ed1fd

Change kronmult fetch content to after configuration

c79997b

Fix kron linking on nvidia

91b0626

Consolidate gpu arch flags and reorder hip flags

acf6305

Add new register project arg when building openblas

37ebc25

Decrease batched gemv test tol for amd gpu

8909eab

ckendrick requested review from quantumsteve and cianciosa November 9, 2021 17:49

quantumsteve requested changes Nov 18, 2021

Change CMake HIP linking

ef3c11b

mkstoyanov reviewed Sep 27, 2022

quantumsteve mentioned this pull request Oct 17, 2022

explicit time advance portability #349

Closed

quantumsteve marked this pull request as draft January 6, 2023 15:06

mkstoyanov closed this Apr 19, 2023
