Currently, the explicit time advance (the core of the code) runs via calls into the kronmult library: https://github.com/project-asgard/kronmult. Some kernels in https://github.com/project-asgard/asgard/blob/develop/src/device/kronmult_cuda.cpp are used to set up for those library calls.
Both the main kronmult code and the setup kernels are written as CUDA kernels, with an OpenMP fallback. To enhance portability, we could try a number of higher-level approaches:
NVIDIA HPC SDK: https://developer.nvidia.com/hpc-sdk allows the C++ parallel algorithms (https://en.cppreference.com/w/cpp/experimental/parallelism) to run on the accelerator. Our code may not fit this paradigm, but it may be worth exploring; a sketch of the style follows.
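As a rough illustration (not taken from our code), here is an axpy-style update written with the standard parallel algorithms. The idea is that compiling with `nvc++ -stdpar=gpu` lets the HPC SDK offload the `par_unseq` loop to the device, while other compilers fall back to a parallel CPU implementation; whether kronmult's batched small-matrix work maps cleanly onto this is exactly the open question.

```cpp
#include <algorithm>
#include <execution>
#include <vector>

// y = alpha*x + y, expressed as a C++17 parallel algorithm.
// With nvc++ -stdpar=gpu the vectors are placed in managed memory
// and the transform can run on the accelerator.
void scaled_add(std::vector<double> const &x, std::vector<double> &y,
                double const alpha)
{
  std::transform(std::execution::par_unseq, x.begin(), x.end(), y.begin(),
                 y.begin(), [alpha](double const xi, double const yi) {
                   return alpha * xi + yi;
                 });
}
```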
HIPify the kernels: https://rocmdocs.amd.com/en/latest/Programming_Guides/HIP-porting-guide.html. A sketch of what a ported kernel looks like follows.
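For a sense of scale, here is a trivial kernel written against the HIP API, roughly what hipify would produce from the CUDA original. The kernel and function names are hypothetical, not from kronmult_cuda.cpp. The kernel body and launch syntax are the same as CUDA's; the runtime calls are renamed (cudaMalloc -> hipMalloc, etc.), so for kernels like ours the port should be largely mechanical.

```cpp
#include <hip/hip_runtime.h>

// x[i] *= alpha for i in [0, n); identical to the CUDA version
// except for the hip* runtime calls below.
__global__ void scale(double *x, double const alpha, int const n)
{
  int const i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n)
    x[i] *= alpha;
}

void scale_on_device(double *x_host, double const alpha, int const n)
{
  double *x_dev;
  hipMalloc(&x_dev, n * sizeof(double));
  hipMemcpy(x_dev, x_host, n * sizeof(double), hipMemcpyHostToDevice);
  scale<<<(n + 255) / 256, 256>>>(x_dev, alpha, n);
  hipMemcpy(x_host, x_dev, n * sizeof(double), hipMemcpyDeviceToHost);
  hipFree(x_dev);
}
```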
Others? Kokkos, OpenCL, etc. (a Kokkos sketch is below).
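For comparison, the same scale operation in Kokkos, again purely illustrative: one source compiles against the CUDA, HIP, or OpenMP back end chosen at build time, which would cover both our GPU path and our current OpenMP fallback.

```cpp
#include <Kokkos_Core.hpp>

int main(int argc, char *argv[])
{
  Kokkos::initialize(argc, argv);
  {
    int const n        = 1 << 20;
    double const alpha = 2.0;

    // Allocated in the default memory space of the active back end.
    Kokkos::View<double *> x("x", n);
    Kokkos::deep_copy(x, 1.0);

    // Runs on device for CUDA/HIP back ends, on threads for OpenMP.
    Kokkos::parallel_for(
        "scale", n, KOKKOS_LAMBDA(int const i) { x(i) *= alpha; });
    Kokkos::fence();
  }
  Kokkos::finalize();
  return 0;
}
```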