
explicit time advance portability #349

Closed
bmcdanie opened this issue Dec 2, 2020 · 3 comments

bmcdanie commented Dec 2, 2020

Currently, the explicit time advance (the core of the code) runs via calls into the kronmult library: https://github.com/project-asgard/kronmult. Some kernels in https://github.com/project-asgard/asgard/blob/develop/src/device/kronmult_cuda.cpp set up the data for those library calls.

Both the main kronmult code and the setup kernels are written as CUDA kernels, with a fallback to OpenMP. To enhance portability, we could try a number of higher-level approaches:
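For context, the OpenMP fallback pattern looks roughly like the sketch below: a batch of small, independent dense matrix-vector products (the building block of a Kronecker-product apply). The function name, signature, and column-major layout here are illustrative assumptions, not the actual kronmult API.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sketch: apply a batch of independent n x n matrix-vector
// products, y[b] = A[b] * x[b]. Matrices are stored column-major.
void batched_gemv(int n, int num_batch,
                  const std::vector<const double*>& A, // each n*n, column-major
                  const std::vector<const double*>& x, // each length n
                  const std::vector<double*>& y)       // each length n, output
{
  // Each batch entry is independent, so the outer loop parallelizes
  // trivially; the CUDA version maps the same work onto thread blocks.
  #pragma omp parallel for
  for (int b = 0; b < num_batch; ++b)
    for (int i = 0; i < n; ++i)
    {
      double sum = 0.0;
      for (int j = 0; j < n; ++j)
        sum += A[b][i + j * n] * x[b][j];
      y[b][i] = sum;
    }
}
```

Because the parallelism lives in one explicit loop nest, each portability option above amounts to re-expressing this loop in a different programming model.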

NVIDIA HPC SDK: https://developer.nvidia.com/hpc-sdk allows the C++ parallel algorithms (https://en.cppreference.com/w/cpp/experimental/parallelism) to run on the accelerator. Our code may not fit this paradigm, but it may be worth exploring.
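As a minimal sketch of that paradigm (the function and names are illustrative, not from the codebase): an element-wise update written as `std::transform`. When compiled with the HPC SDK's `nvc++ -stdpar` and given the `std::execution::par` policy, the same loop body can be offloaded to the GPU.

```cpp
#include <algorithm>
#include <vector>
// #include <execution>  // needed when passing std::execution::par

// Hypothetical example: out = a * x + y expressed as a standard algorithm
// rather than an explicit CUDA kernel.
std::vector<double> axpy(double a, const std::vector<double>& x,
                         const std::vector<double>& y)
{
  std::vector<double> out(x.size());
  // Portable serial form shown here; with nvc++ -stdpar, inserting the
  // std::execution::par policy as the first argument enables offload:
  //   std::transform(std::execution::par, x.begin(), x.end(), ...);
  std::transform(x.begin(), x.end(), y.begin(), out.begin(),
                 [a](double xi, double yi) { return a * xi + yi; });
  return out;
}
```

The open question for us is whether kronmult's batched, shared-memory-heavy kernels can be recast as standard algorithms at all, which is why this is listed as exploratory.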

Hipify the kernels for AMD GPUs: https://rocmdocs.amd.com/en/latest/Programming_Guides/HIP-porting-guide.html.

Others? Kokkos, OpenCL, etc.

@quantumsteve

@ckendrick Is this part of #400 and project-asgard/kronmult#13?

@ckendrick

@quantumsteve Yes, it is.

@mkstoyanov

A bit outdated: kronmult was completely revamped and will be revamped again.

We will make it portable when the need arises.
