From XBeach to XBGPU #43
Replies: 3 comments
Pseudo-code for the translation from a multi-loop to a CUDA kernel
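As a minimal sketch of what such a translation can look like, assume a simple per-cell update over a regular `nx` by `ny` grid. The names `updateCPU`, `updateGPU`, `runUpdateGPU`, `eta` and `detadt` are hypothetical and are not taken from the XBGPU source. The loop body becomes the kernel, and the two loop counters are replaced by the thread's global index.

```cuda
#include <cuda_runtime.h>

// CPU reference: the nested loop over the grid, as a loop-based code would
// write it. Hypothetical update: eta += dt * detadt in every cell.
void updateCPU(float *eta, const float *detadt, float dt, int nx, int ny)
{
    for (int j = 0; j < ny; j++) {
        for (int i = 0; i < nx; i++) {
            int idx = i + j * nx;
            eta[idx] += dt * detadt[idx];
        }
    }
}

// GPU version: one thread per cell. The loops disappear; (i, j) comes from
// the thread's position in the grid of blocks.
__global__ void updateGPU(float *eta, const float *detadt, float dt, int nx, int ny)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    int j = blockIdx.y * blockDim.y + threadIdx.y;
    if (i < nx && j < ny) {  // guard: threads outside the domain do nothing
        int idx = i + j * nx;
        eta[idx] += dt * detadt[idx];
    }
}

// Host wrapper (illustrative): copy in, launch, copy out. Returns 0 on success.
int runUpdateGPU(float *eta, const float *detadt, float dt, int nx, int ny)
{
    size_t bytes = (size_t)nx * ny * sizeof(float);
    float *d_eta = nullptr, *d_detadt = nullptr;
    if (cudaMalloc(&d_eta, bytes) != cudaSuccess) return 1;
    if (cudaMalloc(&d_detadt, bytes) != cudaSuccess) { cudaFree(d_eta); return 1; }
    cudaMemcpy(d_eta, eta, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_detadt, detadt, bytes, cudaMemcpyHostToDevice);

    dim3 block(16, 16);  // threads per block
    dim3 grid((nx + block.x - 1) / block.x, (ny + block.y - 1) / block.y);
    updateGPU<<<grid, block>>>(d_eta, d_detadt, dt, nx, ny);

    // The blocking copy also reports any error raised by the kernel launch.
    cudaError_t err = cudaMemcpy(eta, d_eta, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_eta);
    cudaFree(d_detadt);
    return err == cudaSuccess ? 0 : 1;
}
```

A real model would keep the arrays on the GPU between time steps rather than copying them on every call. The copies are here only to keep the sketch self-contained.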
Method

The model presented here is not intended as a substitute for XBeach; rather, it is intended to demonstrate the benefit and ease of running such a complex model on the GPU. Scientific computing on the GPU is usually presented for data-intensive problems, but the GPU was originally designed to operate on pixels and is therefore well suited to grid calculations. XBeach was designed to simulate erosion and inundation caused by extreme conditions such as those arising from hurricanes. The model has been used and tested for a wide range of conditions. The GPU code was derived from the original Fortran XBeach code. However, at this point only the basic features of XBeach have been transferred to GPU XBeach. These include the three main steps of XBeach: (1) the instationary wave solver, (2) the flow model, and (3) the morphological model (including the sediment transport schemes). In addition, the GPU code is a simplified version of the original XBeach code for use with regular grids. Support for curvilinear grids and other important features of XBeach may be added in a later version.
Parallelization strategy

To be used on the GPU with CUDA, the original code has to be broken into small functions (i.e. with a limited number of instructions) that can fit on the GPU at a given time. This results in each of the main steps becoming a list of functions running on the GPU. To make the most of the parallel cores on the GPU, the problem needs to be partitioned efficiently in a way that scales to any GPU. In CUDA, problems are “partitioned into coarse sub-problems that can be solved independently in parallel by blocks of threads, and each sub-problem into finer pieces that can be solved cooperatively in parallel by all threads within the block” (Nvidia programming guide). In other words, the parallelization strategy consists in choosing a block size that partitions the problem practically and efficiently. For a 2D depth-averaged model such as XBeach, an obvious and rather naive partitioning strategy consists in splitting the model domain into blocks of sub-domains. In XBGPU the sub-domain size was chosen as 16x16 cells. The size of the sub-domain block can affect the performance of the model and is discussed later.
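The partitioning into 16x16 sub-domains can be illustrated with a small sketch. It assumes a row-major grid of `nx` by `ny` cells; `BLOCK_SIZE`, `numBlocks`, `tagSubdomain` and `tagDomain` are hypothetical names used only for this example. Each thread writes the index of the block that owns its cell, which makes the sub-domain layout visible on the host.

```cuda
#include <cuda_runtime.h>

#define BLOCK_SIZE 16  // sub-domain edge length in cells (16x16, as in the text)

// Number of blocks needed to cover n cells; rounds up so edge cells are covered.
int numBlocks(int n)
{
    return (n + BLOCK_SIZE - 1) / BLOCK_SIZE;
}

// Each thread records which sub-domain (block) owns its cell.
__global__ void tagSubdomain(int *owner, int nx, int ny)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    int j = blockIdx.y * blockDim.y + threadIdx.y;
    // The last row or column of blocks may overhang the domain: skip those threads.
    if (i < nx && j < ny) {
        owner[i + j * nx] = blockIdx.x + blockIdx.y * gridDim.x;
    }
}

// Fills owner[] (host array of nx*ny ints). Returns 0 on success.
int tagDomain(int *owner, int nx, int ny)
{
    size_t bytes = (size_t)nx * ny * sizeof(int);
    int *d_owner = nullptr;
    if (cudaMalloc(&d_owner, bytes) != cudaSuccess) return 1;

    dim3 block(BLOCK_SIZE, BLOCK_SIZE);
    dim3 grid(numBlocks(nx), numBlocks(ny));
    tagSubdomain<<<grid, block>>>(d_owner, nx, ny);

    // The blocking copy also reports any error raised by the kernel launch.
    cudaError_t err = cudaMemcpy(owner, d_owner, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_owner);
    return err == cudaSuccess ? 0 : 1;
}
```

For a 40x20 domain this gives 3 x 2 = 6 blocks. The rightmost column of blocks covers only 8 columns of cells and the top row of blocks only 4 rows, which is why the bounds check in the kernel is needed whenever the domain size is not a multiple of 16.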
How was XBGPU translated
XBGPU is a rewrite of my favorite algorithm from XBeach in CUDA C. Below is a discussion of how I went about it.