From XBeach to XBGPU #43

CyprienBosserelle · 2022-08-14T23:50:29Z

CyprienBosserelle
Aug 14, 2022
Maintainer

How was XBGPU translated

XBGPU is a rewrite of my favorite algorithm from XBeach in CUDA C. Below is a discussion of how I went about it.

CyprienBosserelle · 2022-08-14T23:55:21Z

CyprienBosserelle
Aug 14, 2022
Maintainer Author

Pseudo-code of translation from multiloop to CUDA kernel


CyprienBosserelle · 2022-08-14T23:58:22Z

CyprienBosserelle
Aug 14, 2022
Maintainer Author

Method

The model presented here is not intended as a substitute for XBeach; rather, it is intended to demonstrate the benefit and ease of running such a complex model on the GPU. Scientific computing on the GPU is usually presented for data-intensive problems, but the GPU was initially designed to operate on pixels and is therefore well suited to grid calculations. XBeach was designed to simulate erosion and inundation caused by extreme conditions such as those arising from hurricanes. The model has been used and tested for a wide range of conditions.

The code was derived from the original Fortran XBeach code. However, at this point only the basic features of XBeach have been transferred to GPU XBeach. These include the three main steps of XBeach: (1) the instationary wave solver, (2) the flow model, and (3) the morphological model (including the sediment transport schemes). In addition, the GPU code is a simplified version of the original XBeach code for use with regular grids. Support for curvilinear grids and other important features of XBeach may be added in a later version.


CyprienBosserelle · 2022-08-15T01:07:10Z

CyprienBosserelle
Aug 15, 2022
Maintainer Author

Parallelization strategy

To be used on the GPU with CUDA, the original code has to be broken into small functions (i.e. with a limited number of instructions) that can fit on the GPU at a given time. This results in each of the main steps being a list of functions running on the GPU. To make the most of the parallel cores on the GPU, the problem needs to be partitioned efficiently in a way that can be scaled to any GPU. In CUDA, problems are partitioned “into coarse sub-problems that can be solved independently in parallel by blocks of threads, and each sub-problem into finer pieces that can be solved cooperatively in parallel by all threads within the block” (Nvidia programming guide). In other words, the parallelization strategy consists of defining a block size that partitions the problem practically and efficiently. For a 2D depth-averaged model such as XBeach, an obvious and rather naive partitioning strategy consists of splitting the model domain into blocks of sub-domains. In XBGPU the sub-domain size was chosen as 16x16 cells. The size of the sub-domain block can affect the performance of the model and is discussed later.
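
The partitioning into 16x16 sub-domains described above could be expressed as a launch configuration along the following lines. This is a minimal sketch, not code taken from XBGPU: the names `BLOCK_X`, `BLOCK_Y`, `blocksFor` and `gridFor` are hypothetical, and the sketch assumes a regular grid of `nx` by `ny` cells.

```cuda
#include <cuda_runtime.h>

// Assumed sub-domain size: the text states 16x16 cells per block.
// The constant names are hypothetical.
constexpr int BLOCK_X = 16;
constexpr int BLOCK_Y = 16;

// Number of blocks needed to cover n cells along one axis.
// Rounds up so that a partially filled block covers the remainder.
constexpr unsigned int blocksFor(int n, int blockSize)
{
    return static_cast<unsigned int>((n + blockSize - 1) / blockSize);
}

// Grid of blocks covering an nx-by-ny model domain (hypothetical helper).
inline dim3 gridFor(int nx, int ny)
{
    return dim3(blocksFor(nx, BLOCK_X), blocksFor(ny, BLOCK_Y), 1);
}
```

With these assumptions, a domain that is 100 cells wide needs 7 blocks along that axis, i.e. 112 threads, of which 12 fall outside the domain. Any kernel launched with such a grid therefore has to check that its cell index is inside the domain before reading or writing.
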
In the CUDA programming language, kernels are defined much like functions, but are executed in parallel by each thread. Each kernel can only contain a limited number of instructions (dependent on the GPU class). For XBGPU we created a separate kernel for each function and/or each loop through the domain used in XBeach. This concept is illustrated by the pseudo-code shown above.
Within each kernel, the calculation is similar to that in the Fortran code loops, as illustrated above by the pseudo-code for the calculation of the water level slopes in both models. The main difference between the models is that the loops through the grid domain are implicit in the GPU code. In addition, particular attention was paid to limiting the number of if statements and to making the most of the shared memory on the GPU.
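
To make the idea of an implicit loop concrete, here is a hedged sketch of a water level slope calculation written both as an explicit double loop and as a CUDA kernel. It is not the XBGPU source. The names `zs`, `dzsdx`, `nx`, `ny`, `dx`, `slopeCpu`, `slopeKernel` and `slopeGpu` are assumptions, as are the row-major storage, the forward difference in x, and the zero slope imposed at the last column by clamping the neighbour index.

```cuda
#include <cuda_runtime.h>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr int BLOCK = 16;  // assumed 16x16 sub-domain size

// Explicit double loop over the grid, as a CPU code would write it.
void slopeCpu(const float* zs, float* dzsdx, int nx, int ny, float dx)
{
    for (int j = 0; j < ny; ++j) {
        for (int i = 0; i < nx; ++i) {
            int ip = (i + 1 < nx) ? i + 1 : nx - 1;  // clamped neighbour
            dzsdx[i + j * nx] = (zs[ip + j * nx] - zs[i + j * nx]) / dx;
        }
    }
}

// Same calculation as a kernel: the loops are implicit, because each
// thread derives its own (i, j) from its block and thread indices.
__global__ void slopeKernel(const float* zs, float* dzsdx,
                            int nx, int ny, float dx)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    int j = blockIdx.y * blockDim.y + threadIdx.y;
    if (i < nx && j < ny) {  // threads outside the domain do nothing
        int ip = (i + 1 < nx) ? i + 1 : nx - 1;  // clamped neighbour
        dzsdx[i + j * nx] = (zs[ip + j * nx] - zs[i + j * nx]) / dx;
    }
}

// Host wrapper: copy the input to the GPU, launch one thread per cell,
// and copy the result back. Error checking is omitted for brevity.
std::vector<float> slopeGpu(const std::vector<float>& zs,
                            int nx, int ny, float dx)
{
    assert(zs.size() == static_cast<std::size_t>(nx) * ny);
    std::size_t bytes = zs.size() * sizeof(float);
    float* d_zs = nullptr;
    float* d_out = nullptr;
    cudaMalloc(&d_zs, bytes);
    cudaMalloc(&d_out, bytes);
    cudaMemcpy(d_zs, zs.data(), bytes, cudaMemcpyHostToDevice);

    dim3 block(BLOCK, BLOCK);
    dim3 grid((nx + BLOCK - 1) / BLOCK, (ny + BLOCK - 1) / BLOCK);
    slopeKernel<<<grid, block>>>(d_zs, d_out, nx, ny, dx);

    std::vector<float> out(zs.size());
    cudaMemcpy(out.data(), d_out, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_zs);
    cudaFree(d_out);
    return out;
}
```

Clamping the neighbour index is one possible way to handle the domain edge without a separate boundary branch, since a simple conditional of this kind can be compiled to a select instruction. The sketch reads from global memory only and does not show the shared memory use mentioned above.
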
