Notes and scripts for AMD profiling of dycore #1047
base: update_dace_version
model/atmosphere/dycore/tests/dycore/integration_tests/test_benchmark_solve_nonhydro.py
amd_scripts/install_icon4py_venv.sh
    fi

    # Install icon4py, gt4py, DaCe and other basic dependencies using uv
    uv sync --extra all --python $(which python3.12)
I would not install all the extras. Maybe we could properly add cupy-rocm7 as an extra to avoid line 29; I can work on that.
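A minimal sketch of what a more selective install could look like, assuming the needed extras were defined in `pyproject.toml`. The extra names `dace` and `rocm7` and the helper `build_uv_sync_cmd` are hypothetical placeholders, not names taken from this PR; only `uv sync`, `--extra` and `--python` are existing uv options.

```shell
# Hypothetical helper: assemble a `uv sync` command that requests only the
# extras passed as arguments, instead of `--extra all`.
# The extra names used below are placeholders, not confirmed icon4py extras.
build_uv_sync_cmd() {
    cmd="uv sync --python 3.12"
    for extra in "$@"; do
        # `--extra` may be repeated, once per requested extra
        cmd="$cmd --extra $extra"
    done
    printf '%s\n' "$cmd"
}

# Print the command instead of running it, so the sketch has no side effects.
build_uv_sync_cmd dace rocm7
```

Printing the command rather than executing it keeps the sketch safe to try outside the target machine; in the install script the printed command would simply be run directly.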
…fix' into amd_profiling
…osure_vars to fix the caching of the dycore programs
amd_scripts/benchmark_dycore.sh
    --benchmark-warmup=on \
    --benchmark-warmup-iterations=30 \
    --backend=dace_gpu \
    --grid=icon_benchmark_regional \
Suggested change:
    - --grid=icon_benchmark_regional \
    + --grid=icon_benchmark_global \
Since global is our main target for now, maybe we can switch to that.
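One way to keep both grids available without editing the script each time is to read the grid name from the environment and default to the global grid. This is only a sketch: the `GRID` variable and the `benchmark_args` helper are hypothetical, while the option values are the ones shown in the diff above.

```shell
# Hypothetical: take the grid from the environment, defaulting to the
# global grid, which is the main target for now.
GRID="${GRID:-icon_benchmark_global}"

# Emit the benchmark options, one per line, for the grid given as $1.
benchmark_args() {
    printf '%s\n' \
        "--benchmark-warmup=on" \
        "--benchmark-warmup-iterations=30" \
        "--backend=dace_gpu" \
        "--grid=$1"
}

# Show the options that would be passed to the benchmark run.
benchmark_args "$GRID"
```

With this in place, a regional run could be requested by setting `GRID=icon_benchmark_regional` when submitting the job, assuming the submitting environment is propagated to the job.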
Mandatory Tests
Please make sure you run these tests via comment before you merge!
Optional Tests
To run benchmarks you can use:
To run tests and benchmarks with the DaCe backend you can use:
To run test levels ignored by the default test suite (mostly simple datatest for static fields computations) you can use:
For more detailed information please look at CI in the EXCLAIM universe.
This Pull Request includes scripts to benchmark and profile the dycore granule as well as one of its most time-consuming GT4Py Programs, the `vertically_implicit_solver_at_predictor_step`. We'll keep this PR open for interaction and keep it up to date with improvements.

The PR includes the following important files:
- `AMD_INTRODUCTION.md`: Includes (hopefully) all the information necessary to run the benchmark scripts for the dycore granule and the `vertically_implicit_solver_at_predictor_step`, as well as an introduction to icon4py, GT4Py and DaCe. There are also some suggestions on how to view and understand the generated code.
- `amd_scripts/install_icon4py_venv.sh`: Script to install icon4py along with all the dependencies necessary to run the profilers.
- `amd_scripts/benchmark_dycore.sh`: Sbatch script for Beverin to run and time the GT4Py Programs of the dycore.
- `amd_scripts/benchmark_solver.sh`: Sbatch script for Beverin to benchmark and profile the `vertically_implicit_solver_at_predictor_step`. Profiling the kernels generated by this GT4Py Program is the most interesting topic, as improvements there should carry over to most of the other dycore GT4Py Programs as well.

Currently based on #1018, which points to GT4Py/main (which will become GT4Py v1.1.4 in the next week).