-
Notifications
You must be signed in to change notification settings - Fork 68
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cray workaround removal #700
Cray workaround removal #700
Conversation
Some of the CCE specific compiler directives are going to hang around. Mark those with _CRAYFTN instead of a new macro.
Removing this procedure pointer is necessary due to a CCE bug, but has the potential to be a breaking change. Other similar changes will be made elsewhere but this specific infrastructure is only being used in a private branch, so isolating it here to avoid confusion.
The main thrust of this commit factors out certian procedure pointers that are causing a bug in the Cray compiler that ignores all `acc declare` statements. This is annoying workaround but not catastrophic given that procedure pointers could be refactored into dispatch subroutines easily and runtime polymorpism was minimal. There are a few other assorted tweaks in the non-CRAY_ACC_WAR logic. WARNING: more complicated boundary conditions are broken in this commit! The next commit will fix them.
…ace CCE bug For some reason CCE OpenACC is failing to keep these specific variables from colliding between m_cbc and m_riemann_solvers. Every other variable type seems fine, but some implicit link must be causing a collision. The trivial bug ugly workaround is rename the variables to "_l" local versions to avoid the collision. If we can ever figure out and fix this bug we could undo this commit.
Somehow missed these the first time around.
This removes all the hacks for the critical CCE bug around allocatable module arrays in ACC routines. Other more localized hacks for CCE bugs remains in the code, but the majority of the source developed for other machines should work on Frontier now.
It seems pretty clear I took something out that was required for the NVHPC builds. Seeing if I can build locally for A100 to figure out what. Right now I'm suspecting it was |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #700 +/- ##
==========================================
- Coverage 43.04% 42.84% -0.20%
==========================================
Files 61 61
Lines 16062 16281 +219
Branches 1798 1891 +93
==========================================
+ Hits 6914 6976 +62
- Misses 8158 8260 +102
- Partials 990 1045 +55 ☔ View full report in Codecov by Sentry. |
I was too agressive pulling these out. Why it didn't break CCE is a puzzle..
It was |
Thanks, @abbotts! |
src/simulation/m_data_output.fpp
Outdated
|
||
integer, intent(in) :: t_step | ||
|
||
if (parallel_io .neqv. .true.) then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.not. parallel_io
perhaps?
src/simulation/m_start_up.fpp
Outdated
@@ -1353,10 +1353,8 @@ contains | |||
! Associate pointers for serial or parallel I/O | |||
if (parallel_io .neqv. .true.) then | |||
s_read_data_files => s_read_serial_data_files |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we just wrap all of this logic in a subroutine instead of only the write
subroutine? (include the read
one as well)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks so much! i had a question or two, but mostly just figured that we can make the logic in the code a bit more consistent with a couple more additional changes.
Will merge once CI runs. |
@sbryngelson , I'm not sure why the GA Tech CPU bench run failed. Looking at the logs the PR and Master benchmarks both built and ran fine, but then the master summary yaml file generation failed with:
|
Actually, the |
Description
This PR removes the CRAY_ACC_WAR workaround code that we put in to deal with CCE not handling
acc declare create
on allocatable module arrays used in device subroutines. Some additional but more localized workarounds were required to make this removal work completely, but they are less offensive than the CRAY_ACC_WAR code and integrated directly.Type of change
Please delete options that are not relevant.
Scope
If you cannot check the above box, please split your PR into multiple PRs that each have a common goal.
How Has This Been Tested?
Please describe the tests that you ran to verify your changes.
Provide instructions so we can reproduce.
Please also list any relevant details for your test configuration
./mfc.sh test --gpu
test suite on an HPE internal cluster mirroring Frontier's software environment./mfc.sh test --gpu
on Frontier proper./mfc.sh test
on ARM MacBookTest Configuration:
Checklist
docs/
)examples/
that demonstrate my new feature performing as expected.They run to completion and demonstrate "interesting physics"
./mfc.sh format
before committing my codeIf your code changes any code source files (anything in
src/simulation
)To make sure the code is performing as expected on GPU devices, I have:
nvtx
ranges so that they can be identified in profiles./mfc.sh run XXXX --gpu -t simulation --nsys
, and have attached the output file (.nsys-rep
) and plain text results to this PR./mfc.sh run XXXX --gpu -t simulation --omniperf
, and have attached the output file and plain text results to this PR.