-
Notifications
You must be signed in to change notification settings - Fork 390
Update threading in ocean EOS #170
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update threading in ocean EOS #170
Conversation
|
@amametjanov Thanks for adding code. Before this PR, does the code potentially cause an incorrect answer, or just slower performance with threading? @philipwjones how does this overlay with your performance work on EOS - any preferred order for merging? |
|
@mark-petersen This is fixing a bug in my new EOS version. I had placed allocates of some temps within an omp single construct when shared allocated arrays must be allocated by the master thread. For some reason, a lot of machines were fine with the version I had committed, but it correctly failed on the sandiatoss machine during testing. This should fix that issue while still retaining the performance improvement. @amametjanov can correct any misstatement or add any additional detail. I reviewed the mods - they simply replace the single directive with a master directive and add the required barriers that were implicit in single - and this should be merged after my earlier (reverted) PR for the EOS change. |
philipwjones
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See earlier comment. This fixes a bug in the prior EOS GPU optimization by replacing omp single regions with omp master regions. Did not test since I don't have access to the only machine where the previous version failed. But by inspection, this should have identical behavior.
|
Yes, it should go after Phil's PR. |
|
These are next in line: |
|
Should all be good to go. |
|
Sorry for the delay. E3SM-Project/E3SM#2843 got higher priority. I will do all the threading-related PRs after that, so if you have other changes or potential PRs to add for threading, now is a good time to make them. |
|
@amametjanov this PR is on top of #69 which was reverted. I'd like to make this PR the re-merge of the #69 commits and your commit on top of it, since they are all related. May I do that, and then push to this branch? |
|
No problem. Done. |
f6f31e6 to
243ebe9
Compare
|
OK, I rebased, added #69, and put the 'fix' commit after that. @philipwjones and @amametjanov please browse the code and see if it looks right. |
philipwjones
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like everything's there to me.
This is a re-merge of #69 plus a fix in #170. Large refactoring and GPU optimization of the ocean equation of state routines. Minor performance improvements on KNL, traditional clusters, but enables significant GPU (>2x) improvement on Summit. Roundoff level differences with QU240 standalone tests on Cori and Summit. Did not check Jackett and McDougall reference values. Changes include: moving calculation of reference pressure to initialization separating EOS calls to routines with and without expansion coefficients merged some loops to reduce arrays to scalars moved tracer modification to separate loop for optimization changed loop limits to fixed nVertLevels rather than maxLevelCell pushed extraction of pool variables up one call level added OpenACC directives to enable computations on GPU convert JM allowable T,S range to parameters introduce enums for EOS types to reduce string manipulation changed scalar pointers to scalar variables corrections and updates to a number of comments general code cleanup
MPAS-Ocean threading and GPU update This updates OpenMP and OpenACC calls in MPAS-Ocean, including: * GPU optimization in EOS: MPAS-Dev/MPAS-Model#69 and MPAS-Dev/MPAS-Model#170 * Update ocean BGC threading: MPAS-Dev/MPAS-Model#152 * Reduce threading barriers: MPAS-Dev/MPAS-Model#194 It is not bfb due to change in order of operations. It should not be climate changing. Fixes #2860 [non-BFB]
MPAS-Ocean threading and GPU update This PR updates OpenMP and OpenACC calls in MPAS-Ocean, including: * GPU optimization in EOS: MPAS-Dev/MPAS-Model#69 and MPAS-Dev/MPAS-Model#170 * Update ocean BGC threading: MPAS-Dev/MPAS-Model#152 * Reduce threading barriers: MPAS-Dev/MPAS-Model#194 It is not bfb due to change in order of operations. It should not be climate changing. Fixes #2860 [non-BFB]
This is a re-merge of MPAS-Dev#69 plus a fix in MPAS-Dev#170. Large refactoring and GPU optimization of the ocean equation of state routines. Minor performance improvements on KNL, traditional clusters, but enables significant GPU (>2x) improvement on Summit. Roundoff level differences with QU240 standalone tests on Cori and Summit. Did not check Jackett and McDougall reference values. Changes include: moving calculation of reference pressure to initialization separating EOS calls to routines with and without expansion coefficients merged some loops to reduce arrays to scalars moved tracer modification to separate loop for optimization changed loop limits to fixed nVertLevels rather than maxLevelCell pushed extraction of pool variables up one call level added OpenACC directives to enable computations on GPU convert JM allowable T,S range to parameters introduce enums for EOS types to reduce string manipulation changed scalar pointers to scalar variables corrections and updates to a number of comments general code cleanup
… platforms (MPAS-Dev#170) * add an ifort_icx compiling option for GaeaC6 and similar platforms * bump version to 8.3.1-2.3
Update threading in ocean EOS to fix issues in multi-threaded runs. Tested with
SMS_P12x2.ne4_oQU240.A_WCYCL1850.anvil_intel.allactive-mach_mods.[BFB]