Skip to content

Conversation

@amametjanov
Copy link
Contributor

Update threading in ocean EOS to fix issues in multi-threaded runs. Tested with SMS_P12x2.ne4_oQU240.A_WCYCL1850.anvil_intel.allactive-mach_mods.

[BFB]

@mark-petersen
Copy link
Contributor

@amametjanov Thanks for adding code. Before this PR, does the code potentially cause an incorrect answer, or just slower performance with threading?

@philipwjones how does this overlay with your performance work on EOS - any preferred order for merging?

@philipwjones
Copy link
Contributor

@mark-petersen This is fixing a bug in my new EOS version. I had placed allocates of some temps within an omp single construct when shared allocated arrays must be allocated by the master thread. For some reason, a lot of machines were fine with the version I had committed, but it correctly failed on the sandiatoss machine during testing. This should fix that issue while still retaining the performance improvement. @amametjanov can correct any misstatement or add any additional detail. I reviewed the mods - they simply replace the single directive with a master directive and add the required barriers that were implicit in single - and this should be merged after my earlier (reverted) PR for the EOS change.

Copy link
Contributor

@philipwjones philipwjones left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See earlier comment. This fixes a bug in the prior EOS GPU optimization by replacing omp single regions with omp master regions. Did not test since I don't have access to the only machine where the previous version failed. But by inspection, this should have identical behavior.

@amametjanov
Copy link
Contributor Author

Yes, it should go after Phil's PR.

@mark-petersen
Copy link
Contributor

These are next in line:
Remerge #69
Merge this PR.
Test in MPAS-O, then E3SM. Is there anything I should know, or is it all ready to go?

@philipwjones
Copy link
Contributor

Should all be good to go.

@mark-petersen
Copy link
Contributor

Sorry for the delay. E3SM-Project/E3SM#2843 got higher priority. I will do all the threading-related PRs after that, so if you have other changes or potential PRs to add for threading, now is a good time to make them.

@mark-petersen
Copy link
Contributor

@amametjanov this PR is on top of #69 which was reverted. I'd like to make this PR the re-merge of the #69 commits and your commit on top of it, since they are all related. May I do that, and then push to this branch?
You need to change the target branch of this PR to ocean/develop.

@amametjanov amametjanov changed the base branch from e3sm/develop to ocean/develop April 10, 2019 16:55
@amametjanov
Copy link
Contributor Author

No problem. Done.

@mark-petersen
Copy link
Contributor

OK, I rebased, added #69, and put the 'fix' commit after that. @philipwjones and @amametjanov please browse the code and see if it looks right.

Copy link
Contributor

@philipwjones philipwjones left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like everything's there to me.

mark-petersen added a commit that referenced this pull request Apr 12, 2019
This is a re-merge of #69 plus a fix in #170.

Large refactoring and GPU optimization of the ocean equation of state
routines. Minor performance improvements on KNL, traditional clusters,
but enables significant GPU (>2x) improvement on Summit. Roundoff level
differences with QU240 standalone tests on Cori and Summit. Did not
check Jackett and McDougall reference values.

Changes include:

moving calculation of reference pressure to initialization
separating EOS calls to routines with and without expansion coefficients
merged some loops to reduce arrays to scalars
moved tracer modification to separate loop for optimization
changed loop limits to fixed nVertLevels rather than maxLevelCell
pushed extraction of pool variables up one call level
added OpenACC directives to enable computations on GPU
convert JM allowable T,S range to parameters
introduce enums for EOS types to reduce string manipulation
changed scalar pointers to scalar variables
corrections and updates to a number of comments
general code cleanup
@mark-petersen mark-petersen merged commit 243ebe9 into MPAS-Dev:ocean/develop May 27, 2019
jonbob added a commit to E3SM-Project/E3SM that referenced this pull request May 29, 2019
MPAS-Ocean threading and GPU update

This updates OpenMP and OpenACC calls in MPAS-Ocean, including:
* GPU optimization in EOS: MPAS-Dev/MPAS-Model#69 and MPAS-Dev/MPAS-Model#170
* Update ocean BGC threading: MPAS-Dev/MPAS-Model#152
* Reduce threading barriers: MPAS-Dev/MPAS-Model#194
It is not bfb due to change in order of operations. It should not be climate
changing.

Fixes #2860

[non-BFB]
jonbob added a commit to E3SM-Project/E3SM that referenced this pull request May 30, 2019
MPAS-Ocean threading and GPU update

This PR updates OpenMP and OpenACC calls in MPAS-Ocean, including:
* GPU optimization in EOS: MPAS-Dev/MPAS-Model#69 and MPAS-Dev/MPAS-Model#170
* Update ocean BGC threading: MPAS-Dev/MPAS-Model#152
* Reduce threading barriers: MPAS-Dev/MPAS-Model#194
It is not bfb due to change in order of operations. It should not be climate
changing.

Fixes #2860

[non-BFB]
ashwathsv pushed a commit to ashwathsv/MPAS-Model that referenced this pull request Jul 21, 2020
This is a re-merge of MPAS-Dev#69 plus a fix in MPAS-Dev#170.

Large refactoring and GPU optimization of the ocean equation of state
routines. Minor performance improvements on KNL, traditional clusters,
but enables significant GPU (>2x) improvement on Summit. Roundoff level
differences with QU240 standalone tests on Cori and Summit. Did not
check Jackett and McDougall reference values.

Changes include:

moving calculation of reference pressure to initialization
separating EOS calls to routines with and without expansion coefficients
merged some loops to reduce arrays to scalars
moved tracer modification to separate loop for optimization
changed loop limits to fixed nVertLevels rather than maxLevelCell
pushed extraction of pool variables up one call level
added OpenACC directives to enable computations on GPU
convert JM allowable T,S range to parameters
introduce enums for EOS types to reduce string manipulation
changed scalar pointers to scalar variables
corrections and updates to a number of comments
general code cleanup
dustinswales pushed a commit to dustinswales/MPAS-Model that referenced this pull request Dec 2, 2025
… platforms (MPAS-Dev#170)

* add an ifort_icx compiling option for GaeaC6 and similar platforms

* bump version to 8.3.1-2.3
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants