Skip to content

Conversation

@philipwjones
Copy link
Contributor

@philipwjones philipwjones commented Aug 23, 2018

Large refactoring and GPU optimization of the ocean equation of state routines. Minor performance improvements on KNL, traditional clusters, but enables significant GPU (>2x) improvement on Summit. Roundoff level differences with QU240 standalone tests on Cori and Summit. Did not check Jackett and McDougall reference values.

Changes include:

  • moving calculation of reference pressure to initialization
  • separating EOS calls to routines with and without expansion coefficients
  • merged some loops to reduce arrays to scalars
  • moved tracer modification to separate loop for optimization
  • changed loop limits to fixed nVertLevels rather than maxLevelCell
  • pushed extraction of pool variables up one call level
  • added OpenACC directives to enable computations on GPU
  • convert JM allowable T,S range to parameters
  • introduce enums for EOS types to reduce string manipulation
  • changed scalar pointers to scalar variables
  • corrections and updates to a number of comments
  • general code cleanup

@mark-petersen
Copy link
Contributor

@philipwjones thanks for your work on this. I would like to see this test: Start with QU240, run one step, write out full density field. Run that test before/after these changes. Take a point-wise diff and report max and average difference. That would be very convincing, and you may have already done it.

Also, could say a bit about running on GPUs? Can MPAS-Ocean currently run on GPUs? I thought not, because we need to add OpenACC directives for that to happen. If yes, which parts of the code runs on the GPUs?

@philipwjones
Copy link
Contributor Author

Will get those numbers today - most changes were actually b4b, but one of the loop changes caused a small difference from sqrt, so must have triggered a different version of the library call.

Regarding GPUs, we have a couple of different versions of the transport that are GPU-enabled, but we are still evaluating which version to use/commit. I've gotten pretty far along on vmix, but that's not quite ready yet either. We have other bits here and there. We added an OPENACC build flag to turn the GPU portions on, but those Makefile changes were not part of this commit since no one else has access to Summit and these changes will not be effective on Titan due to the higher transfer overhead on that machine.

@philipwjones
Copy link
Contributor Author

On Cori after one timestep:
Differences in field: density
max rel diff: 2.219540115956395E-015
mean rel diff: 5.773551112341405E-017
Where relative difference is abs((density1 - density2)/density1)
Summit's been down most of the day, but can do something similar with GPU vs CPU.

@mark-petersen
Copy link
Contributor

mark-petersen commented Sep 4, 2018

@philipwjones, could you add a blank line for line 2 of your commit message? Also, limit first line to 50 characters to appear on a concise one-liner.

If you use

git log --oneline

you will see that a lack of a blank line 2 makes the whole message appear. You can use a git commit --amend and then git push --force.

@mark-petersen
Copy link
Contributor

A few notes for new code additions: Doug wrote a nice tool to help us write uniform code. It is here:
https://github.com/MPAS-Dev/MPAS-Tools/tree/master/source_code_processing/mpas_source_linter
and can be run with:

linter -f file.F

it produces a new file that notifies you of 'delint' clean-up items, that sometimes cause a problem on odd compilers. These can also be helpful for brand new code:

sed -i 's/    /\t/g' $file    # change space to tab, for *.xml files
sed -i 's/[ \t]*$//g' $file   # remove trailing white spaces

though it is best to clean up just the new code you are submitting, and not include white space alterations to lots of other code in this PR.

Copy link
Contributor

@mark-petersen mark-petersen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested on wolf with gnu and intel, testing debug and optimized for both, using the MPAS-O nightly regression suite. Everything passed. Compared to head of ocean/develop, and max differences were order 1e-12, a few were 1e-9 after six time steps. This is acceptable.

@mark-petersen
Copy link
Contributor

For something non-bfb like this, we need to make an E3SM PR with this updated MPAS submodule and test it. I would say an EC60to30 G case for two years, compared to the current repo, would be fine. @jonbob, this would be a nice task for Kristin to work on. (As an aside, we also have new values for EC60to30 with dt=30 minutes and revised del2 & del4 from Mat that could be tested with that bunch).

@philipwjones I am merging a bunch of bfb compass updates to ocean/develop this week. I'd like to merge this one after it is tested in E3SM.

@philipwjones
Copy link
Contributor Author

@mark-petersen - fixed the commit message. Also fixed an untyped constant that lint found and some trailing whitespace that lint also complained about.

pwjones added 2 commits December 18, 2018 08:06
… routines.

Changes include:
   - moving calculation of reference pressure to initialization
   - separated EOS calls to routines with and without expansion coefficients
   - merged some loops to reduce arrays to scalars
   - moved tracer modification to separate loop for optimization
   - changed loop limits to fixed nVertLevels rather than maxLevelCell
   - pushed extraction of pool variables up one call level
   - added OpenACC directives to enable computations on GPU
   - convert JM allowable T,S range to parameters
   - introduce enums for EOS types to reduce string manipulation
   - changed scalar pointers to scalar variables
   - corrections and updates to a number of comments
   - general code cleanup
@mark-petersen
Copy link
Contributor

@philipwjones I rebased and tested. Passes nightly regression suite on grizzly with gnu and intel, optimized and debug. Tests with nonlinear EOS do not match bit-for-bit with ocean/develop. Differ by max of 1e-9 after 6 time steps, which is expected.

Can I merge and make PR into E3SM?

@philipwjones
Copy link
Contributor Author

@mark-petersen - fine by me

@mark-petersen mark-petersen removed the request for review from jonbob December 19, 2018 04:32
@mark-petersen mark-petersen merged commit bf3cb54 into MPAS-Dev:ocean/develop Dec 19, 2018
mark-petersen added a commit that referenced this pull request Dec 19, 2018
Large refactoring and GPU optimization of the ocean equation of state routines.
Minor performance improvements on KNL, traditional clusters, but enables
significant GPU (>2x) improvement on Summit. Roundoff level differences with
QU240 standalone tests on Cori and Summit. Did not check Jackett and McDougall
reference values.

Changes include:
   - moving calculation of reference pressure to initialization
   - separating EOS calls to routines with and without expansion coefficients
   - merged some loops to reduce arrays to scalars
   - moved tracer modification to separate loop for optimization
   - changed loop limits to fixed nVertLevels rather than maxLevelCell
   - pushed extraction of pool variables up one call level
   - added OpenACC directives to enable computations on GPU
   - convert JM allowable T,S range to parameters
   - introduce enums for EOS types to reduce string manipulation
   - changed scalar pointers to scalar variables
   - corrections and updates to a number of comments
   - general code cleanup
mark-petersen added a commit that referenced this pull request Jan 18, 2019
This reverts commit 8ec75ac, reversing
changes made to fd71e64.
mark-petersen added a commit that referenced this pull request Jan 18, 2019
* ocean/develop:
  Revert "Merge PR #69 'ocean/EOSopt' into ocean/develop"
@mark-petersen
Copy link
Contributor

@philipwjones, this did not pass the tests on E3SM next branch, so we had to revert this merge in order to proceed with other pull requests. See comments here:
https://acmeclimate.slack.com/archives/C04B3QGEQ/p1547562515073700
https://acmeclimate.slack.com/archives/C04B3QGEQ/p1547570643077900

It passed my G-case tests on edison, but it's failing these:

  • PET_Ln9_PS.ne30_oECv3_ICG.A_WCYCL1850S.sandiatoss3_intel.allactive-mach-pet
  • SMS_P12x2.ne4_oQU240.A_WCYCL1850.sandiatoss3_intel.allactive-mach_mods

mark-petersen added a commit to amametjanov/MPAS-Model that referenced this pull request Apr 10, 2019
mark-petersen added a commit that referenced this pull request Apr 12, 2019
This is a re-merge of #69 plus a fix in #170.

Large refactoring and GPU optimization of the ocean equation of state
routines. Minor performance improvements on KNL, traditional clusters,
but enables significant GPU (>2x) improvement on Summit. Roundoff level
differences with QU240 standalone tests on Cori and Summit. Did not
check Jackett and McDougall reference values.

Changes include:

moving calculation of reference pressure to initialization
separating EOS calls to routines with and without expansion coefficients
merged some loops to reduce arrays to scalars
moved tracer modification to separate loop for optimization
changed loop limits to fixed nVertLevels rather than maxLevelCell
pushed extraction of pool variables up one call level
added OpenACC directives to enable computations on GPU
convert JM allowable T,S range to parameters
introduce enums for EOS types to reduce string manipulation
changed scalar pointers to scalar variables
corrections and updates to a number of comments
general code cleanup
mark-petersen added a commit that referenced this pull request Apr 12, 2019
  Reduce threading barriers in ocean
  Update ocean BGC threading
  Update threading in ocean EOS allocates
  Revert "Revert "Merge PR #69 'ocean/EOSopt' into ocean/develop""
mark-petersen added a commit that referenced this pull request May 27, 2019
  Reduce threading barriers in ocean
  Update ocean BGC threading
  Update threading in ocean EOS allocates
  Revert "Revert "Merge PR #69 'ocean/EOSopt' into ocean/develop""
mark-petersen added a commit that referenced this pull request May 27, 2019
* ocean/develop:
  Reduce threading barriers in ocean
  Update ocean BGC threading
  Update threading in ocean EOS allocates
  Revert "Revert "Merge PR #69 'ocean/EOSopt' into ocean/develop""
jonbob added a commit to E3SM-Project/E3SM that referenced this pull request May 29, 2019
MPAS-Ocean threading and GPU update

This updates OpenMP and OpenACC calls in MPAS-Ocean, including:
* GPU optimization in EOS: MPAS-Dev/MPAS-Model#69 and MPAS-Dev/MPAS-Model#170
* Update ocean BGC threading: MPAS-Dev/MPAS-Model#152
* Reduce threading barriers: MPAS-Dev/MPAS-Model#194
It is not bfb due to change in order of operations. It should not be climate
changing.

Fixes #2860

[non-BFB]
jonbob added a commit to E3SM-Project/E3SM that referenced this pull request May 30, 2019
MPAS-Ocean threading and GPU update

This PR updates OpenMP and OpenACC calls in MPAS-Ocean, including:
* GPU optimization in EOS: MPAS-Dev/MPAS-Model#69 and MPAS-Dev/MPAS-Model#170
* Update ocean BGC threading: MPAS-Dev/MPAS-Model#152
* Reduce threading barriers: MPAS-Dev/MPAS-Model#194
It is not bfb due to change in order of operations. It should not be climate
changing.

Fixes #2860

[non-BFB]
@philipwjones philipwjones deleted the ocean/EOSopt branch March 10, 2020 12:40
ashwathsv pushed a commit to ashwathsv/MPAS-Model that referenced this pull request Jul 21, 2020
Large refactoring and GPU optimization of the ocean equation of state routines.
Minor performance improvements on KNL, traditional clusters, but enables
significant GPU (>2x) improvement on Summit. Roundoff level differences with
QU240 standalone tests on Cori and Summit. Did not check Jackett and McDougall
reference values.

Changes include:
   - moving calculation of reference pressure to initialization
   - separating EOS calls to routines with and without expansion coefficients
   - merged some loops to reduce arrays to scalars
   - moved tracer modification to separate loop for optimization
   - changed loop limits to fixed nVertLevels rather than maxLevelCell
   - pushed extraction of pool variables up one call level
   - added OpenACC directives to enable computations on GPU
   - convert JM allowable T,S range to parameters
   - introduce enums for EOS types to reduce string manipulation
   - changed scalar pointers to scalar variables
   - corrections and updates to a number of comments
   - general code cleanup
ashwathsv pushed a commit to ashwathsv/MPAS-Model that referenced this pull request Jul 21, 2020
This reverts commit 8ec75ac, reversing
changes made to fd71e64.
ashwathsv pushed a commit to ashwathsv/MPAS-Model that referenced this pull request Jul 21, 2020
ashwathsv pushed a commit to ashwathsv/MPAS-Model that referenced this pull request Jul 21, 2020
This is a re-merge of MPAS-Dev#69 plus a fix in MPAS-Dev#170.

Large refactoring and GPU optimization of the ocean equation of state
routines. Minor performance improvements on KNL, traditional clusters,
but enables significant GPU (>2x) improvement on Summit. Roundoff level
differences with QU240 standalone tests on Cori and Summit. Did not
check Jackett and McDougall reference values.

Changes include:

moving calculation of reference pressure to initialization
separating EOS calls to routines with and without expansion coefficients
merged some loops to reduce arrays to scalars
moved tracer modification to separate loop for optimization
changed loop limits to fixed nVertLevels rather than maxLevelCell
pushed extraction of pool variables up one call level
added OpenACC directives to enable computations on GPU
convert JM allowable T,S range to parameters
introduce enums for EOS types to reduce string manipulation
changed scalar pointers to scalar variables
corrections and updates to a number of comments
general code cleanup
ashwathsv pushed a commit to ashwathsv/MPAS-Model that referenced this pull request Jul 21, 2020
  Reduce threading barriers in ocean
  Update ocean BGC threading
  Update threading in ocean EOS allocates
  Revert "Revert "Merge PR MPAS-Dev#69 'ocean/EOSopt' into ocean/develop""
mark-petersen added a commit to mark-petersen/MPAS-Model that referenced this pull request Jan 11, 2021
This reverts commit 8ec75ac, reversing
changes made to fd71e64.
mark-petersen added a commit to mark-petersen/MPAS-Model that referenced this pull request Jan 11, 2021
mark-petersen added a commit to mark-petersen/MPAS-Model that referenced this pull request Jan 11, 2021
  Reduce threading barriers in ocean
  Update ocean BGC threading
  Update threading in ocean EOS allocates
  Revert "Revert "Merge PR MPAS-Dev#69 'ocean/EOSopt' into ocean/develop""
barlage pushed a commit to barlage/MPAS-Model that referenced this pull request Jan 21, 2025
* Adds code to determine whether to use aerosols (nwfa and nifa) from RAP or
the monthly climatology data.

 * Adds a config variable, config_tempo_rap, to the core_init_atmosphere
   Registry that must be set to true in the init_atmosphere namelist
   to use RAP aerosols.

 * Adds a logical function that check for config_tempo_rap = true and
   non-zero values of nwfa and nifa from RAP. This function will control
   whether to use RAP aerosols or the monthly climatology.

* Modifies logic for creation of ICs/LBCs so that users can use aerosol
information from the first guess field (e.g., RAP) instead of the
monthly climatology. This aerosol information, nwfa and nifa, is needed
for the aerosol-aware TEMPO microphysics.

* Adds optional argument lbc_state to the use_rap_aerosols function

Use_rap_aerosols will now check for existing aerosols in the initial conditions
if lbc_state is not present. Otherwise, it will check for existing aerosols
in lbc_state (lateral boundary condition data type)

* Fix missing parentheses in if statement

* Initialize tempo_aerosolaware package so that nwfa and nifa
can be added to IC/LBC files.

* Changes lbc code to use lbc-specific indexes when adding variables to
lbc_state.

 * For qv, the lbc index is changed from index_qv to index_lbc_qv
   and state is changed to lbc_state

 * This is done for all LBC variables

 * Fixes a bug that caused an LBC variable (nwfa) to populate
   the incorrect variable name (lbc_nccn) since index_nwfa is
   NOT equal to index_lbc_nwfa

* Fix typos
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants