Ocean/tracer advection optimization #519

Closed
mattdturner wants to merge 8 commits into MPAS-Dev:ocean/develop from mattdturner:ocean/tracer_transport_optimization

@mattdturner
Collaborator

mattdturner commented Apr 14, 2020

Optimizations to tracer transport, including:

  • Removing pointer retrievals
  • Removing unused variables and calculations
  • Trimming argument lists to remove Pool variables (no longer used in these routines)
  • Adding the use of ocnMesh (relies on PR #496, "New ocean mesh structure with GPU replication", being merged)

This PR requires the following PRs be merged first:

Some of the changes in those PRs were either git cherry-pick'ed or manually added. As a result, once those PRs are merged, some commits in this PR may need to be squashed or some changes reverted.

@mattdturner
Collaborator Author

This was tested using a QU240 test case and an EC60to30 test case (both generated via Compass). The results are bit-for-bit for debug cases (and should be for optimized cases as well) with the Intel compiler on Cori.
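For readers unfamiliar with the term: a bit-for-bit check requires identical binary representations, not just agreement within a tolerance. The sketch below illustrates the idea in Python. The function name `bit_for_bit` and the sample values are hypothetical, and it does not reproduce the actual Compass comparison tooling.

```python
import struct


def bit_for_bit(a, b):
    """Return True only if both float sequences have identical 64-bit patterns.

    Hypothetical helper for illustration; real regression suites compare
    whole model output files rather than Python lists.
    """
    if len(a) != len(b):
        return False
    # Pack each sequence as raw IEEE 754 doubles and compare the bytes.
    packed_a = struct.pack(f"{len(a)}d", *a)
    packed_b = struct.pack(f"{len(b)}d", *b)
    return packed_a == packed_b


# Values that are numerically close but not identical fail the check.
print(bit_for_bit([1.0, 2.5], [1.0, 2.5]))
print(bit_for_bit([0.1 + 0.2], [0.3]))
```

Comparing raw bytes rather than using `==` on the values also distinguishes `0.0` from `-0.0`, which a value comparison would treat as equal.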

As a result of the optimizations in this PR (specifically, removing the unused calculations for computeBudgets), the time spent in tracer adv drops by 16.4% (average of 3 runs each for baseline and optimized).
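A percentage like this can be derived from averaged timer values roughly as follows. This is a minimal sketch: the function name `percent_reduction` and the timings are made up for illustration and are not the measurements behind the figure quoted above.

```python
from statistics import mean


def percent_reduction(baseline_times, optimized_times):
    """Percentage drop in mean time from baseline to optimized runs."""
    base = mean(baseline_times)
    opt = mean(optimized_times)
    return 100.0 * (base - opt) / base


# Hypothetical timer values in seconds, three runs each.
baseline = [10.0, 10.2, 9.8]
optimized = [8.4, 8.3, 8.5]

print(f"{percent_reduction(baseline, optimized):.1f}% less time")
```

Averaging each set of runs before taking the ratio keeps a single noisy run from dominating the result.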

@mattdturner
Collaborator Author

I updated this PR to include the recent changes to PR #496 and the updates to tracer advection that those changes require.

Contributor

philipwjones left a comment

This was b4b on Summit with PGI in optimized mode for QU240. Visual inspection looks good.

@mattdturner
Collaborator Author

mattdturner commented Sep 10, 2020

I ran into some issues resolving merge conflicts during rebase, so instead I started over and re-applied the relevant changes.

I had to leave the computeBudgets code in since it calculates values used elsewhere. I either missed that previously, or there were recent changes that now use the arrays. As a result, the performance improvement drops from 16.4% in tracer adv to 1.6%. This PR is still necessary, though, since it updates the tracer advection routines to use the ocn_mesh module.

This is BFB for the nightly regression suite when compiled w/ intel (both DEBUG and OPT).

I need to test this in E3SM before it's deemed ready.

@mattdturner
Collaborator Author

mattdturner commented Sep 11, 2020

PASSes

SMS_D.T62_oQU120_ais20.MPAS_LISIO_TEST.cori-haswell_gnu
SMS.T62_oQU120_ais20.MPAS_LISIO_TEST.cori-haswell_intel
PET_Ln3.T62_oEC60to30v3wLI.GMPAS-DIB-IAF-ISMF.cori-haswell_intel
PET_Ln3.T62_oEC60to30v3wLI.GMPAS-DIB-IAF-ISMF.cori-haswell_gnu
PEM_Ln9.T62_oQU240.GMPAS-IAF.cori-haswell_gnu
PEM_Ln9.T62_oQU240.GMPAS-IAF.cori-haswell_intel
PET_Ln9.T62_oQU240.GMPAS-IAF.cori-knl_intel
SMS_P256x2.T62_oEC60to30v3.CMPASO-NYF.summit_pgi

So I think this is ready to go into E3SM.

mark-petersen force-pushed the ocean/tracer_transport_optimization branch from 552d105 to a6bd075 on September 16, 2020 21:49
@mark-petersen
Contributor

Rebased to be safe, then tested the nightly regression suite. All tests pass with gnu 6 debug on badger and intel 19 debug on grizzly, and results are bfb in optimized mode with gnu 6 and intel 19 against the same compilers on branch ocean/develop.

I also looked through the code. This all looks correct. Thanks @mattdturner for your work. I love the simplification of obtaining mesh variables from the module.

mark-petersen self-assigned this Sep 16, 2020
@mark-petersen
Contributor

@jonbob this is ready and will need its own E3SM PR. We can merge this PR when we have a free slot.
