Ocean: Remove scratch allocates and calls to get_config#457
Conversation
|
Testing was done on Cori with the Intel compiler ( Baseline - optimized Baseline - debug This PR - optimized This PR - debug |
|
The The optimized failures for Global Ocean in this PR are a result of non-bfb results. However, the fact that the debug tests are bfb suggests that the non-bfb for the optimized tests are a result of additional optimization that can be done by the compiler as a result of the changes in this PR. |
218cc39 to
83bbe1f
Compare
|
I just rebased to resolve the merge conflicts (and fixed a bug in the new code) |
|
Thanks, I'll check it out.
Tried to test the threading case on Summit but threading with PGI is still broken there. Will try an Intel or gnu build instead. Wanted to use a different machine to make sure the changes worked everywhere...
Phil
TSPA/Correspondence/DUSA PLNT
------
Philip Jones (pwjones@lanl.gov)
Climate, Ocean and Sea Ice Modeling
Los Alamos National Laboratory
T-3 MS B216
P.O. Box 1663
Los Alamos, NM 87545
…________________________________
From: Matthew Turner <notifications@github.com>
Sent: Wednesday, April 1, 2020 2:34:11 PM
To: MPAS-Dev/MPAS-Model
Cc: Jones, Phil; Mention
Subject: [EXTERNAL] Re: [MPAS-Dev/MPAS-Model] Ocean: Remove scratch allocates and calls to get_config (#457)
I just rebased to resolve the merge conflicts
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub<#457 (comment)>, or unsubscribe<https://github.com/notifications/unsubscribe-auth/AB6LSTIWYDRGGRQA33NGC2TRKOQMHANCNFSM4K5BOPFA>.
|
philipwjones
left a comment
There was a problem hiding this comment.
Confirmed bit-for-bit in debug mode and apparently round-off level changes in optimized version. Using PGI on Summit in QU240 configuration. Also visual inspection and earlier testing by @mattdturner . Performance improvement in the noise - disappointing since it had a larger effect in tracer advection - but will aid further threading improvements later.
|
@mattdturner I'm rebasing, testing, and merging this week. Since this PR touches so many parts of the code, I plan to repackage this into three smaller PRs:
I will leave this PR in tact, as the diffs will show the changes remaining against Do you mind if I squash your commits on this PR, and then make separate commits for each of those items? The code will remain identical, but it allows me to cherry-pick the new commits. There may just be a few lines to separate in |
That's not a problem. Let me know if there is anything I can do to help. |
83bbe1f to
ab76eac
Compare
Merge the changes in PR#457 for mpas_ocn_diagnostics.F that remove scratch allocates and the calls to mpas_pool_get_config.
ab76eac to
4de1b4d
Compare
|
@mattdturner thanks for the updates. I ran these through the MPAS-Ocean nightly regression suite on grizzly. For a straight test (no comparisons) I get:
I then compared to commit 82eb734 on ocean/develop, before these changes (just before PR #539 and #553). I get:
Since the comparisons vary by compiler, I'm guessing these are differences in optimizations in that intel 19 test. I'll compare debug next. |
|
@mattdturner I've been doing more testing. First, I compared the repo between 82eb734 (before PR #539 and #553) to f0865d7 (current head, just after those two PRs). It passes the nightly regression suite and is bfb between before/after on all three compilers, optimized (gnu, intel17, intel19). So it looks like everything is OK with the current head of ocean/develop. I then reran with the current head of this PR, 4de1b4d. On intel 19, the failed bfb comparison variables are all 1e-13, so that comparison is OK. I'll keep testing on those other couple of problems above. |
|
With that last commit I can pass the |
Delete the delsq_tracer variable group from Registry.xml for ocean model
Intel v19 complained about some pointer assignments using the `=` instead of the `=>`.
Revert mode_init and analysis_members files to ocean/develop (already has scratch allocate code removed). A few bugfixes that came about during rebasing.
Remove unneeded scratch variables from Registry.xml in core_ocean. Also remove unused variables from some of the files in shared/
20ce297 to
e0877d6
Compare
…(PR #3597) Ocean: Reduce pointer retrievals in shared directory This PR brings in a new mpas-source submodule with changes only to the ocean core. These changes are intended to improve performance by reducing pointer retrievals. This includes scratch allocates and config pointers. We are taking a staged approach, so this PR only alters the src/core_ocean/shared directory. See similar changes for analysis and init in previous PR, #3717. Here, we: * replace almost all calls to mpas_allocate_scratch_field in the ocean model with straight Fortran allocates. This is done as part of the MPAS framework rewrite, per @philipwjones; and * replace almost all calls to mpas_pool_get_config in subroutines with definitions in ocn_config via use ocn_config. This PR is labelled as non-BFB, because there could be machine-precision changes that accumulate. In MPAS-Ocean stand-alone testing, comparisons compiled in intel debug, gnu debug, and gnu optimized are all BFB. Comparisons with intel optimized had small differences. See MPAS-Dev/MPAS-Model#457 by @mattdturner [non-BFB]
Ocean: Reduce pointer retrievals in shared directory This PR brings in a new mpas-source submodule with changes only to the ocean core. These changes are intended to improve performance by reducing pointer retrievals. This includes scratch allocates and config pointers. We are taking a staged approach, so this PR only alters the src/core_ocean/shared directory. See similar changes for analysis and init in previous PR, #3717. Here, we: * replace almost all calls to mpas_allocate_scratch_field in the ocean model with straight Fortran allocates. This is done as part of the MPAS framework rewrite, per @philipwjones; and * replace almost all calls to mpas_pool_get_config in subroutines with definitions in ocn_config via use ocn_config. This PR is labelled as non-BFB, because there could be machine-precision changes that accumulate. In MPAS-Ocean stand-alone testing, comparisons compiled in intel debug, gnu debug, and gnu optimized are all BFB. Comparisons with intel optimized had small differences. See MPAS-Dev/MPAS-Model#457 by @mattdturner [non-BFB]
…n/develop This was removed in #447 but inadvertently got added back in in #457. It should be the first step toward addressing E3SM-Project/E3SM#3797
…rs' into ocean/develop Remove scratch allocates and config. Part 1: analysis members MPAS-Dev#539 Part 1 of MPAS-Dev#457.
…nto ocean/develop This was removed in MPAS-Dev#447 but inadvertently got added back in in MPAS-Dev#457. It should be the first step toward addressing E3SM-Project/E3SM#3797
This PR replaces almost all calls to
mpas_allocate_scratch_fieldin the ocean model with straight Fortran allocates. This is done as part of the MPAS framework rewrite, per @philipwjones .This PR also replaces almost all calls to
mpas_pool_get_configin subroutines with definitions inocn_configviause ocn_config. This addresses #394 , and allows #406 to be closed without being merged.There are still a few calls to
mpas_allocate_scratch_fieldthat cannot be replaced until more of the MPAS framework rewrite has been completed (namely removal ofblocks).This PR only changes the files in the shared subdirectory, not the mode_forward subdirectory.