Ocean: Remove scratch allocates and calls to get_config by mattdturner · Pull Request #457 · MPAS-Dev/MPAS-Model

mattdturner · 2020-02-27T19:17:26Z

This PR replaces almost all calls to mpas_allocate_scratch_field in the ocean model with straight Fortran allocates. This is done as part of the MPAS framework rewrite, per @philipwjones.

This PR also replaces almost all calls to mpas_pool_get_config in subroutines with definitions in ocn_config via use ocn_config. This addresses #394, and allows #406 to be closed without being merged.

There are still a few calls to mpas_allocate_scratch_field that cannot be replaced until more of the MPAS framework rewrite has been completed (namely removal of blocks).

This PR only changes the files in the shared subdirectory, not the mode_forward subdirectory.
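As a language-neutral illustration of the second change (this is Python, not the MPAS Fortran code), the sketch below contrasts fetching options from a string-keyed pool on every call with reading them once and reusing plain variables. The pool, the option names `config_example_dt` and `config_example_scale`, and every function here are hypothetical stand-ins, not MPAS APIs.

```python
# Hypothetical string-keyed pool; keys and values are made up for illustration.
_POOL = {"config_example_dt": 0.5, "config_example_scale": 2.0}
lookups = 0  # counts how many times the pool is searched


def pool_get_config(key):
    """Stand-in for a per-call, string-keyed config retrieval."""
    global lookups
    lookups += 1
    return _POOL[key]


def step_with_lookups(x):
    """Old pattern: fetch every option again on each call."""
    dt = pool_get_config("config_example_dt")
    scale = pool_get_config("config_example_scale")
    return x + dt * scale


class CachedConfig:
    """New pattern: read each option once, then use plain attributes."""

    def __init__(self):
        self.dt = pool_get_config("config_example_dt")
        self.scale = pool_get_config("config_example_scale")


def step_with_cache(x, cfg):
    """Same arithmetic as step_with_lookups, without touching the pool."""
    return x + cfg.dt * cfg.scale


def run(n_steps):
    """Return (result_old, lookups_old, result_new, lookups_new)."""
    global lookups
    lookups = 0
    x = 0.0
    for _ in range(n_steps):
        x = step_with_lookups(x)
    old = (x, lookups)

    lookups = 0
    cfg = CachedConfig()
    y = 0.0
    for _ in range(n_steps):
        y = step_with_cache(y, cfg)
    return old + (y, lookups)
```

Calling `run(100)` gives the same numerical result for both variants, while the number of pool searches drops from two per step to two in total. Whether that saving is measurable in practice depends on how expensive each lookup is.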

mattdturner · 2020-02-27T19:26:21Z

Testing was done on Cori with the Intel compiler (ifort (IFORT) 19.0.3.199 20190206). Testing was done for both an optimized model and a debug model (with -O0 added to compilation flags). The Nightly test suite (with a few tests removed, such as the block tests which FAILed for the baseline) was run for both a baseline and the code in this PR:

Baseline - optimized

 ** Running case Global Ocean 240km - Init Test
      PASS
 ** Running case Global Ocean 240km - Performance Test
      PASS
 ** Running case Global Ocean 240km - Restart Test
      PASS
 ** Running case Global Ocean 240km - Analysis Test
      PASS
 ** Running case ZISO 20km - Smoke Test
      PASS
 ** Running case ZISO 20km - Smoke Test with frazil
      PASS
 ** Running case Baroclinic Channel 10km - Thread Test
      PASS
 ** Running case Baroclinic Channel 10km - Decomp Test
      PASS
 ** Running case Baroclinic Channel 10km - Restart Test
      PASS
 ** Running case sub-ice-shelf 2D - restart test
      PASS
TEST RUNTIMES:
11:31 Baroclinic_Channel_10km_-_Decomp_Test
11:10 Baroclinic_Channel_10km_-_Restart_Test
06:27 Baroclinic_Channel_10km_-_Thread_Test
16:35 Global_Ocean_240km_-_Analysis_Test
12:22 Global_Ocean_240km_-_Init_Test
05:23 Global_Ocean_240km_-_Performance_Test
16:18 Global_Ocean_240km_-_Restart_Test
13:33 ZISO_20km_-_Smoke_Test
05:18 ZISO_20km_-_Smoke_Test_with_frazil
24:44 sub-ice-shelf_2D_-_restart_test
Total runtime 123:21

Baseline - debug

 ** Running case Global Ocean 240km - Init Test
      PASS
 ** Running case Global Ocean 240km - Performance Test
      PASS
 ** Running case Global Ocean 240km - Restart Test
      PASS
 ** Running case Global Ocean 240km - Analysis Test
      PASS
 ** Running case ZISO 20km - Smoke Test
   ** FAIL (See case_outputs/ZISO_20km_-_Smoke_Test for more information)
 ** Running case ZISO 20km - Smoke Test with frazil
   ** FAIL (See case_outputs/ZISO_20km_-_Smoke_Test_with_frazil for more information)
 ** Running case Baroclinic Channel 10km - Thread Test
   ** FAIL (See case_outputs/Baroclinic_Channel_10km_-_Thread_Test for more information)
 ** Running case Baroclinic Channel 10km - Decomp Test
   ** FAIL (See case_outputs/Baroclinic_Channel_10km_-_Decomp_Test for more information)
 ** Running case Baroclinic Channel 10km - Restart Test
   ** FAIL (See case_outputs/Baroclinic_Channel_10km_-_Restart_Test for more information)
 ** Running case sub-ice-shelf 2D - restart test
   ** FAIL (See case_outputs/sub-ice-shelf_2D_-_restart_test for more information)
TEST RUNTIMES:
01:35 Baroclinic_Channel_10km_-_Decomp_Test
01:28 Baroclinic_Channel_10km_-_Restart_Test
01:40 Baroclinic_Channel_10km_-_Thread_Test
17:41 Global_Ocean_240km_-_Analysis_Test
13:00 Global_Ocean_240km_-_Init_Test
04:17 Global_Ocean_240km_-_Performance_Test
17:45 Global_Ocean_240km_-_Restart_Test
11:48 ZISO_20km_-_Smoke_Test
02:00 ZISO_20km_-_Smoke_Test_with_frazil
01:50 sub-ice-shelf_2D_-_restart_test
Total runtime 73:04

This PR - optimized

 ** Running case Global Ocean 240km - Init Test
      PASS
 ** Running case Global Ocean 240km - Performance Test
   ** FAIL (See case_outputs/Global_Ocean_240km_-_Performance_Test for more information)
 ** Running case Global Ocean 240km - Restart Test
   ** FAIL (See case_outputs/Global_Ocean_240km_-_Restart_Test for more information)
 ** Running case Global Ocean 240km - Analysis Test
   ** FAIL (See case_outputs/Global_Ocean_240km_-_Analysis_Test for more information)
 ** Running case ZISO 20km - Smoke Test
   ** FAIL (See case_outputs/ZISO_20km_-_Smoke_Test for more information)
 ** Running case ZISO 20km - Smoke Test with frazil
   ** FAIL (See case_outputs/ZISO_20km_-_Smoke_Test_with_frazil for more information)
 ** Running case Baroclinic Channel 10km - Thread Test
      PASS
 ** Running case Baroclinic Channel 10km - Decomp Test
      PASS
 ** Running case Baroclinic Channel 10km - Restart Test
      PASS
 ** Running case sub-ice-shelf 2D - restart test
   ** FAIL (See case_outputs/sub-ice-shelf_2D_-_restart_test for more information)
TEST RUNTIMES:
11:31 Baroclinic_Channel_10km_-_Decomp_Test
11:08 Baroclinic_Channel_10km_-_Restart_Test
06:31 Baroclinic_Channel_10km_-_Thread_Test
17:27 Global_Ocean_240km_-_Analysis_Test
12:08 Global_Ocean_240km_-_Init_Test
04:24 Global_Ocean_240km_-_Performance_Test
16:54 Global_Ocean_240km_-_Restart_Test
00:07 ZISO_20km_-_Smoke_Test
00:07 ZISO_20km_-_Smoke_Test_with_frazil
00:31 sub-ice-shelf_2D_-_restart_test
Total runtime 80:48

This PR - debug

 ** Running case Global Ocean 240km - Init Test
      PASS
 ** Running case Global Ocean 240km - Performance Test
      PASS
 ** Running case Global Ocean 240km - Restart Test
      PASS
 ** Running case Global Ocean 240km - Analysis Test
   ** FAIL (See case_outputs/Global_Ocean_240km_-_Analysis_Test for more information)
 ** Running case ZISO 20km - Smoke Test
   ** FAIL (See case_outputs/ZISO_20km_-_Smoke_Test for more information)
 ** Running case ZISO 20km - Smoke Test with frazil
   ** FAIL (See case_outputs/ZISO_20km_-_Smoke_Test_with_frazil for more information)
 ** Running case Baroclinic Channel 10km - Thread Test
   ** FAIL (See case_outputs/Baroclinic_Channel_10km_-_Thread_Test for more information)
 ** Running case Baroclinic Channel 10km - Decomp Test
   ** FAIL (See case_outputs/Baroclinic_Channel_10km_-_Decomp_Test for more information)
 ** Running case Baroclinic Channel 10km - Restart Test
   ** FAIL (See case_outputs/Baroclinic_Channel_10km_-_Restart_Test for more information)
 ** Running case sub-ice-shelf 2D - restart test
   ** FAIL (See case_outputs/sub-ice-shelf_2D_-_restart_test for more information)
TEST RUNTIMES:
01:37 Baroclinic_Channel_10km_-_Decomp_Test
01:36 Baroclinic_Channel_10km_-_Restart_Test
01:45 Baroclinic_Channel_10km_-_Thread_Test
19:58 Global_Ocean_240km_-_Analysis_Test
12:37 Global_Ocean_240km_-_Init_Test
04:26 Global_Ocean_240km_-_Performance_Test
18:23 Global_Ocean_240km_-_Restart_Test
00:20 ZISO_20km_-_Smoke_Test
00:19 ZISO_20km_-_Smoke_Test_with_frazil
00:08 sub-ice-shelf_2D_-_restart_test
Total runtime 61:09

mattdturner · 2020-02-27T19:29:48Z

The debug failures in both baseline and this PR for ZISO, Baroclinic Channel, and sub-ice-shelf all have the following error report:

forrtl: severe (408): fort: (7): Attempt to use pointer REDIKAPPA when it is not associated with a target

Image              PC                Routine            Line        Source
ocean_model        0000000003251CE6  Unknown               Unknown  Unknown
ocean_model        0000000001F47AB6  ocn_tendency_mp_o         869  mpas_ocn_tendency.F
ocean_model        0000000001BBB1BB  ocn_time_integrat        1504  mpas_ocn_time_integration_split.F
ocean_model        0000000001B84B98  ocn_time_integrat         110  mpas_ocn_time_integration.F
ocean_model        0000000001B846FD  ocn_forward_mode_         613  mpas_ocn_forward_mode.F
libiomp5.so        00002AAAADCDC9F3  __kmp_invoke_micr     Unknown  Unknown
libiomp5.so        00002AAAADC9D5B2  __kmp_fork_call       Unknown  Unknown
libiomp5.so        00002AAAADC5DD60  __kmpc_fork_call      Unknown  Unknown
ocean_model        0000000001B8310B  ocn_forward_mode_         611  mpas_ocn_forward_mode.F
ocean_model        0000000001B7AC73  ocn_core_mp_ocn_c         111  mpas_ocn_core.F
ocean_model        0000000000417C74  mpas_subdriver_mp         347  mpas_subdriver.F
ocean_model        0000000000412CA7  MAIN__                     16  mpas.F
ocean_model        0000000000412C12  Unknown               Unknown  Unknown
libc-2.26.so       00002AAAAE208F8A  __libc_start_main     Unknown  Unknown
ocean_model        0000000000412B2A  Unknown               Unknown  Unknown

The optimized failures for Global Ocean in this PR are due to non-bfb (not bit-for-bit) results. However, the fact that the debug tests are bfb suggests that the non-bfb results in the optimized tests come from additional optimizations the compiler can perform as a result of the changes in this PR.
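One general mechanism by which an optimizing build can stop being bit-for-bit is regrouping of floating-point additions, since floating-point addition is not associative. The Python sketch below only illustrates that mechanism with made-up values; the thread does not establish which optimization the Intel compiler actually applied here.

```python
def grouped_left(a, b, c):
    """Sum evaluated strictly left to right."""
    return (a + b) + c


def grouped_right(a, b, c):
    """Same three terms with a different grouping."""
    return a + (b + c)


left = grouped_left(0.1, 0.2, 0.3)
right = grouped_right(0.1, 0.2, 0.3)

bit_identical = left == right   # False: the two groupings round differently
abs_diff = abs(left - right)    # tiny, on the order of 1e-16
```

The two results agree to round-off but are not identical, which is the pattern a bfb check reports as a failure even though the answers are equally valid.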

mattdturner · 2020-04-01T20:33:57Z

I just rebased to resolve the merge conflicts (and fixed a bug in the new code)

philipwjones · 2020-04-01T20:40:25Z

Thanks, I'll check it out. Tried to test the threading case on Summit but threading with PGI is still broken there. Will try an Intel or gnu build instead. Wanted to use a different machine to make sure the changes worked everywhere...

Phil


philipwjones

Confirmed bit-for-bit in debug mode, with apparently round-off-level changes in the optimized version, using PGI on Summit in the QU240 configuration. Also based on visual inspection and earlier testing by @mattdturner. The performance improvement is in the noise, which is disappointing since it had a larger effect in tracer advection, but it will aid further threading improvements later.

mark-petersen · 2020-04-27T15:25:48Z

@mattdturner I'm rebasing, testing, and merging this week. Since this PR touches so many parts of the code, I plan to repackage this into three smaller PRs:

analysis_members
init_mode
remainder (may split further if there are problems)

I will leave this PR intact, as the diffs will show the changes remaining against the ocean/develop branch as I merge the smaller PRs.

Do you mind if I squash your commits on this PR, and then make separate commits for each of those items?

The code will remain identical, but it allows me to cherry-pick the new commits. There may just be a few lines to separate in Registry.xml, but otherwise it should be a clean separation. It's easier for me to do it, since I'm testing at the same time.

mattdturner · 2020-04-27T15:28:54Z

> Do you mind if I squash your commits on this PR, and then make separate commits for each of those items?

That's not a problem. Let me know if there is anything I can do to help.


mark-petersen · 2020-07-15T20:57:13Z

@mattdturner thanks for the updates. I ran these through the MPAS-Ocean nightly regression suite on grizzly. For a straight test (no comparisons) I get:

gnu debug: PASS on all tests
gnu optimized: PASS on all tests
intel 17 debug: PASS on all tests
intel 17 optimized: PASS on all tests except Global_Ocean_240km_-_BGC_Ecosys_Test (dies during run, probably init mode creates a bad file)
intel 19 debug: PASS on all tests
intel 19 optimized: PASS on all tests

I then compared to commit 82eb734 on ocean/develop, before these changes (just before PR #539 and #553). I get:

gnu optimized: PASS all comparisons, except Global_Ocean_240km_-_Analysis_Test, mismatch in variable tThreshMLD (update: this mismatch disappeared in latest testing. I'm not sure why.)
intel 17 optimized: PASS all comparisons, except Global_Ocean_240km_-_BGC_Ecosys_Test fails to run
intel 19 optimized: comparisons FAIL on most of the tests. (Update: these are all machine precision at 1e-13)

Since the comparisons vary by compiler, I'm guessing these are differences in optimizations in that intel 19 test. I'll compare debug next.

mark-petersen · 2020-07-19T17:01:06Z

@mattdturner I've been doing more testing. First, I compared the repo at 82eb734 (before PR #539 and #553) with f0865d7 (current head, just after those two PRs). It passes the nightly regression suite and is bfb between before and after on all three compilers in optimized mode (gnu, intel17, intel19). So it looks like everything is OK with the current head of ocean/develop.

I then reran with the current head of this PR, 4de1b4d. On intel 19, the failed bfb comparison variables are all 1e-13, so that comparison is OK. I'll keep testing on those other couple of problems above.
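The distinction drawn here, between a bfb match and a mismatch at machine precision, can be sketched as a small classifier. This is an illustrative Python sketch, not the comparison script used by the regression suite; `compare_fields` and its labels are hypothetical, and treating 1e-13 as a relative tolerance is an assumption.

```python
import struct


def bits(x):
    """Return the raw 64-bit pattern of a double as an integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def compare_fields(baseline, candidate, rel_tol=1e-13):
    """Classify two equal-length lists of floats.

    Returns "bfb" if every value has the same bit pattern, "roundoff" if
    the largest relative difference is within rel_tol, else "differs".
    """
    if len(baseline) != len(candidate):
        raise ValueError("fields must have the same length")
    if all(bits(a) == bits(b) for a, b in zip(baseline, candidate)):
        return "bfb"
    max_rel = 0.0
    for a, b in zip(baseline, candidate):
        scale = max(abs(a), abs(b))
        if scale > 0.0:  # skip pairs that are both zero
            max_rel = max(max_rel, abs(a - b) / scale)
    return "roundoff" if max_rel <= rel_tol else "differs"
```

For example, `compare_fields([1.0], [1.0 + 1e-15])` is classified as round-off rather than bfb, while a change in the third decimal place is classified as a real difference.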

mark-petersen · 2020-07-20T04:54:46Z

With that last commit I can pass the Global_Ocean_240km_-_BGC_Ecosys_Test. The atmospheric pressure from init mode was not specified. Some compilers set it to zero, and some set it to random stuff. So I forced it to be zero.


@philipwjones

…(PR #3597) Ocean: Reduce pointer retrievals in shared directory

This PR brings in a new mpas-source submodule with changes only to the ocean core. These changes are intended to improve performance by reducing pointer retrievals. This includes scratch allocates and config pointers. We are taking a staged approach, so this PR only alters the src/core_ocean/shared directory. See similar changes for analysis and init in previous PR, #3717. Here, we:

* replace almost all calls to mpas_allocate_scratch_field in the ocean model with straight Fortran allocates. This is done as part of the MPAS framework rewrite, per @philipwjones; and
* replace almost all calls to mpas_pool_get_config in subroutines with definitions in ocn_config via use ocn_config.

This PR is labelled as non-BFB, because there could be machine-precision changes that accumulate. In MPAS-Ocean stand-alone testing, comparisons compiled in intel debug, gnu debug, and gnu optimized are all BFB. Comparisons with intel optimized had small differences. See MPAS-Dev/MPAS-Model#457 by @mattdturner

[non-BFB]


…n/develop This was removed in #447 but inadvertently got added back in in #457. It should be the first step toward addressing E3SM-Project/E3SM#3797


mattdturner requested review from mark-petersen and philipwjones February 27, 2020 19:29

philipwjones added Ocean performance labels Mar 5, 2020

mattdturner force-pushed the ocean/remove_scratch_allocate branch from 218cc39 to 83bbe1f Compare April 1, 2020 20:33

philipwjones approved these changes Apr 2, 2020


This was referenced Apr 14, 2020

Ocean/tracer advection optimization #519

Closed

Diagnostics: Move mpas_pool_get_config to init #406

Closed

mark-petersen force-pushed the ocean/remove_scratch_allocate branch from 83bbe1f to ab76eac Compare April 27, 2020 15:47

mattdturner mentioned this pull request Apr 27, 2020

Optimizations for tracer horizontal mixing #538

Closed

mark-petersen mentioned this pull request Apr 27, 2020

Remove scratch allocates and config. Part 1: analysis members #539

Merged

mark-petersen mentioned this pull request May 9, 2020

Remove scratch allocates and config. Part 2: Init mode #553

Merged

mark-petersen self-assigned this May 19, 2020

mark-petersen mentioned this pull request May 26, 2020

Ocean: Reduce pointer retrievals in shared directory E3SM-Project/E3SM#3597

Merged

mark-petersen mentioned this pull request Jun 4, 2020

Fixes bugs in tracer_advection_std #587

Merged

mattdturner added a commit to mattdturner/MPAS-Model that referenced this pull request Jun 23, 2020

Make changes to diagnostics from PR MPAS-Dev#457

dd2e2f0

Merge the changes in PR#457 for mpas_ocn_diagnostics.F that remove scratch allocates and the calls to mpas_pool_get_config.

mark-petersen mentioned this pull request Jul 9, 2020

Codes for a semi implicit barotropic mode solver #422

Merged

mark-petersen added a commit that referenced this pull request Jul 14, 2020

Merge PR #553 'ocean/remove_scratch_allocate_2_init_mode' into ocean/…

414f249

…develop Remove scratch allocates and config. Part 2: Init mode #553 Part 2 of #457.

mark-petersen added a commit that referenced this pull request Jul 14, 2020

Merge PR #539 'ocean/remove_scratch_allocate_1_analysis_members' into…

f0865d7

… ocean/develop Remove scratch allocates and config. Part 1: analysis members #539 Part 1 of #457.

mattdturner force-pushed the ocean/remove_scratch_allocate branch from ab76eac to 4de1b4d Compare July 15, 2020 00:29

mattdturner and others added 18 commits August 4, 2020 16:56

Clean our Registry.xml

8cda5f9

Delete the delsq_tracer variable group from Registry.xml for ocean model

Revert mode_init and analysis_members changes

2f9388f

Resolve additional merge conflicts

68b631a

Replace = with => for pointers in eqn of state

a108011

Intel v19 complained about some pointer assignments using the `=` instead of the `=>`.

Add back Registry variables

88dd08b

Bugfixes after rebasing

26a4788

Revert mode_init and analysis_members files to ocean/develop (already has scratch allocate code removed). A few bugfixes that came about during rebasing.

Change atmosphericPressure to default 0.0

fcb43a7

Revert mode_forward/ and Registry.xml changes

bab5758

Bugfix after rebase

1a7a775

Add comments to recent bugfix

950575c

Make sure nEdges is defined before allocates

cd11746

Add private variables in thick_ale

82bc4ed

Change maxLevelEdgeBot to maxLevelEdgeTop on flux calculations

a48645f

Remove nVertLevel arrays from openmp private

296d074

Remove SSH_ALE_thickness variable

cd7c385

Add private index variables

f198e6f

Change three variables back to scratch

9501f53

Remove scratch variables from Registry

e0877d6

Remove unneeded scratch variables from Registry.xml in core_ocean. Also remove unused variables from some of the files in shared/

mark-petersen force-pushed the ocean/remove_scratch_allocate branch from 20ce297 to e0877d6 Compare August 4, 2020 23:10

mark-petersen merged commit ceac09d into MPAS-Dev:ocean/develop Aug 10, 2020

darincomeau mentioned this pull request Aug 27, 2020

Domain incompatability with new ECwISC30to60E1r2 grid E3SM-Project/E3SM#3797

Closed

xylar mentioned this pull request Aug 28, 2020

Remove landIceMask from diagnostics computation #672

Merged

mattdturner deleted the ocean/remove_scratch_allocate branch March 16, 2021 16:43

matthewhoffman added Ocean performance labels Mar 17, 2021

Ocean: Remove scratch allocates and calls to get_config #457
mark-petersen merged 36 commits into MPAS-Dev:ocean/develop from mattdturner:ocean/remove_scratch_allocate
