New ocean mesh structure with GPU replication #496

Merged
mark-petersen merged 4 commits into MPAS-Dev:ocean/develop from philipwjones:ocean/gpumesh
May 21, 2020

@philipwjones
Contributor

@philipwjones philipwjones commented Mar 30, 2020

This change adds a mesh structure with all current mesh pool data and replicates that data on a GPU (or other accelerator) for the duration of the simulation to reduce the need for data transfers of static mesh data. It is a critical piece of infrastructure needed for future GPU modifications being staged. In a later change, this structure was replaced by public module variables, since a large user-defined type did not perform well on GPUs. Mesh variables can now be accessed simply by "use"-ing this module.

This module also revises the index naming: nCells, nCellsSolve and nCellsArray are replaced by nCellsAll, nCellsOwned, and nCellsHalo(n). nCellsHalo differs from nCellsArray in that it holds only the index of each halo depth and no longer includes nCellsOwned as its first entry. Similar changes were introduced for Edges and Vertices.

NOTE: this change requires only one block per MPI task and will abort if more than one block per MPI task is attempted. Any multi-block tests will fail.

This has been tested on Summit in a QU240 test case and is bit-for-bit, since the structure itself is not yet being used anywhere. However, I have verified that the data is correctly transferred to the GPU accelerator on Summit.

[b4b]

Collaborator

@mattdturner mattdturner left a comment

I'll look through this today.

@mattdturner
Collaborator

Approved per visual inspection and @philipwjones testing.

@philipwjones
Contributor Author

While resolving the redundant OpenACC ifdef, I found some debug output that hadn't been removed and realized I hadn't added this file to the CMake build, so I just committed and pushed those mods along with removing the redundant ifdef.

@philipwjones
Contributor Author

Pushed a bug fix for bad dimensions on two edgeSign arrays

@mattdturner
Collaborator

Should there be a nCellsSolve variable added to the mesh type? Some routines pull the nCellsSolve variable from meshPool. (e.g., https://github.com/MPAS-Dev/MPAS-Model/blob/master/src/core_ocean/shared/mpas_ocn_tracer_advection_std.F#L95)

call mpas_pool_get_dimension(meshPool, 'nCellsSolve', nCellsSolve)

@philipwjones
Contributor Author

Thanks @mattdturner. I also caught another one (nVerticesArray), and there are probably some more. I caught everything that I found documented in Registry and the struct include file, but there are clearly a number of fields in the dimension subpool that I didn't pick up. This stuff is so opaque, I can't find a place with a definitive list so might have to resort to a massive grep...

@mattdturner
Collaborator

> This stuff is so opaque, I can't find a place with a definitive list so might have to resort to a massive grep...

As we work through updating the files for the GPU-related work, any variable that was missed will make itself known.

@philipwjones
Contributor Author

Added the relevant missing variables discussed above. However, I have renamed/refactored them a bit. So instead of nCells, nCellsSolve and nCellsArray, the new structure has nCellsAll, nCellsOwned, and nCellsHalo(n) where nCellsHalo refers only to the halo levels, i.e. nCellsHalo(1) has the number of owned+halo cells in the first halo level. The old nCellsArray had the first entry as nCellsOwned/Solve so this is a conceptual change. Similar constructs for Edges and Vertices have been introduced. The old names/structures still exist in the old mesh pool.
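The relationship between the old and new index variables can be sketched in a small standalone Fortran program. This is only an illustration under stated assumptions: the halo depth of 3 and all counts are made up, the program name is hypothetical, and setting nCellsAll to the outermost halo index is an assumption rather than something taken from the PR code.

```fortran
program halo_index_sketch
   ! Illustrative only: maps the old nCellsArray layout onto the new
   ! nCellsOwned / nCellsHalo(n) / nCellsAll names described above.
   implicit none

   integer, parameter :: nHaloLayers = 3   ! assumed halo depth (made up)

   ! Old layout: entry 1 is the owned count, entries 2.. are the
   ! cumulative owned+halo counts for each successive halo layer.
   integer :: nCellsArray(nHaloLayers + 1)

   ! New layout: owned count is separate, halo array holds halo levels only.
   integer :: nCellsOwned, nCellsAll
   integer :: nCellsHalo(nHaloLayers)
   integer :: n

   nCellsArray = [100, 120, 135, 150]      ! made-up values

   nCellsOwned = nCellsArray(1)
   do n = 1, nHaloLayers
      ! nCellsHalo(n) is the last index of owned cells plus halo levels 1..n
      nCellsHalo(n) = nCellsArray(n + 1)
   end do

   ! Assumption: the full cell count equals the outermost halo index.
   nCellsAll = nCellsHalo(nHaloLayers)

   ! Simple self-checks on the mapping.
   if (nCellsOwned /= 100) error stop 'unexpected nCellsOwned'
   if (nCellsHalo(1) /= 120) error stop 'unexpected nCellsHalo(1)'
   if (nCellsAll /= 150) error stop 'unexpected nCellsAll'

   print '(a,i0)', 'nCellsOwned = ', nCellsOwned
   print '(a,3(1x,i0))', 'nCellsHalo  =', nCellsHalo
   print '(a,i0)', 'nCellsAll   = ', nCellsAll
end program halo_index_sketch
```

A loop that previously ran to nCellsArray(2) would, under this mapping, run to nCellsHalo(1), and a loop over nCellsSolve would run to nCellsOwned.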

@philipwjones
Contributor Author

Just committed a new version that eliminates the structure in favor of public mesh variables that are accessed with a module use. The prior structure did not perform well on GPU due to the need to traverse such a large structure to find variables. I also squashed some of the prior commits for bug fixes so that this is a little cleaner. I will update the PR text above to better reflect these latest changes.
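A minimal sketch of the module-variable approach described above, assuming OpenACC `enter data` and `exit data` directives are used to keep a static array resident on the device. The module name `mesh_sketch`, the routines `mesh_create` and `mesh_destroy`, and the `areaCell` field with its values are hypothetical stand-ins, not the actual ocn_mesh interface. Without an OpenACC compiler the directives are treated as comments and the code runs on the host.

```fortran
module mesh_sketch
   ! Hypothetical stand-in for a mesh module with public variables.
   use, intrinsic :: iso_fortran_env, only: real64
   implicit none
   private

   integer, public :: nCellsOwned, nCellsAll
   real(real64), public, allocatable :: areaCell(:)   ! illustrative field

   public :: mesh_create, mesh_destroy

contains

   subroutine mesh_create(nOwned, nAll)
      integer, intent(in) :: nOwned, nAll

      nCellsOwned = nOwned
      nCellsAll   = nAll
      allocate(areaCell(nAll))
      areaCell = 1.0_real64                 ! made-up static mesh data

      ! Copy the static array to the device once; it stays there
      ! until mesh_destroy is called.
      !$acc enter data copyin(areaCell)
   end subroutine mesh_create

   subroutine mesh_destroy()
      !$acc exit data delete(areaCell)
      deallocate(areaCell)
   end subroutine mesh_destroy

end module mesh_sketch

program use_mesh_sketch
   use, intrinsic :: iso_fortran_env, only: real64
   use mesh_sketch        ! mesh variables become visible through "use"
   implicit none

   real(real64) :: total
   integer :: iCell

   call mesh_create(4, 6)

   total = 0.0_real64
   ! The array is already on the device, so no transfer is needed here.
   !$acc parallel loop present(areaCell) reduction(+:total)
   do iCell = 1, nCellsOwned
      total = total + areaCell(iCell)
   end do

   if (abs(total - 4.0_real64) > 1.0e-12_real64) error stop 'unexpected sum'
   print '(a,f6.1)', 'owned area = ', total

   call mesh_destroy()
end program use_mesh_sketch
```

Because each variable is a top-level module entity, a kernel refers to `areaCell` directly rather than reaching it through a component of a large derived type, which is the access pattern the comment above reports as performing better on GPUs.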

Comment on lines 622 to 625
Collaborator

This still references the ocnMesh derived type, which has been removed.

@mattdturner
Collaborator

Should the ocn_mesh module include the verticalMeshPool variables as well? The vertical pool only contains:

  • restingThickness
  • refZMid
  • refLayerThickness

@philipwjones
Contributor Author

These reference arrays weren't used that often since they only refer to the reference values and not the time-variable depths. But happy to include them if needed.

@mattdturner
Collaborator

I'm not sure if it's totally necessary. I just noticed that the verticalMeshPool is passed as an argument to ocn_ALE_thickness and wasn't sure if those variables not being included in ocn_mesh was an oversight.

@mark-petersen mark-petersen self-assigned this May 19, 2020
philipwjones and others added 3 commits May 19, 2020 08:33
This change adds a mesh structure with all current mesh pool
data and replicates that data on a GPU (or other accelerator)
for the duration of the simulation to reduce the need for data
transfers of static mesh data.

The module also redefines some index variables related to halos
and owned cells. Rather than nCells, nCellsSolve and nCellsArray,
the new mesh structure now has nCellsAll, nCellsOwned and nCellsHalo(n)
with similar variables for Edges and Vertices. Unlike nCellsArray,
the nCellsHalo only includes the final index for each halo depth n,
(the prior nCellsArray variable included nCellsOwned as first entry).

NOTE: this change requires only one block per node and will
abort if more than one block per node is attempted.
  original mesh structure proved not to be performant on GPU so
  switched to accessing mesh variables as public module variables instead
@mark-petersen
Contributor

Rebased, passes nightly regression suite with gnu and intel, debug and optimized. bfb with previous, except block tests with 2 blocks/core fail, as expected.

@mark-petersen mark-petersen self-requested a review May 21, 2020 22:02
mark-petersen added a commit that referenced this pull request May 21, 2020
* ocean/develop:
  New ocean mesh structure with GPU replication #496
Contributor

@mark-petersen mark-petersen left a comment

Passes the following in E3SM:

PET_Ln9.T62_oQU240.GMPAS-IAF.cori-knl_gnu
PEM_Ln9.T62_oQU240.GMPAS-IAF.cori-knl_intel
PET_Ln3.T62_oEC60to30v3wLI.GMPAS-DIB-IAF-ISMF.cori-knl_intel

@mark-petersen mark-petersen merged commit 80d2f30 into MPAS-Dev:ocean/develop May 21, 2020
jonbob added a commit to E3SM-Project/E3SM that referenced this pull request May 26, 2020
New ocean mesh structure with GPU replication

This PR brings in a new mpas-source submodule with changes only to the ocean
core. It adds a mesh structure with all current mesh pool data and replicates
that data on a GPU (or other accelerator) for the duration of the simulation to
reduce the need for data transfers of static mesh data. It is a critical piece of
infrastructure needed for future GPU modifications being staged. In a later
change, this structure was replaced by public module variables instead since a
large user-defined type did not perform well on GPUs. Mesh variables can now be
accessed just by "use"-ing this module.

This module also has a revised index naming in which nCells, nCellsSolve and
nCellsArray are replaced by nCellsAll, nCellsOwned, and nCellsHalo(n) where the
latter differs from nCellsArray by only including the index of each halo depth
and no longer includes nCellsOwned as the first entry. Similar changes were
introduced for Edges and Vertices.

NOTE: this change requires only one block per MPI task and will abort if more
than one block per MPI task is attempted. Any multi-block tests will fail.

This has been tested on Summit in a QU240 test case and is bit-for-bit since the
structure itself is not yet being used anywhere. However, I have verified the
data is correctly transferred to the GPU accelerator on Summit.

See MPAS-Dev/MPAS-Model#496 by @philipwjones

[BFB]
jonbob added a commit to E3SM-Project/E3SM that referenced this pull request May 27, 2020
New ocean mesh structure with GPU replication

mark-petersen added a commit that referenced this pull request Sep 9, 2020
 Change block tests to partition tests for QU240 RK4 and SE #657

 As of #496, MPAS-Ocean no longer supports multiple blocks.
@philipwjones philipwjones deleted the ocean/gpumesh branch January 8, 2021 21:00
caozd999 pushed a commit to caozd999/MPAS-Model that referenced this pull request Jan 14, 2021
New ocean mesh structure with GPU replication MPAS-Dev#496
