New ocean mesh structure with GPU replication #496
mark-petersen merged 4 commits into MPAS-Dev:ocean/develop from philipwjones:ocean/gpumesh
mattdturner
left a comment
I'll look through this today.
|
Approved per visual inspection and @philipwjones testing. |
|
While resolving the redundant OpenACC ifdef, I found some debug output that hadn't been removed and realized I hadn't added this file to the CMake build. I have committed and pushed those changes along with the removal of the redundant ifdef. |
|
Pushed a bug fix for bad dimensions on two edgeSign arrays |
|
Should there be a
|
|
Thanks @mattdturner. I also caught another one (nVerticesArray), and there are probably more. I caught everything that I found documented in Registry and the struct include file, but there are clearly a number of fields in the dimension subpool that I didn't pick up. This stuff is so opaque that I can't find a place with a definitive list, so I might have to resort to a massive grep... |
As we work through updating the files for the GPU-related work, any variable that was missed will make itself known. |
|
Added the relevant missing variables discussed above. However, I have renamed/refactored them a bit. So instead of nCells, nCellsSolve and nCellsArray, the new structure has nCellsAll, nCellsOwned, and nCellsHalo(n) where nCellsHalo refers only to the halo levels, i.e. nCellsHalo(1) has the number of owned+halo cells in the first halo level. The old nCellsArray had the first entry as nCellsOwned/Solve so this is a conceptual change. Similar constructs for Edges and Vertices have been introduced. The old names/structures still exist in the old mesh pool. |
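For readers mapping old code to the new names, the relationship described above can be sketched as follows. This is only an illustration in Python, not the Fortran in the PR: the function name and the sample counts are hypothetical, and it assumes the last entry of the old array covers every halo level and therefore corresponds to nCellsAll.

```python
def split_cells_array(n_cells_array):
    """Map an old-style nCellsArray to (nCellsOwned, nCellsHalo, nCellsAll).

    Hypothetical helper for illustration only. The old array holds the
    owned count first, followed by the cumulative owned+halo count for
    each halo level.
    """
    if len(n_cells_array) < 1:
        raise ValueError("nCellsArray needs at least the owned-cell entry")
    n_cells_owned = n_cells_array[0]        # old first entry (nCellsSolve)
    n_cells_halo = list(n_cells_array[1:])  # halo levels only
    n_cells_all = n_cells_array[-1]         # assumption: last entry spans all halos
    return n_cells_owned, n_cells_halo, n_cells_all


# Made-up counts: 100 owned cells and three halo levels.
owned, halo, all_cells = split_cells_array([100, 130, 155, 170])
```

Python lists are zero-based, so `halo[0]` here plays the role of nCellsHalo(1) in the Fortran module.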
|
Just committed a new version that eliminates the structure in favor of public mesh variables that are accessed with a module use. The prior structure did not perform well on GPU due to the need to traverse such a large structure to find variables. I also squashed some of the prior commits for bug fixes so that this is a little cleaner. I will update the PR text above to better reflect these latest changes. |
This still references the ocnMesh derived type, which has been removed.
|
Should the
|
|
These reference arrays weren't used that often since they only refer to the reference values and not the time-variable depths. But happy to include them if needed. |
|
I'm not sure if it's totally necessary. I just noticed that the |
This change adds a mesh structure with all current mesh pool data and replicates that data on a GPU (or other accelerator) for the duration of the simulation to reduce the need for data transfers of static mesh data. The module also redefines some index variables related to halos and owned cells. Rather than nCells, nCellsSolve and nCellsArray, the new mesh structure now has nCellsAll, nCellsOwned and nCellsHalo(n), with similar variables for Edges and Vertices. Unlike nCellsArray, nCellsHalo includes only the final index for each halo depth n (the prior nCellsArray variable included nCellsOwned as its first entry). NOTE: this change requires exactly one block per MPI task and will abort if more than one block per MPI task is attempted.
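The one-block restriction in the NOTE amounts to a guard at mesh setup time. A minimal sketch of such a guard, written in Python for illustration: the function name and error text are hypothetical, and the real module is Fortran and would abort through the model's own error handling.

```python
def check_single_block(blocks_on_task):
    """Stop setup unless this task owns exactly one block.

    Hypothetical illustration of the restriction; raising an exception
    stands in for the model abort.
    """
    if blocks_on_task != 1:
        raise RuntimeError(
            f"mesh setup requires one block per task, found {blocks_on_task}"
        )


check_single_block(1)  # the supported single-block case passes silently
```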
original mesh structure proved not to be performant on GPU so switched to accessing mesh variables as public module variables instead
|
Rebased, passes nightly regression suite with gnu and intel, debug and optimized. bfb with previous, except block tests with 2 blocks/core fail, as expected. |
* ocean/develop: New ocean mesh structure with GPU replication #496
mark-petersen
left a comment
Passes the following in E3SM:
PET_Ln9.T62_oQU240.GMPAS-IAF.cori-knl_gnu
PEM_Ln9.T62_oQU240.GMPAS-IAF.cori-knl_intel
PET_Ln3.T62_oEC60to30v3wLI.GMPAS-DIB-IAF-ISMF.cori-knl_intel
New ocean mesh structure with GPU replication

This PR brings in a new mpas-source submodule with changes only to the ocean core. It adds a mesh structure with all current mesh pool data and replicates that data on a GPU (or other accelerator) for the duration of the simulation to reduce the need for data transfers of static mesh data. It is a critical piece of infrastructure needed for future GPU modifications being staged. In a later change, this structure was replaced by public module variables, since a large user-defined type did not perform well on GPUs. Mesh variables can now be accessed just by "use"-ing this module.

This module also has a revised index naming in which nCells, nCellsSolve and nCellsArray are replaced by nCellsAll, nCellsOwned, and nCellsHalo(n), where the latter differs from nCellsArray by including only the index of each halo depth and no longer including nCellsOwned as the first entry. Similar changes were introduced for Edges and Vertices.

NOTE: this change requires exactly one block per MPI task and will abort if more than one block per MPI task is attempted. Any multi-block tests will fail.

This has been tested on Summit in a QU240 test case and is bit-for-bit since the structure itself is not yet being used anywhere. However, I have verified the data is correctly transferred to the GPU accelerator on Summit.

See MPAS-Dev/MPAS-Model#496 by @philipwjones

[BFB]
New ocean mesh structure with GPU replication MPAS-Dev#496
This change adds a mesh structure with all current mesh pool data and replicates that data on a GPU (or other accelerator) for the duration of the simulation to reduce the need for data transfers of static mesh data. It is a critical piece of infrastructure needed for future GPU modifications being staged. In a later change, this structure was replaced by public module variables instead since a large user-defined type did not perform well on GPUs. Mesh variables can now be accessed just by "use"-ing this module.
This module also has a revised index naming in which nCells, nCellsSolve and nCellsArray are replaced by nCellsAll, nCellsOwned, and nCellsHalo(n), where the latter differs from nCellsArray by including only the index of each halo depth and no longer including nCellsOwned as the first entry. Similar changes were introduced for Edges and Vertices.
NOTE: this change requires only one block per MPI task and will abort if more than one block per MPI task is attempted. Any multi-block tests will fail.
This has been tested on Summit in a QU240 test case and is bit-for-bit since the structure itself is not yet being used anywhere. However, I have verified the data is correctly transferred to the GPU accelerator on Summit.
[b4b]