Improve time and memory of plot_initial_state#664

Merged
mark-petersen merged 0 commit into MPAS-Dev:ocean/develop from
xylar:ocean/fix_plot_initial_state
Aug 27, 2020

@xylar
Collaborator

@xylar xylar commented Aug 25, 2020

This merge converts plot_initial_state to use xarray with chunking so that it can take advantage of dask parallelism. It also uses xarray's version of histograms. Plotting the largest COMPASS mesh (ARM) goes from timing out or running out of memory, even in 3-hour jobs, to running in 15 seconds and using about 1% of the memory of a Grizzly compute node.

closes #663

@xylar
Collaborator Author

xylar commented Aug 25, 2020

Testing

So far, I have tested only on ARM on a Grizzly compute node, and only with an already existing initial_state.nc. I have not yet tested other test cases, login nodes, or the full COMPASS workflow.

Comment on lines 32 to 33
Collaborator Author

This breaks the data set into "chunks" of 32k cells or edges per "chunk". Chunks are distributed across multiple threads (36 on a Grizzly compute node) whenever a computation like min(), max(), xarray.plot.hist(), etc. is performed.
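For readers new to chunking, here is a minimal sketch of the idea on synthetic data. The variable name `temperature`, the array sizes, and reading "32k" as 32768 are assumptions for illustration, not taken from the script itself.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for initial_state.nc; sizes and variable name are assumptions.
n_cells, n_levels = 100_000, 4
rng = np.random.default_rng(0)
ds = xr.Dataset(
    {
        "temperature": (
            ("Time", "nCells", "nVertLevels"),
            rng.normal(size=(1, n_cells, n_levels)),
        )
    }
)

# Split along nCells into blocks of 32768 cells; the last block is smaller.
chunk_size = 32768  # assumed meaning of "32k"
ds = ds.chunk({"nCells": chunk_size})

# Reductions stay lazy until .values is requested; dask then evaluates the
# chunks with its default threaded scheduler.
t_min = ds.temperature.min().values
t_max = ds.temperature.max().values

# Number of blocks along the nCells dimension (dimension index 1).
n_chunks = len(ds.temperature.chunks[1])
```

With 100,000 cells this gives four blocks along `nCells`, three full ones and a shorter remainder.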

Collaborator Author

I always define the index variables right away to be their zero-based python variants.
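A small sketch of that convention, assuming a 1-based index array named `maxLevelCell` (an MPAS-style name used here only as an example) and a made-up 3x3 field:

```python
import numpy as np
import xarray as xr

# Hypothetical 1-based (Fortran-style) index array, as it might be stored on file.
ds = xr.Dataset({"maxLevelCell": ("nCells", np.array([3, 1, 2]))})

# Convert once, up front, so everything downstream uses Python indexing.
maxLevelCell = ds.maxLevelCell - 1

# Example use: pick the deepest valid level in each column.
field = xr.DataArray(
    np.arange(9.0).reshape(3, 3), dims=("nCells", "nVertLevels")
)
bottom = field.isel(nVertLevels=maxLevelCell)
```

Because the indexer shares the `nCells` dimension with `field`, `isel` selects one level per cell rather than forming an outer product.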

Collaborator Author

This is the key: the xarray version of hist is way better, it seems.
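As a rough illustration of the xarray histogram call, the sketch below uses synthetic data with some NaN entries standing in for masked values. The array size and the `nEdges` dimension name are placeholders.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no display needed
import matplotlib.pyplot as plt
import numpy as np
import xarray as xr

rng = np.random.default_rng(0)
values = rng.normal(size=10_000)
values[::10] = np.nan  # pretend every tenth entry is masked out
var = xr.DataArray(values, dims="nEdges")

fig, ax = plt.subplots()
# Extra keyword arguments such as bins and log are forwarded to matplotlib's hist.
# xarray drops NaN entries before binning.
counts, bin_edges, _ = var.plot.hist(ax=ax, bins=100, log=True)
n_valid = int(var.count())
plt.close(fig)
```

The bin counts add up to the number of non-NaN entries, so masked values never reach the histogram.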

@xylar xylar force-pushed the ocean/fix_plot_initial_state branch from 8e4a855 to 8f5310a Compare August 25, 2020 09:33
Comment on lines 60 to 61
Collaborator Author

The xarray versions of min and max know to ignore masked (NaN) entries.
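A tiny example of the difference, on made-up values:

```python
import numpy as np
import xarray as xr

da = xr.DataArray([2.0, np.nan, -1.0, 5.0], dims="nCells")

# xarray reductions default to skipping NaN for floating-point data.
lo = float(da.min())
hi = float(da.max())

# A plain NumPy reduction on the same values propagates the NaN instead.
raw = np.min(da.values)
```

Here `lo` and `hi` come from the valid entries only, while `raw` is NaN.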

Comment on lines 82 to 83
Collaborator Author

By making these masks into xarray.DataArrays, we can use them to mask other data arrays with where().

Collaborator Author

Select time index 0 of the given variable, then mask it by putting NaNs for invalid cells.
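The masking pattern can be sketched on a tiny made-up mesh. The names `temperature`, `maxLevelCell`, `vertIndex` and `cellMask` are illustrative assumptions, and the way the mask is built here is only one plausible approach.

```python
import numpy as np
import xarray as xr

# Hypothetical tiny mesh: 3 cells, 4 vertical levels, 1 time slice.
n_cells, n_levels = 3, 4
ds = xr.Dataset(
    {
        "temperature": (
            ("Time", "nCells", "nVertLevels"),
            np.arange(12.0).reshape(1, n_cells, n_levels),
        ),
        "maxLevelCell": ("nCells", np.array([4, 2, 1])),  # 1-based on file
    }
)

maxLevelCell = ds.maxLevelCell - 1  # zero-based
vertIndex = xr.DataArray(np.arange(n_levels), dims="nVertLevels")

# Broadcasting by dimension name gives a boolean DataArray over
# nVertLevels and nCells that is True for valid levels.
cellMask = vertIndex <= maxLevelCell

# Select time index 0, then replace invalid entries with NaN.
var = ds.temperature.isel(Time=0).where(cellMask)
```

Since `where()` aligns by dimension name, the mask does not need the same dimension order as the variable it is applied to.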

@xylar xylar force-pushed the ocean/fix_plot_initial_state branch from 8f5310a to 588f322 Compare August 25, 2020 09:37
var = ncfile.variables[varName][0, :, :][edgeMask]
plt.hist(var, bins=100, log=True)
var = ds[varName].isel(Time=0).where(edgeMask)
maxRx1Edge = var.max().values
Collaborator Author

Compute the max Haney number after masking.

Comment on lines 128 to 130
plt.tight_layout(pad=4.0)

plt.savefig(args.output_file_name, bbox_inches='tight', pad_inches=0.1)
Collaborator Author

The layout of the figure is a lot better with these settings, taken from MPAS-Analysis climatology map plots.
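To try these layout settings in isolation, here is a self-contained sketch with a hypothetical 2x2 grid of histograms. The figure size, panel contents and output path are placeholders; only the `tight_layout` and `savefig` arguments come from the diff above.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no display needed
import matplotlib.pyplot as plt
import numpy as np

fig = plt.figure(figsize=(8, 6))
for index in range(4):
    ax = fig.add_subplot(2, 2, index + 1)
    data = np.random.default_rng(index).normal(size=1000)
    ax.hist(data, bins=50, log=True)
    ax.set_xlabel(f"variable {index}")  # placeholder label
    ax.set_ylabel("count")

# Generous padding between panels, then trim the outer whitespace on save.
plt.tight_layout(pad=4.0)

out_path = os.path.join(tempfile.mkdtemp(), "initial_state.png")
plt.savefig(out_path, bbox_inches="tight", pad_inches=0.1)
plt.close(fig)
```

The `pad` value is in units of the font size, so `pad=4.0` leaves room for axis labels between panels, while `bbox_inches='tight'` crops the saved image to the drawn content plus `pad_inches`.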

@xylar
Collaborator Author

xylar commented Aug 25, 2020

Here's the plot for ARM:
[Image: initial_state]

@mark-petersen
Contributor

Wow, this is fantastic, and serves as a great example of how to convert a post-processing code to xarray and dask. Thank you, @xylar! I'll test this with the upcoming batch of COMPASS changes.

Contributor

@mark-petersen mark-petersen left a comment

Confirmed that this ran correctly in an EC60to30. Plots look correct. Only took 12s to run! Thanks again!

@mark-petersen mark-petersen merged commit ea99a41 into MPAS-Dev:ocean/develop Aug 27, 2020
@xylar xylar deleted the ocean/fix_plot_initial_state branch August 27, 2020 17:30
mark-petersen added a commit that referenced this pull request Aug 27, 2020
Merge PR #563 'xylar/ocean/remove_jigsaw_to_mpas' into ocean/develop
Merge PR #662 'xylar/ocean/fix_broken_compass_tests' into ocean/develop
Merge PR #664 'xylar/ocean/fix_plot_initial_state' into ocean/develop
jonbob added a commit to E3SM-Project/E3SM that referenced this pull request Aug 27, 2020
…3800)

Update mpas-source: compass only

This PR brings in a new mpas-source submodule with changes only to the
ocean core. All of the changes are to the COMPASS testing code that is
not used in E3SM, so this PR should have no effect on E3SM. The COMPASS
commits are:
* MPAS-Dev/MPAS-Model#563 'xylar/ocean/remove_jigsaw_to_mpas'
* MPAS-Dev/MPAS-Model#662 'xylar/ocean/fix_broken_compass_tests'
* MPAS-Dev/MPAS-Model#664 'xylar/ocean/fix_plot_initial_state'

[BFB]
jonbob added a commit to E3SM-Project/E3SM that referenced this pull request Aug 28, 2020
Update mpas-source: compass only

This PR brings in a new mpas-source submodule with changes only to the
ocean core. All of the changes are to the COMPASS testing code that is
not used in E3SM, so this PR should have no effect on E3SM. The COMPASS
commits are:
* MPAS-Dev/MPAS-Model#563 'xylar/ocean/remove_jigsaw_to_mpas'
* MPAS-Dev/MPAS-Model#662 'xylar/ocean/fix_broken_compass_tests'
* MPAS-Dev/MPAS-Model#664 'xylar/ocean/fix_plot_initial_state'

[BFB]