Skip to content

Conversation

@mark-petersen
Copy link
Contributor

This is a simple mistake that needs to be fixed. Causes a seg fault when using more than three regions.

@mark-petersen
Copy link
Contributor Author

My apologies for this. When Nils was originally testing this AM, the dimension maxRegionsInGroup was only in the MHT, so we put in a 3. We should have put a comment with a warning.

@mark-petersen
Copy link
Contributor Author

@vanroekel could you test this on your new run on COMPY with the new MOC mask file? It should work.

@vanroekel
Copy link
Contributor

@mark-petersen sure, I will test this. For some reason I can't get on compy right now but will test this as soon as I can get back on.

@milenaveneziani
Copy link
Contributor

hmm, I should have caught this. thanks @mark-petersen.

@mark-petersen
Copy link
Contributor Author

No problem. It was a silly past reviewing oversight on my part.

@mark-petersen
Copy link
Contributor Author

E3SM with this change passed:

./create_test SMS_Ld5.T62_oEC60to30v3.GMPAS-IAF.cori-knl_intel

including a call to the moc analysis member.

@vanroekel
Copy link
Contributor

okay I'm back on compy. testing now.

@vanroekel
Copy link
Contributor

with the change on compy the segfault moves it is now here (bold line below)

         !!!! TRANSECT CALCULATION
         sumTransport = 0.0_RKIND
         do iTransect = 1,transectsInAddGroup
            currentTransect = transectsInGroup(iTransect, transectGroupNumber)
            do iEdge = 1,nEdgesSolve
               c1 = cellsOnEdge(1,iEdge)
               c2 = cellsOnEdge(2,iEdge)
               do k = 1, maxLevelEdgeTop(iEdge)
                  sumTransport(k,iTransect) = sumTransport(k,iTransect) + &
                     transectEdgeMaskSigns(currentTransect,iEdge) &
                     * transectEdgeMasks(currentTransect, iEdge) &
                     *** normalVelocity(k,iEdge)*dvEdge(iEdge) &**
                     * 0.5_RKIND*(layerThickness(k,c1) + layerThickness(k,c2))
               end do
            end do
            do k = 2, nVertLevels
               mocStreamValLatAndDepthRegionLocal(1, k, iTransect) = &
                  mocStreamValLatAndDepthRegionLocal(1, k - 1, iTransect) &
                  + sumTransport(k - 1, iTransect)
            end do
         end do

@vanroekel
Copy link
Contributor

sorry, the bold didn't work, it is line 514, of the moc_streamfunction.F file

@mark-petersen
Copy link
Contributor Author

OK. I'll see if I can reproduce in stand-alone. @milenaveneziani I got the namelist and streams from compy.

@xylar
Copy link
Collaborator

xylar commented Sep 19, 2019

@vanroekel, this is maybe the better way to show one or more relevant lines of code like you were trying to do:
https://github.com/mark-petersen/MPAS-Model/blob/ocean/fix_moc_error/src/core_ocean/analysis_members/mpas_ocn_moc_streamfunction.F#L514
Let me know if you don't know how to do this.

@milenaveneziani
Copy link
Contributor

@mark-petersen: look at line 382. There is another dimension 3 hard wired.

@vanroekel
Copy link
Contributor

@xylar thanks for the suggestion, that is much better. I was able to figured out how to do that now.

@vanroekel
Copy link
Contributor

@mark-petersen and @milenaveneziani I just tested with changing L382 and have a segfault in the same place.

@mark-petersen
Copy link
Contributor Author

OK, I'll test in stand-alone to debug.

@mark-petersen
Copy link
Contributor Author

mark-petersen commented Sep 20, 2019

OK, there were two more problems:

  1. The seg fault mentioned in Fix analysis member MOC region dimension #374 (comment). When we use:
    config_am_mocstreamfunction_normal_velocity_value = 'normalTransportVelocity'
    it tries to access from the 'state' var_struct rather than 'diagnostic' here:
    496 call mpas_pool_get_array(statePool, normalVelocityArrayName, normalVelocity, timeLevel)
    I propose that we always use 'normalTransportVelocity' here for all simulations, and remove this flag.

  2. The StrLen in the new MOC mask file is 1024 rather than 64. This causes all string arrays to be read in as blank. In particular, transectGroupNames and regionGroupNames, so a match is never found, and regionGroupNumber and transectGroupNumber remain zero and can cause a seg fault. It appears StrLen=64 is hardwired within MPAS framework, though I couldn't find it. The simple solution here is to just use StrLen=64. On the current mask file:
    ncks -d StrLen,0,63 in.nc out.nc
    (Thanks @milenaveneziani for pointing this out)

@milenaveneziani
Copy link
Contributor

Thanks @mark-petersen for summarizing these. Are you going to try a standalone test with the changes you propose?

@xylar
Copy link
Collaborator

xylar commented Sep 20, 2019

I realize we don't have time for this now but can we make an issue about trying to fix the hard-coding of StrLen instead of shortening strings? It would be really easy to end up in situations where 64 characters isn't enough for a unique name and chopping strings is obliterating a bunch of provenance.

@mark-petersen
Copy link
Contributor Author

A problem will immediately arise, which is that StrLen is a single dimension for all of MPAS. I'm not sure if it's defined in framework, or if it gets it from the first file it reads in. Regardless, we would have to change the string length on ALL files read into any MPAS simulation, and break backwards compatibility with strings in older files.

A second option would be to change the framework to test the string length on input and alter 64 character strings. Either way, I don't think 64 character strings is limiting enough for that much trouble.

@mark-petersen
Copy link
Contributor Author

@vanroekel I fixed both problems I described above in #374 (comment) and also made the AM more fail safe. You should reduce the mask file to 64 character length with
ncks -d StrLen,0,63 in.nc out.nc
But if we use a 1024 character one by mistake, it will produce write a warning and use the first region and transect, but still run with this file.

I tested in stand-alone with both mask files. @vanroekel please test in your run.

All the AMs deserve more attention, so we can use multiple region groups and transect groups. It's on my list, but this will have to do for today.

@xylar
Copy link
Collaborator

xylar commented Sep 20, 2019

I really disagree about supporting reading the first 64 char of a longer string. I somewhat disagree about 64 char being sufficient but realistically I won't be able to change this...

@vanroekel
Copy link
Contributor

@mark-petersen my testing found another issue here
https://github.com/MPAS-Dev/MPAS-Model/blob/ocean/develop/src/core_ocean/analysis_members/Registry_moc_streamfunction.xml#L72-L74

This array is hardcoded to 3 regions and is causing a segfault in my testing with the new transect file where there are 4 regions. -- The segfault is here
https://github.com/MPAS-Dev/MPAS-Model/blob/ocean/develop/src/core_ocean/analysis_members/mpas_ocn_moc_streamfunction.F#L239

Is there a reason for that array (minMaxLatRegion) to be in the pool and not an allocatable in the AM like the other minMax arrays?

@mark-petersen
Copy link
Contributor Author

I changed the dimension to nRegions.

@vanroekel
Copy link
Contributor

Thanks. With the last changes, I was able to run on compy with the new transect. I'll try post some pictures from the two runs tomorrow.

@vanroekel
Copy link
Contributor

vanroekel commented Sep 23, 2019

Here are some figures of the AMOC from months 1-6 of a G-case
The new transect file
AMOC_newtransect
Here is the original
AMOC_origtransect
Here is the difference (new transect - old transect)
AMOC_new_minus_orig_transect
They are qualitatively similar, with the biggest differences around 35N and at depth. I'm not sure I'd read too much into such a short simulation at depth anyway.

cc'ing @maltrud for his input on the differences as well.

@milenaveneziani
Copy link
Contributor

Thanks @vanroekel !
Interesting pattern between 35 and 40..
Also interesting that the difference at 26N (where we have been focusing in terms of difference old-new in the offline MOC calculation) is not that striking instead.

@vanroekel
Copy link
Contributor

@milenaveneziani looks like the diffs in the NH could be attributed to differences in regions between the files. Here is a picture (red is old file, black is new). The old did not include the med or far north seas.

region_definitions_atl

@milenaveneziani
Copy link
Contributor

yes, that sounds reasonable.
Should we approve of these new transects, then?

I should also check the MOC results for the other 3 transects, to see if the numbers make sense.

@xylar
Copy link
Collaborator

xylar commented Oct 14, 2019

@milenaveneziani, @mark-petersen and @vanroekel, what is the status here? What needs to happen before this can be merged?

@xylar
Copy link
Collaborator

xylar commented Oct 14, 2019

in #374 (comment), @mark-petersen wrote:

The StrLen in the new MOC mask file is 1024 rather than 64. This causes all string arrays to be read in as blank. In particular, transectGroupNames and regionGroupNames...

Two years ago, I made this change in the MPAS Mask creator:
MPAS-Dev/MPAS-Tools#185
No one pointed out any issues this might have at the time. No one has reported any problems using masks created with the tool until now. So it seems like the problem with longer strings may be specific to the MOC analysis member rather than MPAS-Ocean more broadly. That, or maybe the strings in mask files are not being used in other contexts?

Does this change to the mask creator need to be reverted? As I said, I see the 64 character limit as a big problem for provenance but @mark-petersen's suggested NCO trick (ncks -d StrLen,0,63 in.nc out.nc) doesn't seem like a desirable step to have to add to every workflow that uses masks.

@xylar
Copy link
Collaborator

xylar commented Oct 14, 2019

I can easily incorporate a step that truncates strings to 64 characters in the tool that creates the southern MOC transects if the problem is specific to the MOC analysis member. I'll hold off on doing so, however, until it's clear whether or not we want to change this in the mask creator itself instead.

Copy link
Contributor

@vanroekel vanroekel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good and is tested on compy in a G-case. It works for current and new transect/region files.

@mark-petersen
Copy link
Contributor Author

We are merging in all PRs on the E3SM V2 PR list (made in June) first. Then we can merge all these more recent ones in, probably in two weeks.

@milenaveneziani
Copy link
Contributor

I can easily incorporate a step that truncates strings to 64 characters in the tool that creates the southern MOC transects if the problem is specific to the MOC analysis member

@xylar, we would have to check to make sure, but I predict this is a problem with at least the MOC and MHT analysis members. We would also have to check the relatively recent changes made to surfaceAverages and layerWeigthedAverages to do regional averages.
Mind you that this is the first time that we tried to use the new mask files with the analysis members, which is why these problems haven't come up before.

This fixes these:
1. The seg fault mentioned in
MPAS-Dev#374 (comment).
When we use:
`config_am_mocstreamfunction_normal_velocity_value =
'normalTransportVelocity'`
it tries to access from the 'state' var_struct rather than 'diagnostic'
here:
`496          call mpas_pool_get_array(statePool,
normalVelocityArrayName, normalVelocity, timeLevel)`
I propose that we always use 'normalTransportVelocity' here for all
simulations, and remove this flag.

2. The StrLen in the new MOC mask file is 1024 rather than 64. This
causes all string arrays to be read in as blank. In particular,
`transectGroupNames` and `regionGroupNames`, so a match is never found,
and `regionGroupNumber` and `transectGroupNumber` remain zero and can
cause a seg fault.  It appears StrLen=64 is hardwired within MPAS
framework, though I couldn't find it. The simple solution here is to
just use StrLen=64. On the current mask file:
`ncks -d StrLen,0,63 in.nc out.nc`
mark-petersen added a commit that referenced this pull request Oct 23, 2019
This is a simple mistake that needs to be fixed. Causes a seg fault when
using more than three regions.
@mark-petersen mark-petersen merged commit 923a7bd into MPAS-Dev:ocean/develop Oct 23, 2019
@mark-petersen mark-petersen deleted the ocean/fix_moc_error branch October 23, 2019 13:39
ashwathsv pushed a commit to ashwathsv/MPAS-Model that referenced this pull request Jul 21, 2020
This fixes these:
1. The seg fault mentioned in
MPAS-Dev#374 (comment).
When we use:
`config_am_mocstreamfunction_normal_velocity_value =
'normalTransportVelocity'`
it tries to access from the 'state' var_struct rather than 'diagnostic'
here:
`496          call mpas_pool_get_array(statePool,
normalVelocityArrayName, normalVelocity, timeLevel)`
I propose that we always use 'normalTransportVelocity' here for all
simulations, and remove this flag.

2. The StrLen in the new MOC mask file is 1024 rather than 64. This
causes all string arrays to be read in as blank. In particular,
`transectGroupNames` and `regionGroupNames`, so a match is never found,
and `regionGroupNumber` and `transectGroupNumber` remain zero and can
cause a seg fault.  It appears StrLen=64 is hardwired within MPAS
framework, though I couldn't find it. The simple solution here is to
just use StrLen=64. On the current mask file:
`ncks -d StrLen,0,63 in.nc out.nc`
ashwathsv pushed a commit to ashwathsv/MPAS-Model that referenced this pull request Jul 21, 2020
This is a simple mistake that needs to be fixed. Causes a seg fault when
using more than three regions.
mark-petersen added a commit to mark-petersen/MPAS-Model that referenced this pull request Jan 11, 2021
This fixes these:
1. The seg fault mentioned in
MPAS-Dev#374 (comment).
When we use:
`config_am_mocstreamfunction_normal_velocity_value =
'normalTransportVelocity'`
it tries to access from the 'state' var_struct rather than 'diagnostic'
here:
`496          call mpas_pool_get_array(statePool,
normalVelocityArrayName, normalVelocity, timeLevel)`
I propose that we always use 'normalTransportVelocity' here for all
simulations, and remove this flag.

2. The StrLen in the new MOC mask file is 1024 rather than 64. This
causes all string arrays to be read in as blank. In particular,
`transectGroupNames` and `regionGroupNames`, so a match is never found,
and `regionGroupNumber` and `transectGroupNumber` remain zero and can
cause a seg fault.  It appears StrLen=64 is hardwired within MPAS
framework, though I couldn't find it. The simple solution here is to
just use StrLen=64. On the current mask file:
`ncks -d StrLen,0,63 in.nc out.nc`
mark-petersen added a commit to mark-petersen/MPAS-Model that referenced this pull request Jan 11, 2021
This is a simple mistake that needs to be fixed. Causes a seg fault when
using more than three regions.
jonbob pushed a commit to E3SM-Project/E3SM that referenced this pull request Jun 22, 2021
This fixes these:
1. The seg fault mentioned in
MPAS-Dev/MPAS-Model#374 (comment).
When we use:
`config_am_mocstreamfunction_normal_velocity_value =
'normalTransportVelocity'`
it tries to access from the 'state' var_struct rather than 'diagnostic'
here:
`496          call mpas_pool_get_array(statePool,
normalVelocityArrayName, normalVelocity, timeLevel)`
I propose that we always use 'normalTransportVelocity' here for all
simulations, and remove this flag.

2. The StrLen in the new MOC mask file is 1024 rather than 64. This
causes all string arrays to be read in as blank. In particular,
`transectGroupNames` and `regionGroupNames`, so a match is never found,
and `regionGroupNumber` and `transectGroupNumber` remain zero and can
cause a seg fault.  It appears StrLen=64 is hardwired within MPAS
framework, though I couldn't find it. The simple solution here is to
just use StrLen=64. On the current mask file:
`ncks -d StrLen,0,63 in.nc out.nc`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants