-
-
Notifications
You must be signed in to change notification settings - Fork 11.9k
BUG: Fix argsort vs sort in Masked arrays #8678
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
numpy/ma/core.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a choice we need to make here. Right now (and before this patch), np.ma.sort sometimes returns an ma.array, and sometimes an ndarray.
Should we change it to always promote to ma.array, for consistency with the other functions? (Which do this as of #8665 )
f498c2c to
34a02d7
Compare
numpy/ma/core.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In theory we could do a little better here, by implementing the in-place sort in terms of the non-in place. Right now (and before this patch), the latter does a redundant copy. So either way, best left for another PR, I think
numpy/ma/core.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
endwith needs to be after fill_value to not break the api
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changing axis default also probably breaks api
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not changing it breaks liskov substitution though. The default values should really match ndarray. Fair call on swapping argument order
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it would be nice if they matched but unfortunately that ship has sailed, this default argument is part of our api and we can't change it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@juliantaylor: Is there a deprecation path for changing it then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding named argument order, perhaps we should enforce that endwith is a boolean, just to help guard against someone (stupidly) passing unnamed arguments, and assuming they match the order of sort?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see a way we could deprecate the axis argument value besides adding an argsort2 and deprecating the old one.
sort has a different order? that is unfortunate ...
I guess we could try to get away with that change, usage of positional arguments for the ones at the end are probably not very common in the wild.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, argsort is documented as having a default argument of -1, even though the actual default is None. So can we update the API to match the documentation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Re deprecation:
if axis is np._NoValue:
if self.ndim > 1:
warn("Unlike np.argsort and the documentation for this function, the"
"default axis argument is None, not -1. This only matters for 2 or"
"higher-dimensional arrays. If this is intended, pass it explicitly to squash this warning")
axis = None34a02d7 to
b5c4f9a
Compare
|
@juliantaylor: OK, clearly changing the axis default does not belong in this PR. I've created an issue about that at #8701. I think I agree with you that using positional arguments is unlikely in the wild for |
numpy/ma/core.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't really like these shortcuts very much, the explicit meshgrid is easier to read imo
there isn't really an advantage to using ix_ is there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this really a shortcut though? The two functions are independant, with neither calling the other. Meshgrid doesn't do what we need by default, but ix_ does.
Furthermore, ix_ is used as an indexer in its examples, whereas meshgrid is used to evaluate functions. The former use is what we want here.
|
@eric-wieser - Overall, this looks good. The main possible issue I see is that in adding the additional All this means it is a bit of an API change, so definitely needs an entry in the release notes (and arguably a note to the mailing list). |
I don't really want a feature needing discussion to hold up a bugfix.
@juliantaylor seemed to think we could get away with it (hidden comment) as well
I'll update this accordingly then
Will do |
|
OK, sounds good. I do think this has gone beyond a bug fix (ie., MAINT as well), which I why I set the milestone to 1.13. (I think a bug fix would only touch the default fill value, but do not really think this is worth implementing). |
06c40c3 to
37bb166
Compare
Actually, I'm backtracking on this. While I think that all the Release notes added, should be good to go. |
mhvk
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only one annoying tidbit, otherwise good to go
Previously, these had different rules for unmasking values, and even different arguments to decide how to do so. Fixes numpy#8664
37bb166 to
ee90efc
Compare
|
Done, I think - just waiting on tests. |
|
This fixes #8664, at the cost of a behavioural change in
argsort-- masked values are now sorted to the end by default, whereas before they could end up being placed somewhere arbitrary in the middle. This brings the behaviour ofargsortinline withsort.In particular,
argsortwould use the default fill value, which forint8types would be999999 % 256 = 63. This sorted very misleadingly.sortmade the more sensible choice of using the maximum/minimum possible value.This also fixes code duplication between the method and function forms of these operations