ENH: add string bin support for multivariate histograms#31276
JosephMehdiyev wants to merge 40 commits
This also adds string arguments for `bins` in `histogramdd`. `histogram2d` will change in the same way.
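For context, a minimal sketch of what string bins could look like from the user's side. It does not use this PR's code: it assumes the string is applied to each dimension independently, and emulates that with the existing `np.histogram_bin_edges` and `np.histogramdd`. The helper name `histogramdd_str` is hypothetical.

```python
import numpy as np


def histogramdd_str(sample, bins="auto"):
    """Hypothetical helper: emulate string bins for N-D data.

    Assumption: the estimator named by `bins` is applied to each
    column separately; the PR may choose edges differently.
    """
    sample = np.asarray(sample, dtype=float)
    # One edge array per dimension, using the existing 1-D estimators.
    edges = [np.histogram_bin_edges(col, bins=bins) for col in sample.T]
    # histogramdd already accepts a sequence of explicit edge arrays.
    return np.histogramdd(sample, bins=edges)


rng = np.random.default_rng(0)
points = rng.normal(size=(500, 2))
counts, edges = histogramdd_str(points, bins="auto")
```

Because the edges span each column's minimum and maximum, every sample lands in some bin, so `counts.sum()` equals the number of rows in `points`.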
|
It seems I focused on the old PR so much that I missed many things (other than the stubs). Please review once the PR is marked ready for review. |
Also, this fixes some obvious bugs that were introduced by the last modification of `histogramdd`.
|
@jorenham (afaik you are the author of these stubs) is there a specific reason for keeping separate overload stubs? For example, could this one be deleted:

    @overload
    def histogram2d(
        x: _ArrayLike1DNumber_co,
        y: _ArrayLike1DNumber_co,
        bins: _BinKind | Sequence[Sequence[int] | _BinKind],
        range: _ArrayLike2DFloat_co | None = None,
        density: bool | None = None,
        weights: _ArrayLike1DFloat_co | None = None,
    ) -> _Histogram2D[np.int_]: ...

since the one below already handles that case:

    def histogram2d[ScalarT: _Number_co](
        x: _ArrayLike1DNumber_co,
        y: _ArrayLike1DNumber_co,
        bins: _BinKind | _ArrayLike1D[ScalarT] | Sequence[_ArrayLike1D[ScalarT] | _BinKind],
        range: _ArrayLike2DFloat_co | None = None,
        density: bool | None = None,
        weights: _ArrayLike1DFloat_co | None = None,
    ) -> _Histogram2D[ScalarT]: ...

I have no experience with .pyi files; I am mirroring my knowledge of .hpp and .cpp templates, FYI. |
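As general background for the question above, a toy sketch of how `typing.overload` lets a type checker pick a return type from the annotation of a single parameter. The function `edges_from` and both of its signatures are hypothetical and far simpler than NumPy's stubs; nothing here reproduces the actual `histogram2d` overloads.

```python
from collections.abc import Sequence
from typing import TypeVar, overload

T = TypeVar("T")


@overload
def edges_from(bins: int) -> list[float]: ...
@overload
def edges_from(bins: Sequence[T]) -> list[T]: ...


def edges_from(bins):
    # Single runtime implementation shared by both overloads.
    if isinstance(bins, int):
        # A bin count: build evenly spaced float edges on [0, 1].
        return [i / bins for i in range(bins + 1)]
    # Explicit edges: hand them back with their element type intact.
    return list(bins)


from_count = edges_from(4)          # a checker infers list[float]
from_edges = edges_from([0, 2, 5])  # a checker infers list[int]
```

When two overloads accept disjoint argument types, as `int` and `Sequence[T]` do here, deleting one means callers that matched it either fall through to a later overload or stop type-checking. The runtime behavior is unchanged either way, since only the final undecorated definition executes.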
The difference here is in the So these overloads are distinct and non-overlapping. ... or at least, that's what they should be. You added |
|
Not sure the failing tests are PR related. |
|
Hey, can someone give feedback on this PR? |
|
Other than minor fixes, is the |
|
I had Claude take a look at the stubs, and it actually found some real issues:
|
|
Changes are because of #20215 (comment) |
_get_bin_edges() on histogramdd for consistency.|
stubs: The only changes to the stubs are that strings can no longer appear inside a sequence (i.e. "auto" is fine, but not ["auto"] or ["auto", 2]) and that bins cannot be complex values. The complex-value restriction is unrelated to this PR, but it might as well be fixed here.
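The rule described above is a static typing rule, but it can be mirrored at runtime to make the accepted and rejected shapes concrete. The following is only an illustrative sketch: `bins_allowed` and `_entry_ok` are hypothetical helpers, they are not part of NumPy or of this PR, and the set of accepted non-string forms (an integer, or a sequence of real numbers or of real-valued sequences) is an assumption.

```python
import numbers
from collections.abc import Sequence


def _entry_ok(entry):
    """Check one element of a `bins` sequence (hypothetical rule)."""
    if isinstance(entry, str):
        return False  # strings are not allowed inside a sequence
    if isinstance(entry, numbers.Real):
        return True   # per-dimension bin count or a single real edge
    if isinstance(entry, Sequence):
        # Explicit edges for one dimension: real numbers only.
        return all(isinstance(v, numbers.Real) for v in entry)
    return False      # complex values and anything else are rejected


def bins_allowed(bins):
    """Return True if `bins` has a shape the described stubs would accept."""
    if isinstance(bins, str):
        return True   # a bare estimator name such as "auto"
    if isinstance(bins, numbers.Integral):
        return True
    if isinstance(bins, Sequence):
        return all(_entry_ok(b) for b in bins)
    return False


results = {
    "bare string": bins_allowed("auto"),
    "string in list": bins_allowed(["auto"]),
    "mixed list": bins_allowed(["auto", 2]),
    "complex": bins_allowed([1 + 2j]),
    "counts": bins_allowed([3, 4]),
}
```

The string check comes first in both helpers because `str` is itself a `Sequence` and would otherwise be treated as a list of edges.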
|
|
cc @jorenham see the short comment above about the stub changes; feel free to review whenever (if) you want. I will continue to update the documentation and tests, and clean up some code too. |
PR summary
fixes #20215
also generalizes the existing histogram bin-width algorithms to arrays (to support multiple dimensions)
the majority of them will raise a not-implemented error when D>1
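To make "generalizing to arrays" concrete, here is a minimal sketch that evaluates a 1-D bin-width rule once per dimension by reducing along the sample axis. It uses the common 1-D form of Scott's rule as the example; which estimators the PR actually supports for D>1, and how, is not stated here, so the helper `scott_bin_widths` and the per-column treatment are assumptions.

```python
import numpy as np


def scott_bin_widths(sample):
    """Hypothetical helper: Scott's rule evaluated once per dimension.

    Assumption: the 1-D rule h = sigma * (24 * sqrt(pi) / n) ** (1/3)
    is applied to each column independently.
    """
    sample = np.asarray(sample, dtype=float)
    n = sample.shape[0]
    # axis=0 reduces over samples, leaving one std (and width) per column.
    return (24.0 * np.sqrt(np.pi) / n) ** (1.0 / 3.0) * sample.std(axis=0)


# Two columns with the same shape but a tenfold difference in scale.
data = np.column_stack(
    [np.linspace(0.0, 1.0, 100), np.linspace(0.0, 10.0, 100)]
)
widths = scott_bin_widths(data)
```

The result has one width per column, and because the rule is linear in the standard deviation, the second column's width is ten times the first.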
Other comments
I am not content with the choice for "auto", but it should be practical enough.
AI Disclosure
I used Claude as a sanity check on my fixes. All modifications were made solely by my own decisions, and were either typed manually or copy-pasted from the documentation in the related PR above. Claude was used extensively on some parts of the code, especially on `_get_bin_edges`. Other than that, I do not remember, to be honest.