@jorisvandenbossche
Member

Closes #34336

This adds tests with the behaviour as it was on 0.25.3 / 1.0.3, and some changes to get back to that behaviour.
(Whether this behaviour is fully "sane", I am not sure.)

@jorisvandenbossche added the Bug, Reshaping (Concat, Merge/Join, Stack/Unstack, Explode), and Sparse (Sparse Data Type) labels on May 23, 2020
@jorisvandenbossche added this to the 1.1 milestone on May 23, 2020
Contributor

@TomAugspurger left a comment

Looks good.

I'm not totally sure about the expected behavior of concat([Sparse, Categorical]). I suspect it was never properly discussed / intentionally implemented. Happy to just match 1.0.3 behavior for now though.

Co-authored-by: Tom Augspurger <TomAugspurger@users.noreply.github.com>
@jorisvandenbossche
Member Author

OK, will go forward with this PR as is (matching the previous behaviour) to unblock the other concat PRs, but will open a set of follow-up issues.

  subtype = dtype._subtype_with_str
- sp_values = astype_nansafe(self.sp_values, subtype, copy=copy)
+ # TODO copy=False is broken for astype_nansafe with int -> float
+ sp_values = astype_nansafe(self.sp_values, subtype, copy=True)
Member Author

Opened #34456 for this (and added a link in the comment)

if is_sparse(arr.dtype) and not is_sparse(dtype):
    # problem case: SparseArray.astype(dtype) doesn't follow the specified
    # dtype exactly, but converts this to Sparse[dtype] -> first manually
    # convert to dense array
Member Author

Opened #34457 for this
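
For illustration, the workaround named in the code comment above (converting to a dense array first) can be sketched roughly as follows. This is a minimal sketch with made-up data, not the code from this PR. What `SparseArray.astype` returns for a plain NumPy dtype depends on the pandas version, so the sketch only prints that result rather than assuming it.

```python
import numpy as np
import pandas as pd

# Hypothetical example data; not taken from the PR.
sparse = pd.arrays.SparseArray([0, 0, 1, 2], fill_value=0)

# Depending on the pandas version, astype with a plain NumPy dtype may give
# back a SparseArray (Sparse[float64]) or a dense ndarray, so just inspect it.
converted = sparse.astype(np.float64)
print(type(converted).__name__, converted.dtype)

# Converting to a dense ndarray first, then casting, yields a plain ndarray
# with exactly the requested dtype.
dense = np.asarray(sparse).astype(np.float64)
print(type(dense).__name__, dense.dtype)
```

Going through `np.asarray` avoids relying on the sparse-specific `astype` semantics, at the cost of materialising the full dense array.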

@jorisvandenbossche
Member Author

And opened #34459 for the general "what should concat(sparse, categorical) do?" question
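
To see what a given pandas version currently does for these combinations, something like the following could be run. It is only a sketch with made-up data; the resulting dtypes are deliberately printed rather than asserted, since they are exactly what is under discussion and may differ between releases.

```python
import pandas as pd

# Hypothetical example data; not taken from the PR or its tests.
sparse = pd.Series(pd.arrays.SparseArray([0, 1, 0], fill_value=0))

others = {
    "int64": pd.Series([3, 4, 5], dtype="int64"),
    "object": pd.Series(["a", "b", "c"], dtype=object),
    "category": pd.Series(["a", "b", "c"], dtype="category"),
}

# Record the dtype that concat picks for each Sparse + other combination.
result_dtypes = {}
for name, other in others.items():
    result = pd.concat([sparse, other], ignore_index=True)
    result_dtypes[name] = result.dtype
    print(f"Sparse + {name} -> {result.dtype}")
```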

  from pandas.core.dtypes.generic import ABCCategoricalIndex, ABCRangeIndex, ABCSeries

- from pandas.core.arrays import ExtensionArray
+ from pandas.core.arrays import ExtensionArray, SparseArray
Contributor

Member Author

Yeah, I already pushed a fix.
I don't really understand why, though, as SparseArray is included in the pandas.core.arrays init, just as ExtensionArray.

@jorisvandenbossche merged commit cc63484 into pandas-dev:master on May 29, 2020
@jorisvandenbossche deleted the concat-sparse-object branch on June 7, 2020 09:37

Development

Successfully merging this pull request may close these issues.

REGR: concat of Sparse with incompatible dtype now gives Sparse[object] instead of object