Skip to content

Conversation

@phofl
Copy link
Member

@phofl phofl commented Nov 22, 2022

This is something we have to fix before we can address the issue

Its also significantly faster

       before           after         ratio
     [80527f48]       [b73beee0]
                      <49830>   
-        68.3±1ms      2.26±0.04ms     0.03  multiindex_object.Putmask.time_putmask_all_different
-      72.4±0.6ms      2.20±0.03ms     0.03  multiindex_object.Putmask.time_putmask

expected = MultiIndex.from_tuples([right[0], right[1], left[2]])
tm.assert_index_equal(result, expected)

def test_putmask_keep_dtype(self, any_numeric_ea_dtype):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are the two test cases also covering for the all-nan case in #49830 (comment) or do you think it's not necessary?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's about losing the dtype, so yes this covers the all nan case. But this does not fix the issue yet

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The variable might be a misnomer then any_numeric_ea_dtype (if it covers all nans, this would also include non EA dtype).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It covers the case in a sense that the current tests represent the all nan case as well, I’ll add a specific test when actually fixing the issue anyway

zip(subset.levels, self.levels, self.codes)
):
new_elements = value_level.difference(level)
new_level = level.append(new_elements)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can the two lines above be replaced with new_level = level.union(value_level)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, was thinking about duplicates when implementing this, but levels are unique…

@phofl phofl mentioned this pull request Nov 23, 2022
5 tasks
@phofl
Copy link
Member Author

phofl commented Nov 23, 2022

failure unrelated

@mroeschke mroeschke added this to the 2.0 milestone Nov 24, 2022
@mroeschke mroeschke merged commit 2236346 into pandas-dev:main Nov 24, 2022
@mroeschke
Copy link
Member

Thanks @phofl

mliu08 pushed a commit to mliu08/pandas that referenced this pull request Nov 27, 2022
* BUG: MultiIndex.putmask losing ea dtype

* Fix typing

* Add asv

* Simplify and add whatsnew
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants