Closed
Labels
Bug; NA - MaskedArrays (Related to pd.NA and nullable extension arrays)
Description
Pandas version checks
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
import pandas as pd
import numpy as np
x1 = pd.Series([1, 2, None, 4])
x2 = pd.Series([1, 2, np.nan, 4])
x3 = pd.Series([1, 2, pd.NA, 4])
print(x1.to_numpy('int32', na_value=0))
# [1 2 0 4]
print(x2.to_numpy('int32', na_value=0))
# [1 2 0 4]
print(x3.to_numpy('int32', na_value=0))
# Traceback (most recent call last):
# File "<input>", line 14, in <module>
# print(x3.to_numpy('int32', na_value=0))
# File "C:\src\venv\w310\lib\site-packages\pandas\core\base.py", line 535, in to_numpy
# result = np.asarray(self._values, dtype=dtype)
# TypeError: int() argument must be a string, a bytes-like object or a real number, not 'NAType'
print(x3.to_numpy('float64', na_value=0))
# Traceback (most recent call last):
# File "<input>", line 22, in <module>
# print(x3.to_numpy('float64', na_value=0))
# File "C:\src\venv\w310\lib\site-packages\pandas\core\base.py", line 535, in to_numpy
# result = np.asarray(self._values, dtype=dtype)
# TypeError: float() argument must be a string or a real number, not 'NAType'
Issue Description
A missing value in a Series created with None or np.nan can be replaced by calling Series.to_numpy(dtype=, na_value=), but the same call on a Series created with pd.NA raises a TypeError. Both arguments must be specified to trigger the behavior.
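One detail that may help narrow this down: the three Series in the example do not share a dtype. The sketch below only inspects the dtypes. That the object dtype is what sends x3 through the failing np.asarray call is a reading of the traceback above, not something confirmed in this report.

```python
import numpy as np
import pandas as pd

x1 = pd.Series([1, 2, None, 4])
x2 = pd.Series([1, 2, np.nan, 4])
x3 = pd.Series([1, 2, pd.NA, 4])

# Collect the inferred dtype of each Series for comparison.
dtypes = {name: str(s.dtype) for name, s in [("x1", x1), ("x2", x2), ("x3", x3)]}
print(dtypes)
```

x1 and x2 are inferred as float64 with NaN as the missing value, while x3 is inferred as object and holds pd.NA itself.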
Expected Behavior
Since None, np.nan, and pd.NA all represent missing values, all three should behave the same. In the reproducible example above, the first three print statements should report [1 2 0 4], and the fourth ('float64') should report [1. 2. 0. 4.].
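Until the behavior is consistent, a possible workaround is to cast to a nullable integer dtype before converting. This is a minimal sketch, assuming the values are all integers or pd.NA. The choice of "Int64" as the intermediate dtype is my own and is not part of the original report.

```python
import pandas as pd

x3 = pd.Series([1, 2, pd.NA, 4])

# Assumption: casting the object-dtype Series to the nullable "Int64" dtype
# first lets to_numpy apply na_value instead of passing pd.NA to int().
result = x3.astype("Int64").to_numpy("int32", na_value=0)
print(result)
# expected: [1 2 0 4]
```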
Installed Versions
INSTALLED VERSIONS
------------------
commit : 87cfe4e
python : 3.10.6.final.0
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.19043
machine : AMD64
processor : Intel64 Family 6 Model 158 Stepping 13, GenuineIntel
byteorder : little
LC_ALL : None
LANG : None
LOCALE : English_United States.1252
pandas : 1.5.0
numpy : 1.23.3
pytz : 2022.2.1
dateutil : 2.8.2
setuptools : 65.3.0
pip : 22.2.2
Cython : 0.29.32
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.9.1
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : None
IPython : None
pandas_datareader: None
bs4 : None
bottleneck : None
brotli : None
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : 3.6.0
numba : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : None
snappy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
zstandard : None
tzdata : None