I have written this date parsing function:
def date_parser(string):
    try:
        date = pd.datetime.strptime(string, "%d/%m/%Y")
    except:
        date = pd.NaT
    return date
and I pass it to pd.read_csv like this:
df = pd.read_csv(os.path.join(path, file),
                 sep=";",
                 encoding="latin-1",
                 keep_default_na=False,
                 na_values=na_values,
                 index_col=False,
                 usecols=keep,
                 dtype=dtype,
                 date_parser=date_parser,
                 parse_dates=dates)
The problem is that one of my date columns ends up with mixed data types:
df[data].apply(type).value_counts()
- <class 'datetime.datetime'>
- <class 'pandas._libs.tslibs.timestamps.Timestamp'>
- <class 'pandas._libs.tslibs.nattype.NaTType'>
I should only have the last two, right?
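
To make the type check easier to reproduce, here is a minimal sketch on a small hand-built Series. The values and the names `s`, `type_counts`, `plain_mask`, and `normalized` are made up for illustration and are not my real data. It also shows how the plain `datetime.datetime` rows could be located, and it assumes (I have not confirmed this on the real column) that `pd.to_datetime` can bring everything to a single datetime64 dtype when all values fit in the Timestamp range.

```python
import datetime

import pandas as pd

# Hypothetical stand-in for the problem column: an object-dtype Series
# holding one value of each type listed above (not the real data).
s = pd.Series(
    [datetime.datetime(2020, 1, 31), pd.Timestamp("2020-02-01"), pd.NaT],
    dtype=object,
)

# Same check as in the question: count the Python type of each element.
type_counts = s.apply(type).value_counts()

# Timestamp subclasses datetime.datetime, so isinstance() would match both.
# An exact type comparison isolates the plain datetime.datetime rows.
plain_mask = s.apply(lambda v: type(v) is datetime.datetime)

# Assumption: every value fits in the Timestamp range, so the whole
# column can be converted to a single datetime64 dtype.
normalized = pd.to_datetime(s)
```

On the real data, `df.loc[plain_mask]` (with the mask built from the affected column) should show which rows were left as plain `datetime.datetime`, which might hint at why they were not converted.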