Description
I have some old xls files that I am parsing. For some of them, python-calamine returns wrong values in the first two columns.
from python_calamine import CalamineWorkbook

workbook = CalamineWorkbook.from_path(str(path))
for sheet_name in workbook.sheet_names:
    rows_2d = workbook.get_sheet_by_name(sheet_name).to_python(
        skip_empty_area=True,
    )
    # First three cells of the first row
    print(rows_2d[0][0:3])
outputs
[0.001, 22.48, ''].
This is not the right content. If one looks at all fields, it shows that the values in the first two columns are always wrong. They should be mixed types, but calamine returns floats that look like they come from elsewhere in the file.
In comparison with xlrd:
import xlrd

book = xlrd.open_workbook(str(path))
for sh in book.sheets():
    for rx in range(sh.nrows):
        # First three cells of the first row only
        print(sh.row(rx)[0:3])
        break
outputs
[text:'<correct text>', empty:'', empty:'']
which is correct (I replaced the string).
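To narrow down which cells differ without sharing the file, the two readers could be compared cell by cell. The following is only a minimal sketch: `diff_grids`, `read_with_xlrd` and `read_with_calamine` are hypothetical helper names, and it assumes that passing `skip_empty_area=False` keeps calamine's row and column indices aligned with xlrd's. Some mismatches may be harmless type differences (for example dates, which xlrd returns as floats).

```python
from itertools import zip_longest

EMPTY = ""  # assumption: both readers represent an empty cell as ''


def diff_grids(expected, actual):
    """Return (row, col, expected_value, actual_value) for every differing cell.

    Shorter rows and missing rows are padded with EMPTY before comparing.
    """
    mismatches = []
    for r, (row_e, row_a) in enumerate(zip_longest(expected, actual, fillvalue=[])):
        for c, (e, a) in enumerate(zip_longest(row_e, row_a, fillvalue=EMPTY)):
            if e != a:
                mismatches.append((r, c, e, a))
    return mismatches


def read_with_xlrd(path, sheet_name):
    """Read one sheet into a list of rows using xlrd."""
    import xlrd  # imported lazily so diff_grids works without xlrd installed

    sheet = xlrd.open_workbook(str(path)).sheet_by_name(sheet_name)
    return [sheet.row_values(rx) for rx in range(sheet.nrows)]


def read_with_calamine(path, sheet_name):
    """Read one sheet into a list of rows using python-calamine."""
    from python_calamine import CalamineWorkbook  # imported lazily, see above

    sheet = CalamineWorkbook.from_path(str(path)).get_sheet_by_name(sheet_name)
    # Assumption: keeping the empty area preserves the original cell positions.
    return sheet.to_python(skip_empty_area=False)
```

Calling `diff_grids(read_with_xlrd(path, name), read_with_calamine(path, name))` for each sheet would list only the positions and values that differ, which may be easier to share than the file itself.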
When I open the xls in Excel and save it again, this behavior goes away (the file also becomes smaller for some reason). Therefore I cannot "censor" it, because re-saving removes the problem. These are company files, so I cannot upload one here uncensored.
Is there something we can do about this?