I'm reading an .xls file and converting it to a CSV file in Databricks using PySpark. In the .xls file, my input data is a string such as 101101114501700. But after converting it to CSV with pandas and writing it to the data lake folder, the value shows up as 101101114501700.0. My code is given below. Why am I getting the decimal part in the data?
import os
import time

import pandas as pd

for file in os.listdir("/path/to/file"):
    if file.endswith(".xls"):
        filepath = os.path.join("/path/to/file", file)
        filepath_pd = pd.ExcelFile(filepath)
        names = filepath_pd.sheet_names
        # Read every sheet and stack them into a single DataFrame
        df = pd.concat([filepath_pd.parse(name) for name in names])
        # to_csv returns None when given a path, so the result is not assigned
        df.to_csv("/path/to/file" + file.split('.')[0] + ".csv", sep=',', encoding='utf-8', index=False)
print(time.strftime("%Y%m%d-%H%M%S") + ": XLS files converted to CSV and moved to folder")
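One likely cause, though it is not confirmed for this particular file, is that pandas stores the column as float64. That happens, for example, when the cells are numeric in Excel and at least one value is missing, or when concatenating sheets introduces NaN. The sketch below reproduces the effect without an Excel file. The column name `id` and the sample frame are hypothetical stand-ins for the parsed sheets.

```python
import pandas as pd

# Hypothetical stand-in for the parsed sheets: one ID plus one empty cell.
df = pd.DataFrame({"id": [101101114501700, float("nan")]})

# The missing value forces the whole column to float64,
# so to_csv writes the ID with a trailing ".0".
csv_float = df.to_csv(index=False)
print(df["id"].dtype)
print(csv_float)

# One possible workaround: cast to the nullable integer dtype "Int64",
# which keeps whole numbers and still allows missing values.
fixed = df.astype({"id": "Int64"})
csv_int = fixed.to_csv(index=False)
print(csv_int)
```

If the values should be treated as text from the start, another option may be to read them as strings, for example by passing `dtype=str` to `pd.read_excel`. Whether that suits this file depends on how the cells are stored in the workbook.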