0

Import Text

(character set ,seperator options, and Fields in Apache Open Office)

The original file I have is a csv file. I want to change the character set ,seperator options, and Fields using python code and save it into an excel file. Is it possible to auto fill in these options? Or else what language can I use to automate this action? ( I cant do this in excel because excel will delete some of my special character.)

1 Answer 1

2

You can use xlsxwriter module to make XLSX files: https://xlsxwriter.readthedocs.io/index.html

Assume you have a CSV file with encoding CP1251 and you want to get XLSX file with encoding UTF8. Here is how it can be done:

import xlsxwriter # pip3 install xlsxwriter

# get data from the csv file with non utf8 encoding
with open('data_cp1251.csv', 'r', encoding='cp1251') as f:
    data = f.read()

# convert the data into 2d array
table = [row.split(",") for row in data.split("\n")]

# create xlsx file (utf8 encoding by default)
ss = xlsxwriter.Workbook('data.xlsx')
s = ss.add_worksheet()

# fill the xlsx file with the 2d array
for row_num, row in enumerate(table):
    for col_num, cell in enumerate(row):
        s.write(row_num, col_num, cell)

ss.close() # here you get the 'data.xlsx' file

For simply cases it works even if the source CSV file has tab \t separators. But it's need to test on your real data.

And as far as I can tell all fields in the new xlsx file are 'text fields' by default. You can change their formats any time, see here: https://xlsxwriter.readthedocs.io/format.html#format

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.