Read CSV without string formatting in python

Question

I have a CSV file and I would like to read it cell by cell so that I can write it into Excel. I am using csv.reader and enumerating the result so that I can put the values into the corresponding cells in Excel.

With the current code, once I enumerate, the values turn into strings. If I write to Excel with sheet.write(rowi,coli,value), all cells are formatted as text. I can't have this, because I need to sum columns afterward and they need to be treated as numbers.

For example, my text file will have: 1, a, 3, 4.0, 5, 6, 7

After first enumeration, the first row: (0, '1, a, 3, 4.0, 5, 6, 7')

After second enumeration, first column of first row: (0, 0, '1')

QUESTION: How can I read this csv file to yield (0, 0, 1) (etc.)?

Here's some code I'm working with:

import csv, xlwt

with open('file.csv', 'rb') as csvfile:
    data = csv.reader ((csvfile), delimiter=",")
    wbk= xlwt.Workbook()
    sheet = wbk.add_sheet("file")


    for rowi, row in enumerate(data):
        for coli, value in enumerate(row):
            sheet.write(rowi,coli,value)
            #print(rowi,coli,value) gives (rowi, coli, 'value')

why not open the .csv in excel and save as an excel file...? i feel like you can make a VB macro to do that 10x faster
– pseudonym117, Commented Jun 19, 2014 at 18:06
Perhaps the OP is trying to automate this process over many, many CSV files? Will a VB macro be able to do this?
– ericmjl, Commented Jun 19, 2014 at 18:10
There are many csv files. I want to stick within python as this is just part of a larger automation that needs to be all-inclusive.
– MothraDactyl, Commented Jun 19, 2014 at 18:22
Which columns are you attempting to sum, exactly? I'm not sure what you're trying to do with this setup.
– benjwadams, Commented Jun 19, 2014 at 20:22

David · Accepted Answer · 2014-06-19 19:21:00Z

import csv, xlwt

# 'rb' is the Python 2 mode for the csv module; on Python 3 use open('file.csv', newline='')
with open('file.csv', 'rb') as csvfile:
    data = csv.reader(csvfile, delimiter=",")
    wbk = xlwt.Workbook()
    sheet = wbk.add_sheet("file")

    for rowi, row in enumerate(data):
        for coli, value in enumerate(row):
            sheet.write(rowi, coli, value)

    wbk.save("workbook_file")

Even though print(rowi,coli,value) shows 'value', the cell in the outputted file should show it without quotes.

If your data is in the format 1, 2, 3 and not 1,2,3 include this after your for coli, value in enumerate(row): line:

value = value.lstrip(" ")
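As an alternative to stripping each value by hand, csv.reader accepts a skipinitialspace flag that ignores whitespace directly after each delimiter. A minimal sketch, using an in-memory string (the name sample_text is made up for illustration) in place of file.csv:

```python
import csv
import io

# Hypothetical sample standing in for the contents of file.csv.
sample_text = "1, a, 3, 4.0, 5, 6, 7\n"

# skipinitialspace=True drops whitespace that directly follows the delimiter.
reader = csv.reader(io.StringIO(sample_text), delimiter=",", skipinitialspace=True)
rows = list(reader)

print(rows[0])  # ['1', 'a', '3', '4.0', '5', '6', '7']
```

Note that the values are still strings at this point; the flag only removes the leading spaces.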

edited Jun 19, 2014 at 19:21

answered Jun 19, 2014 at 18:09

4 Comments

MothraDactyl Over a year ago

Sorry for the misleading example, but the values are a mix of types including text, dates, floats, and ints. I just need to write them as they are, not wrap them with '' so they're interpreted as strings

David Over a year ago

@user3757554 sheet.write() only takes one input. See my edit. Hit the star if it works for you :)

MothraDactyl Over a year ago

I only want to write the value, as is, into the sheet. I say I want to read the csv as (0,0,1) instead of (0,0,'1') for example. But I want to then use this to write into the sheet at cell(0,0) the value 1, not the text '1'

David Over a year ago

@user3757554 Sorry about that, I skimmed the post and didn't realize you were outputting to excel. See edit.

Serge Ballesta · Accepted Answer · 2014-06-19 19:27:21Z

Well, I think Python's csv module is still lacking a crystal ball... More seriously, a CSV file carries no indication of the type of each value: integer, float, string, or date. By default, the reader turns each row into a list of strings.

If you want some columns to be integers, you can add a list of booleans to your script. Say you have 4 columns and the third is an integer:

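# True marks the columns that should be converted to int
int_col = [False, False, True, False]
...
for rowi, row in enumerate(data):
    for coli, value in enumerate(row):
        # index the list with [] (it is not callable)
        val = int(value) if int_col[coli] else value
        sheet.write(rowi, coli, val)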
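When the numeric columns are not known in advance, one option is to attempt the conversion cell by cell and fall back to the original string. The sketch below assumes a made-up helper named coerce that tries int, then float. Dates are left as text, and the result would be passed to sheet.write in place of the raw value.

```python
def coerce(value):
    """Return value as an int or float when it parses as one, else unchanged.

    Hypothetical helper for illustration; dates and other text stay as strings.
    """
    for convert in (int, float):
        try:
            return convert(value)
        except ValueError:
            continue
    return value


row = ['1', 'a', '3', '4.0', '5', '6', '7']
converted = [coerce(v) for v in row]
print(converted)  # [1, 'a', 3, 4.0, 5, 6, 7]
```

One caveat with this approach: float() also accepts strings such as 'nan' and 'inf', so text cells with those exact contents would be converted too.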

You can also try to guess which columns are integers by reading n rows (for example n = 10) and treating a column as integer if all n of its values parse as integers.

Or you can even imagine a two-pass operation: the first pass determines the type of each column and the second does the inserts.
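The two-pass idea could look roughly like the sketch below. It assumes the whole file fits in memory as a list of rows, and the names column_is_int and sample_text are made up for illustration. Only integer detection is shown; floats and dates would need their own checks.

```python
import csv
import io


def column_is_int(rows):
    """First pass: for each column index, report whether every value parses as int."""
    flags = []
    for row in rows:
        # Grow the flag list if this row is wider than any seen so far.
        while len(flags) < len(row):
            flags.append(True)
        for coli, value in enumerate(row):
            try:
                int(value)
            except ValueError:
                flags[coli] = False
    return flags


# Hypothetical sample data standing in for file.csv.
sample_text = "1,a,3,4.0\n2,b,5,6.5\n"
rows = list(csv.reader(io.StringIO(sample_text)))

int_col = column_is_int(rows)

# Second pass: convert only the columns flagged as integer.
typed_rows = [
    [int(v) if int_col[coli] else v for coli, v in enumerate(row)]
    for row in rows
]
print(int_col)     # [True, False, True, False]
print(typed_rows)  # [[1, 'a', 3, '4.0'], [2, 'b', 5, '6.5']]
```

In the original script, the second pass would call sheet.write(rowi, coli, val) for each converted value instead of building typed_rows.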

benjwadams · Accepted Answer · 2014-06-19 20:20:36Z

I find Python's standard library functions a bit lacking for processing CSV files. I prefer to work with pandas when possible.

import xlwt
from pandas.io.parsers import read_csv

df = read_csv('file.csv')
#number the columns sequentially
df.columns = [i for i, e in enumerate(df.columns)]
#unstack the columns to make 2 indices plus a column, make row come before col,
#sort row major order, and then unset the indices to get a DataFrame
newDf = df.unstack().swaplevel(0,1).sort_index().reset_index()
#rename the cols to reflect the types of data
newDf.columns = ['row', 'col', 'value']
#write to excel
newDf.to_excel('output.xls', index=False)

This will also keep the row and column numbers as integer values. I tried it on an example CSV file, and row and col were both integer-valued, not strings.
