
I have a list of lists that I need to write to a CSV file.

mylist = [['Siddharth', 'Bangalore', '[email protected]'],
          ['Rahul', 'Bangalore', '[email protected]'], ...]  # and so on

This list usually contains some 20,000 to 40,000 entries.
Right now, the only way I know to write them to a CSV file is to iterate over the list and write each row:

import csv

fileObj = open("/home/siddharth/sample.csv", "wb")  # "wb" mode for the csv module on Python 2
csv_file = csv.writer(fileObj)
for item in mylist:
    csv_file.writerow(item)
fileObj.close()

So I just wanted to know: is there a way to write such a list of lists to a CSV file without iterating over each item in the list myself, e.g. by using StringIO?
Or can anyone give me a tip/hint so that I can create my own implementation?

5 Comments
  • If you're looking to improve performance, then that should be your question. Don't assume that not looping is an automatic performance improvement. You can't eliminate the loop. Looping will happen somewhere, even if non-obvious. Attempts to eliminate the loop will likely result in the entire contents of the file being handled in memory at once, which is likely to be even less efficient. Commented Dec 27, 2012 at 13:00
  • Right! thanks a lot for the tip. Commented Dec 27, 2012 at 13:06
  • Run python -m cProfile your_script.py to find out which functions take the most time. Commented Dec 27, 2012 at 13:06
  • @MichaelMior - "the entire contents of the file being handled in memory at once, which is likely to be even less efficient" - where do you get that this is likely to be less efficient? Moving disk i/o to memory to reduce numbers of disk accesses is a common performance optimization. And I don't think 40,000 elements of 40-60 characters each will put much strain on most computer systems today. Commented Dec 27, 2012 at 13:58
  • Python already uses buffered I/O, so disk writes won't happen immediately. And since the goal is to write to file, doing more in memory before beginning the write is likely just to add more of a delay. Commented Dec 27, 2012 at 19:42

3 Answers


There is a writerows method which will write all the rows of an iterable:

csv_file.writerows(the_list)

Whatever you do, there will always be a loop somewhere (either in your code or in the Python library implementation). There is no way around it as you need to look at each item in the list so that you can write them to the file.

In case you're worried about the performance of writing each line to the file separately: Python uses buffered I/O by default, so even if you write the list items one by one in a loop, they won't necessarily be written like that to the file. They will be written in chunks whenever the buffer fills up, which will have better performance. If needed, you can explicitly control the buffer size by using the buffering parameter of open.
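For example, writing the whole list in one call while also controlling the buffer size might look like this (the file name, sample rows, and 1 MiB buffer size are purely illustrative):

```python
import csv

# Made-up sample data for illustration
rows = [['Siddharth', 'Bangalore', 'sid@example.com'],
        ['Rahul', 'Bangalore', 'rahul@example.com']]

# Python 3: newline='' is what the csv docs recommend for csv.writer;
# buffering sets the buffer size in bytes (1 MiB here, an arbitrary choice)
with open('sample.csv', 'w', newline='', buffering=1024 * 1024) as f:
    csv.writer(f).writerows(rows)  # one call; the loop happens inside the csv module
```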


5 Comments

Yeah, as I said, it also uses a for loop to iterate over the list, which is eventually the same thing that I do.
@SiddharthSrivastava: There has to be a loop somewhere, since you need to iterate over the list in order to get its contents.
Yes, right. I was just wondering whether something could be done. I know it sounds odd when I say so, but maybe something can be done.
And thanks for the information regarding buffered I/O. I wasn't aware of it.
This seems to transpose my data - any way to avoid this?

Easy:

csv_file.writerows(mylist)

(Note the s in writerows())

9 Comments

No, if you look at the source code for csv_file.writerows(), you'll find that it also iterates over the list. Just go through it.
@SiddharthSrivastava: Well, how else would it write it to the file? Someone has to do the iteration, so it might as well be Python.
Except that it just moves the loop inside the csv module. There is no way to write a 2D list as CSV without looking at each item at some point (iterating over the list).
Yeah, I know that is the most obvious way. That is why I asked if there is any special approach or workaround, like maybe using StringIO.
@Sebastian, yes, you are right. I was just wondering if we can figure out something :)

These are all lists of strings? Only 40,000 of them? Have you tried this?

with open('outputfile.csv', 'w') as csv_file:
    csv_file.write('\n'.join(map(','.join, my_list)) + '\n')

You are still iterating over the list, but you are doing so via map, which is implemented in C in CPython.
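One caveat worth adding: unlike csv.writer, the plain join above does no quoting, so a field that itself contains a comma will corrupt the output. A minimal illustration (the sample row is made up):

```python
import csv
import io

row = ['Smith, John', 'Bangalore']  # one field contains a comma

plain = ','.join(row)  # naive join: the embedded comma is not escaped

buf = io.StringIO()
csv.writer(buf).writerow(row)  # csv.writer quotes the problematic field

print(plain)           # the joined string reads back as three columns
print(buf.getvalue())  # the quoted version reads back as the original two
```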

