Reformatting Excel spreadsheet in Python

Question

Hi everyone, I received data in an Excel (xls) spreadsheet that is formatted as in the first table, illustrated above.

I am attempting to rearrange this data into the format shown in the table just below. Any help would be greatly appreciated.

Thanks much.

It's on topic. I'm asking for help on how to reformat the data using Python.
– COCO, Commented Dec 16, 2015 at 18:27
@Cilyan wants you to follow the question asking guide, to which he should have really posted a link in his comment.
– inspectorG4dget, Commented Dec 16, 2015 at 18:30

inspectorG4dget · Accepted Answer · 2015-12-16 18:30:23Z

1

First, save it to a .csv file

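import csv

curr = []
with open('file.csv') as infile, open('path/to/output', 'w') as fout:
    outfile = csv.writer(fout)
    for area, pop10, pop20, pop50 in csv.reader(infile):
        if not curr or curr[0] != area:
            # New area: write out the previous merged row (if any), then start a new one
            if curr:
                outfile.writerow(curr)
            curr = [area, pop10, pop20, pop50]
            continue

        # Same area as the previous row: fill in whichever columns this row provides
        if pop10: curr[1] = pop10
        if pop20: curr[2] = pop20
        if pop50: curr[3] = pop50

    # Write out the last merged row
    if curr:
        outfile.writerow(curr)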

answered Dec 16, 2015 at 18:30

inspectorG4dget


6 Comments

COCO Over a year ago

Starting to make sense. Are area, pop10, pop20, pop50 tuples being returned from the reader object? Or are these being substituted for the field names?

inspectorG4dget Over a year ago

@COCO: it seems that you are asking me about tuple assignment

COCO Over a year ago

Inspector, this works in csv. Thanks. Is there any way to perform the reformat in .xls, using xlwt?

inspectorG4dget Over a year ago

@COCO: there's probably a way to do this with xlwt, but I'm not very well versed with that. Sorry, but this is the best I can come up with right now.
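For what it's worth, here is a rough sketch of how the same merge might look when staying in .xls. It assumes the xlrd and xlwt packages are installed, that the data sits on the first sheet, and that the area is in the first column. The names merge_rows and reformat_xls and the sheet name "merged" are made up for illustration; treat this as a starting point, not a tested solution for your file.

```python
def merge_rows(rows):
    """Collapse consecutive rows that share the same first cell,
    keeping every non-empty cell (xlrd returns '' for empty cells)."""
    merged = []
    for row in rows:
        row = list(row)
        if merged and merged[-1][0] == row[0]:
            # Same area as the previous row: copy over the filled-in cells
            for i, value in enumerate(row):
                if value != "":
                    merged[-1][i] = value
        else:
            # New area: start a new output row
            merged.append(row)
    return merged


def reformat_xls(in_path, out_path):
    # Imported here so merge_rows can be used without these packages
    import xlrd
    import xlwt

    # Read every row of the first sheet as a list of cell values
    sheet = xlrd.open_workbook(in_path).sheet_by_index(0)
    rows = [sheet.row_values(r) for r in range(sheet.nrows)]

    # Write the merged rows to a new workbook
    book = xlwt.Workbook()
    out = book.add_sheet("merged")  # hypothetical sheet name
    for r, row in enumerate(merge_rows(rows)):
        for c, value in enumerate(row):
            out.write(r, c, value)
    book.save(out_path)
```

The comparison is against the empty string and not plain truthiness, so a legitimate value of 0 in a cell is not mistaken for a blank.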

COCO Over a year ago

Inspector, so if I'm understanding correctly, the tuples here are the column names. So technically, if I had a spreadsheet with more columns, I could do something like for a,b,c,d,e,f,g in csv.reader(file)?
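To make the tuple assignment point concrete: csv.reader yields each row as a list of strings, and the names in the for loop are just variables that the row's values are unpacked into, one per column. They are not field names. A small sketch, assuming Python 3 and a made-up two-line sample (the header text and values are invented for illustration):

```python
import csv
import io

# In-memory stand-in for a CSV file with four columns
sample = io.StringIO("AREA,POP10,POP20,POP50\nA,100,,\n")
rows = list(csv.reader(sample))

# The header is an ordinary row; the reader does not treat it specially
header = rows[0]

# Unpacking assigns one value per name, left to right
area, pop10, pop20, pop50 = rows[1]

# The number of names must match the number of columns exactly
try:
    a, b, c = rows[1]
    unpack_error = None
except ValueError as exc:
    unpack_error = exc
```

So yes, a seven-column file could be read with seven names, but every row must then have exactly seven cells, otherwise the loop stops with a ValueError.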


Martin Valgur · Accepted Answer · 2015-12-16 18:45:27Z

1

You can do this pretty succinctly using Pandas:

import pandas as pd
dataframe = pd.read_excel("in.xlsx")
merged = dataframe.groupby("AREA").sum()
merged.to_excel("out.xlsx")
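As a quick way to see what the groupby step does without touching a file, here is a sketch on a small in-memory frame. The column names POP10, POP20, POP50 and the numbers are assumptions made up for illustration; only "AREA" comes from the snippet above. It uses first() as an alternative to sum(): first() takes the first non-missing value in each column per group, which behaves the same for this layout but also works for text columns and leaves a group with no value at all as missing instead of 0.

```python
import pandas as pd

# Hypothetical layout: each area is spread over several rows,
# with one population column filled in per row
df = pd.DataFrame({
    "AREA": ["A", "A", "A", "B", "B", "B"],
    "POP10": [100, None, None, 400, None, None],
    "POP20": [None, 200, None, None, 500, None],
    "POP50": [None, None, 300, None, None, 600],
})

# One row per area, taking the first non-missing value in each column
merged = df.groupby("AREA").first()
```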

answered Dec 16, 2015 at 18:45

Martin Valgur


1 Comment

COCO Over a year ago

Apparently pandas is not in my out-of-the-box installation, and I need admin permission to escalate the matter and get it installed. Boo.

COCO · Accepted Answer · 2015-12-16 22:08:59Z

So, if the CSV has 11 columns where 'AREA' is the second column, would the code be:

def CompressRow(in_csv,out_file):
    curr = []
    with open(in_csv) as infile, open(out_file, 'w') as fout:
        outfile = csv.writer(fout)
        for a,b,c,d,e,f,g,h,i,j,k in csv.reader(infile):
            if curr and curr[1] != b:
                outfile.writerow(curr)
                curr = [a,b,c,d,e,f,g,h,i,j,k]
                continue

            if a: curr[0] = a
            if c: curr[2] = c
            if d: curr[3] = d
            if e: curr[4] = e
            if f: curr[5] =f
            if g: curr[6]=g
            if h: curr[7]=h
            if i: curr[8]=i
            if j: curr[9]=j
            if k: curr[10]=k

#execute CompressRow(in_csv,out_file)

I tried executing it and it gives me

if a: curr[0]=a

 IndexError: list assignment index out of range
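The error most likely comes from the very first row: curr is still an empty list at that point, so the test "curr and curr[1] != b" is false, the code falls through to "curr[0] = a", and assigning into an empty list raises the IndexError. One possible way around it, sketched under the assumptions of Python 3 and a key column at index 1 (the names compress_rows, compress_csv and key_index are made up for illustration), is to start a new row whenever there is no current row yet, loop over the columns instead of naming all eleven, and write the last row after the loop:

```python
import csv


def compress_rows(rows, key_index=1):
    """Merge consecutive rows that share the same value in column
    key_index, keeping every non-empty cell."""
    merged = []
    curr = None
    for row in rows:
        row = list(row)
        if not row:
            continue  # skip blank lines
        if curr is None or curr[key_index] != row[key_index]:
            # First row, or a new key: store the finished row and start over
            if curr is not None:
                merged.append(curr)
            curr = row
            continue
        # Same key: copy over whichever cells this row fills in
        for i, value in enumerate(row):
            if value:
                curr[i] = value
    # The last merged row is still pending when the loop ends
    if curr is not None:
        merged.append(curr)
    return merged


def compress_csv(in_csv, out_file, key_index=1):
    with open(in_csv, newline="") as infile, \
            open(out_file, "w", newline="") as fout:
        csv.writer(fout).writerows(compress_rows(csv.reader(infile), key_index))
```

Because the inner loop works on positions, the same function handles 4 columns or 11 without changes; only key_index needs to point at the 'AREA' column.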
