
[Image: two tables, the data as received (first) and the desired layout (below it)]

Hi everyone, I received data in an Excel (.xls) spreadsheet that is formatted as in the first table, illustrated above.

I am attempting to rearrange this data into the format shown in the table just below it. Any help would be greatly appreciated.

Thanks much.

  • It's on topic. I'm asking for help on how to reformat the data using Python. Commented Dec 16, 2015 at 18:27
  • @Cilyan wants you to follow the question-asking guide, which he really should have linked in his comment. Commented Dec 16, 2015 at 18:30

3 Answers

First, save the spreadsheet as a .csv file. Then merge consecutive rows that share the same area:

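import csv

curr = []
with open('file.csv') as infile, open('path/to/output', 'w') as fout:
    outfile = csv.writer(fout)
    for area, pop10, pop20, pop50 in csv.reader(infile):
        if not curr or curr[0] != area:
            # first line, or a new area: write the finished row (if any) and start over
            if curr:
                outfile.writerow(curr)
            curr = [area, pop10, pop20, pop50]
            continue

        # same area as the previous line: fill in whichever cells are non-empty
        if pop10: curr[1] = pop10
        if pop20: curr[2] = pop20
        if pop50: curr[3] = pop50

    # the loop never writes the last area, so do it here
    if curr:
        outfile.writerow(curr)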

6 Comments

Starting to make sense. Are area, pop10, pop20, pop50 tuples being returned from the reader object, or are they being substituted for the field names?
@COCO: it seems that you are asking me about tuple assignment.
Inspector, this works with CSV, thanks. Is there any way to perform the reformat in .xls, using xlwt?
@COCO: there's probably a way to do this with xlwt, but I'm not very well versed in it. Sorry, but this is the best I can come up with right now.
Inspector, so if I'm understanding correctly, the tuples here are the column names. So technically, if I had a spreadsheet with more columns, I could do something like for a,b,c,d,e,f,g in csv.reader(file)?
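To make the tuple-assignment point concrete: csv.reader yields each row as a list of strings, and the names in the for statement are ordinary variables bound to that row's cells by position, not field names. So yes, a seven-name loop works on a seven-column file, but only if every row has exactly seven cells; otherwise the unpacking raises ValueError. A minimal sketch, assuming Python 3 (for the star-unpacking) and using made-up sample data:

```python
import csv
import io

# Made-up sample data standing in for the real file
sample = io.StringIO("A1,100,,\nA1,,200,\nB2,5,6,7\n")

rows = list(csv.reader(sample))

# Each row is a plain list of strings; empty cells come back as ''
first = rows[0]

# Unpacking binds names to positions; the names themselves are arbitrary
area, pop10, pop20, pop50 = first

# With many columns, star-unpacking avoids naming every one (Python 3 only)
key, *rest = rows[2]
```

Here `rest` is just an illustrative name for "all the remaining cells", which is handy once the column count grows.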

You can do this pretty succinctly using Pandas:

import pandas as pd
dataframe = pd.read_excel("in.xlsx")
merged = dataframe.groupby("AREA").sum()
merged.to_excel("out.xlsx")

1 Comment

Apparently pandas is not in my out-of-the-box installation, and I need admin permission to escalate the matter and get it installed. Boo.
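If pandas can't be installed, roughly the same group-and-sum can be sketched with only the standard library, working from a CSV export rather than the .xlsx. This assumes a header row containing an AREA column and numeric (or blank) cells everywhere else; the POP10/POP20/POP50 headers and the helper name `sum_by_area` are hypothetical. Blank cells are counted as 0, and areas come out in first-seen order rather than sorted.

```python
import csv
import io
from collections import OrderedDict

def sum_by_area(lines, key="AREA"):
    """Sum every non-key column per distinct value of `key`."""
    reader = csv.DictReader(lines)
    totals = OrderedDict()  # keeps areas in first-seen order
    for row in reader:
        bucket = totals.setdefault(row[key], {})
        for name, cell in row.items():
            if name == key:
                continue
            # Blank cells count as 0
            value = float(cell) if cell else 0.0
            bucket[name] = bucket.get(name, 0.0) + value
    return totals

# Made-up sample data; the header names are assumptions
sample = io.StringIO("AREA,POP10,POP20,POP50\nA1,100,,\nA1,,200,\nB2,5,6,7\n")
result = sum_by_area(sample)
```

For a real file, pass the open file object instead of the `io.StringIO` sample.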

So, if the CSV has 11 columns and 'AREA' is the second column, would the code be:

def CompressRow(in_csv,out_file):
    curr = []
    with open(in_csv) as infile, open(out_file, 'w') as fout:
        outfile = csv.writer(fout)
        for a,b,c,d,e,f,g,h,i,j,k in csv.reader(infile):
            if curr and curr[1] != b:
                outfile.writerow(curr)
                curr = [a,b,c,d,e,f,g,h,i,j,k]
                continue

            if a: curr[0] = a
            if c: curr[2] = c
            if d: curr[3] = d
            if e: curr[4] = e
            if f: curr[5] = f
            if g: curr[6] = g
            if h: curr[7] = h
            if i: curr[8] = i
            if j: curr[9] = j
            if k: curr[10] = k

#execute CompressRow(in_csv,out_file)

I tried executing it and it gives me

if a: curr[0]=a

 IndexError: list assignment index out of range
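The IndexError comes from the very first row: curr is still an empty list, so the `if curr and ...` test is false, execution falls through, and `curr[0] = a` assigns into a list with no elements. The final group is also never written, because rows are only flushed when the key changes. A minimal sketch of a version that handles both and works for any column count, assuming every row has the same number of cells (`compress_rows` and `key_index` are hypothetical names, and the sample data is made up):

```python
import csv
import io

def compress_rows(rows, key_index=1):
    """Merge consecutive rows sharing the same key cell, keeping non-empty cells."""
    merged = []
    curr = None
    for row in rows:
        if curr is None or curr[key_index] != row[key_index]:
            # First row, or a new key: emit the finished row and start a new one
            if curr is not None:
                merged.append(curr)
            curr = list(row)
            continue
        # Same key: copy over any non-empty cell, whatever the column count
        for i, cell in enumerate(row):
            if cell:
                curr[i] = cell
    if curr is not None:
        merged.append(curr)  # the final group is not flushed by the loop
    return merged

# Made-up four-column sample with the key in the second column
sample = io.StringIO("x,A1,100,\n,A1,,200\ny,B2,5,6\n")
out = compress_rows(csv.reader(sample))
```

The merged rows can then be written out with `csv.writer(fout).writerows(out)`.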
