Comments on "Celebrating the first year of IPython logging" — On Clouds, Poems, Python and more... (blog by Gökhan Sever)

Anonymous — 2010-06-15:
GÖKHAN, you still updating your blog?

- Ed from over the pond

Anonymous — 2010-06-02:
...forgot to say, but to process all the individual files at once into a single output file, try:

    $ cat *.log | perl -ne 'print unless $s{$_}++' > filename

Anonymous — 2010-06-02:
Output unique lines sans reorder:

    $ perl -ni.orig -e 'print unless $s{$_}++' filename

I like Python, but Perl is handy to have around...

Thanks for the info!

Gökhan Sever — 2010-04-21:
Thanks RL. This is the solution that I have been looking for. My history file has shrunk from ~38K lines down to ~14K, with replicates removed and without losing its original order.

Ronan Lamy — 2010-04-21:
It's actually easy: iterate over the file, keep a set of the lines you've seen so far, and write only those you haven't:

    seen = set()
    with open('input.txt') as input:
        with open('output.txt', 'w') as out:
            for line in input:
                if line not in seen:
                    seen.add(line)
                    out.write(line)

Gökhan Sever — 2010-04-18:

    a = open("history", "r").readlines()
    b = set(a)

does the same as yours. However, neither with your solution nor with the set could I write the result preserving the order I see on screen in IPython:

    h = open("new", "w")
    h.writelines(D)  # or h.writelines(b)

They show the same and write the same, but in a different order than what is listed. It would be nice to preserve the order in the original file.

Benjamin — 2010-04-18:
Duplicate line numbers, quick & dirty, considering 'older' lines as duplicates:

    D = dict()
    dupes = list()

    for n, line in enumerate(open('pattern.txt', 'r').readlines()):
        if line in D:  # 'line in D' rather than D.get(line), which is falsy for index 0
            dupes.append(D[line])
        D[line] = n
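On modern Python (3.7+), where plain dicts are guaranteed to preserve insertion order, the order-preserving de-duplication discussed in the comments above can be condensed into `dict.fromkeys` — a minimal sketch on hypothetical sample lines, not code from the thread:

```python
# Order-preserving de-duplication via dict.fromkeys: keys keep their
# first-insertion order (guaranteed since Python 3.7), and duplicate
# lines are collapsed to their first occurrence.
lines = ["x = 1\n", "print(x)\n", "x = 1\n", "y = 2\n", "print(x)\n"]  # hypothetical sample
unique_lines = list(dict.fromkeys(lines))
# unique_lines == ["x = 1\n", "print(x)\n", "y = 2\n"]
```

The same idea applies to a history file, e.g. writing `dict.fromkeys(open('history'))` out with `writelines`; filenames here are placeholders.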