2

I have a dataframe with date and id columns. I need to pull out each date and id combination and create a new dataframe for it.

  date           id
2016-05-13       abc
2016-05-13       pqr
2016-05-14       abc
2016-05-14       pqr

ids = list(sorted(set(df['id'])))
Out: ['abc','pqr']

dates = list(sorted(set(df['date'])))
Out: ['2016-05-13','2016-05-14']

for i in range(0,len(ids)):
    df2 = df[(df.date == dates[i]) & (df.id == ids[i])]

The code above produces output (df2) only for matching index positions (first date with first id, second date with second id), but I need output for all pairs. What should I change in the loop?

2 Answers

0

Create a new, empty dataframe with one column per id and one row per date. You can fill it in later.

pd.DataFrame(index=sorted(set(df.date)), columns=sorted(set(df.id)))
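
As a rough sketch of what "fill it in later" could look like, the snippet below builds the empty grid and writes the number of matching rows into each cell. The sample data mirrors the question; the name `grid` and the choice of a row count as the cell value are assumptions made for illustration only. Sorted lists are used so the row and column order is deterministic.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "date": ["2016-05-13", "2016-05-13", "2016-05-14", "2016-05-14"],
    "id": ["abc", "pqr", "abc", "pqr"],
})

# Empty grid: one row per date, one column per id (all cells start as NaN).
grid = pd.DataFrame(index=sorted(set(df["date"])), columns=sorted(set(df["id"])))

# Fill each cell; here the value is simply the number of matching rows
# (an illustrative choice, not something the answer prescribes).
for date in grid.index:
    for id_ in grid.columns:
        grid.loc[date, id_] = len(df[(df["date"] == date) & (df["id"] == id_)])
```

Any per-pair value could be stored in place of the count.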


If you just want the list of combinations:

pd.MultiIndex.from_product([sorted(set(df.id)), sorted(set(df.date))]).tolist()

[('abc', '2016-05-13'),
 ('abc', '2016-05-14'),
 ('pqr', '2016-05-13'),
 ('pqr', '2016-05-14')]
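
Since the question asks for a new dataframe, the same product can also be turned into one directly. This is a minimal sketch, assuming the sample data from the question; the names `pairs` and `combos` are hypothetical.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "date": ["2016-05-13", "2016-05-13", "2016-05-14", "2016-05-14"],
    "id": ["abc", "pqr", "abc", "pqr"],
})

# Cartesian product of the unique ids and dates, with named levels.
pairs = pd.MultiIndex.from_product(
    [sorted(set(df["id"])), sorted(set(df["date"]))],
    names=["id", "date"],
)

# One row per (id, date) combination, as ordinary columns.
combos = pairs.to_frame(index=False)
```

`combos` holds every possible pair, including any that do not occur in `df`.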

-2

To get all (id, date) pairs, you could use itertools.product:

import itertools

for iid, ddate in itertools.product(ids, dates):
    df2 = df[(df.date == ddate) & (df.id == iid)]

2 Comments

itertools is a fine answer, but this snippet doesn't even work. You just reassign some slice of df to the same variable df2 over and over again.
@piRSquared from what I understood, the OP wanted to generate such slice for each pair of id/date - slice which is presumably subject to further processing...
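
One way to address the point raised in these comments is to keep every slice instead of overwriting `df2` on each iteration. The sketch below stores each slice in a dict keyed by its (id, date) pair; the dict name `slices` is an assumption, and the sample data is taken from the question.

```python
import itertools

import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "date": ["2016-05-13", "2016-05-13", "2016-05-14", "2016-05-14"],
    "id": ["abc", "pqr", "abc", "pqr"],
})

ids = sorted(set(df["id"]))
dates = sorted(set(df["date"]))

# Keep one sub-dataframe per (id, date) pair instead of reassigning df2.
slices = {}
for iid, ddate in itertools.product(ids, dates):
    slices[(iid, ddate)] = df[(df["date"] == ddate) & (df["id"] == iid)]
```

Each slice can then be looked up for further processing, for example `slices[("abc", "2016-05-14")]`. Pairs that never occur in `df` map to an empty dataframe.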
