0

Suppose we have a DataFrame:

data = {'person_id': ['person_a', 'person_a', 'person_b','person_b', 'person_c','person_c'],
        'categorical_data': ['new', 'new', 'ok', 'bad', 'new', 'bad']}
df = pd.DataFrame(data)

    person_id   categorical_data
0   person_a    new
1   person_a    new
2   person_b    ok
3   person_b    bad
4   person_c    new
5   person_c    bad

I want to expand the categorical data into multiple columns with counts of each category.

We can group by the person id to get counts:

count_categories = df.groupby('person_id')['categorical_data'].value_counts().reset_index(name='count')

person_id   categorical_data    count
0   person_a    new 2
1   person_b    bad 1
2   person_b    ok  1
3   person_c    bad 1
4   person_c    new 1

Then I tried this to create the new columns:

pivoted = count_categories.set_index(['person_id','categorical_data']).unstack('categorical_data')


count
categorical_data    bad new ok
person_id           
person_a    NaN 2.0 NaN
person_b    1.0 NaN 1.0
person_c    1.0 1.0 NaN

This is the form I want, but I'm confused by the MultiIndexing

How can I get rid of the index, or is there a better way to do this? trying reset index yields:

pivoted.reset_index() 

    person_id   count
categorical_data        bad new ok
0   person_a    NaN 2.0 NaN
1   person_b    1.0 NaN 1.0
2   person_c    1.0 1.0 NaN
0

1 Answer 1

0

Code

use crosstab

out = pd.crosstab(df['person_id'], df['categorical_data'])

out

categorical_data  bad  new  ok
person_id                     
person_a            0    2   0
person_b            1    0   1
person_c            1    1   0

Or

out1 = (pd.crosstab(df['person_id'], df['categorical_data'])
          .reset_index()
          .rename_axis(None, axis=1)
)

out1

    person_id   bad new ok
0   person_a    0   2   0
1   person_b    1   0   1
2   person_c    1   1   0

I don't know your exact desired output, if it's not both out and out1, plz provide it.

Sign up to request clarification or add additional context in comments.

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.