I have a pandas DataFrame with a column fert_Rate holding fertility rates. I want to add a new column that holds these values as categories instead of numbers: instead of 1.0, 2.5, 4.0 I want something like (low, medium, high). In R I would have written something like this:

attach(mydata)
mydata$fertcat[fert_Rate > 3.5] <- "High"
mydata$fertcat[fert_Rate > 2 & fert_Rate <= 3.5] <- "Medium"
mydata$fertcat[fert_Rate <= 2] <- "Low"
detach(mydata)

Is there a similar way to do this in Python, or should I just loop over the column to create it?

1 Answer

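Use pd.cut to bin the values into labelled intervals. By default each bin includes its right edge, so the bin edges below match the R conditions (<= 2 is Low, <= 3.5 is Medium).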

>>> import pandas as pd
>>> df = pd.DataFrame({'fert_Rate': [1, 2, 3, 3.5, 4, 5]})
>>> df.assign(fertility=pd.cut(df['fert_Rate'],
...                            bins=[0, 2, 3.5, 999],
...                            labels=['Low', 'Medium', 'High']))
   fert_Rate fertility
0        1.0       Low
1        2.0       Low
2        3.0    Medium
3        3.5    Medium
4        4.0      High
5        5.0      High
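
One caveat with fixed outer edges such as 0 and 999: a value outside them (including exactly 0, since the left edge is excluded by default) comes back as NaN. A minimal sketch of a variant with open-ended edges follows, checked against a literal translation of the R conditions. The column names fertcat and fertcat_masks and the 'Unknown' default are illustrative choices, not part of the original question.

```python
import numpy as np
import pandas as pd

# Sample data; the extra 0 shows a value that bins=[0, 2, 3.5, 999] would leave as NaN.
df = pd.DataFrame({'fert_Rate': [0, 1, 2, 3, 3.5, 4, 5]})

# Open-ended outer edges so every finite value lands in some bin.
# With the default right=True each bin is (left, right],
# so 2 falls in Low and 3.5 falls in Medium, as in the R code.
df['fertcat'] = pd.cut(df['fert_Rate'],
                       bins=[-np.inf, 2, 3.5, np.inf],
                       labels=['Low', 'Medium', 'High'])

# Literal translation of the three R conditions, for comparison.
rate = df['fert_Rate']
df['fertcat_masks'] = np.select(
    [rate > 3.5, (rate > 2) & (rate <= 3.5), rate <= 2],
    ['High', 'Medium', 'Low'],
    default='Unknown')  # rows matching no condition, e.g. missing values

print(df)
```

The pd.cut column is an ordered categorical, so sorting and comparisons follow Low < Medium < High, whereas the np.select column holds plain strings.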

1 Comment

Perfect! This is exactly what I needed. Thanks!
