How to collapse dataset and create new variables?

Question

I have a dataset with multiple rows of data for each subject. This dataset is very long, and each person has dozens of rows of each session they are in. I want to collapse this dataset into a more manageable format. The current dataset looks something like this:

name	session	personal vote	party vote	Voted with party?
John	114	yea	nay	no
John	114	yea	yea	yes
John	114	nay	nay	yes
John	115	yea	yea	yes
John	116	yea	nay	no
John	116	yea	yea	yes
Tim	114	nay	yea	no
Tim	115	nay	yea	no
Tim	115	nay	nay	yes
Tim	116	yea	nay	no

I want to use this data to create a new dataframe where each person gets one row for each session. In this new dataset I want to create two new variables. The first, votecount, is just a raw count of the number of votes that person had in that session. The second new variable, party loyalty, should indicate what proportion of the person's votes were aligned with the how the party voted. It would look something like this:

name	session	votecount	party loyalty
John	114	3	.66
John	115	1	1
John	116	2	.5
Tim	114	1	0
Tim	115	2	.5
Tim	116	1	0

Any help would be greatly appreciated!

Maël · Accepted Answer · 2023-10-05 14:50:22Z

With dplyr, use n to count the number of votes, and mean to get the loyalty measures, grouped by name and session:

library(dplyr)
df |> 
  summarise(votecount = n(),
            party_loyalty = mean(`Voted with party?` == "yes"), 
            .by = c(name, session))

#   name session votecount party_loyalty
# 1 John     114         3     0.6666667
# 2 John     115         1     1.0000000
# 3 John     116         2     0.5000000
# 4  Tim     114         1     0.0000000
# 5  Tim     115         2     0.5000000
# 6  Tim     116         1     0.0000000

reproducible data:

df <- read.table(h=T,text="name session 'personal vote' 'party vote'    'Voted with party?'
John    114 yea nay no
John    114 yea yea yes
John    114 nay nay yes
John    115 yea yea yes
John    116 yea nay no
John    116 yea yea yes
Tim 114 nay yea no
Tim 115 nay yea no
Tim 115 nay nay yes
Tim 116 yea nay no", check.names = FALSE)

Collectives™ on Stack Overflow

How to collapse dataset and create new variables?

1 Answer 1

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

Comments

Your Answer

Sign up or log in

Post as a guest

Related