Aggregate rows with string values in R

Question

I have a dataframe df that contains only string values. I need to aggregate its rows by id and session and fill in the NA values. My original dataframe has 50 columns; this is just an example. You can assume that, for each combination of id and session, the non-NA values within a column (string1 or string2) are all the same.

id <- c('a', 'a', 'a', 'b', 'b', 'b')
session <- c('s1', 's1', 's1', 's2', 's2', 's2')
string1 <- c('first_string1', NA, 'first_string1', NA, 'first_string3', NA)
string2 <- c(NA, 'second_string2', 'second_string2', 'second_string4', NA, NA)
df <- data.frame(id, session, string1, string2)

df

  id session       string1        string2
1  a      s1 first_string1           <NA>
2  a      s1          <NA> second_string2
3  a      s1 first_string1 second_string2
4  b      s2          <NA> second_string4
5  b      s2 first_string3           <NA>
6  b      s2          <NA>           <NA>

The final dataframe should look like this:

  id session       string1        string2
1  a      s1 first_string1 second_string2
2  b      s2 first_string3 second_string4

I have tried using the aggregate function, but I can't figure out how to get it working.

What is the logic by which first_string3 is associated with second_string4, given that neither are actually adjacent to each other anywhere in the starting data frame?
– Tim Biegeleisen, Commented Jul 18, 2019 at 11:14

score 2 · Accepted Answer · 2019-07-18 12:18:14Z

With aggregate you can do something like this, where you include a function that removes NAs and finds unique rows while aggregating:

# Group by the id and session columns of df, keeping the unique non-NA value per group
aggregate(df[c("string1", "string2")],
          by = list(id = df$id, session = df$session),
          FUN = function(x) unique(na.omit(x)))

#### OUTPUT ####

  id session       string1        string2
1  a      s1 first_string1 second_string2
2  b      s2 first_string3 second_string4
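Since the real data has 50 columns, it may help to avoid naming each value column by hand. The following is a minimal sketch rather than a definitive solution: `collapse_group`, `key_cols`, and `value_cols` are hypothetical names introduced here, and the `id` vector is assumed from the printed data frame. Unlike `unique(na.omit(x))`, the helper returns NA when a group has no non-missing value and stops when a group holds conflicting values, which is an extra check that goes beyond what the question asks for.

```r
# Example data from the question (id is assumed from the printed output).
id      <- c("a", "a", "a", "b", "b", "b")
session <- c("s1", "s1", "s1", "s2", "s2", "s2")
string1 <- c("first_string1", NA, "first_string1", NA, "first_string3", NA)
string2 <- c(NA, "second_string2", "second_string2", "second_string4", NA, NA)
df <- data.frame(id, session, string1, string2, stringsAsFactors = FALSE)

# Hypothetical helper: reduce one column within one group to a single value.
collapse_group <- function(x) {
  vals <- unique(x[!is.na(x)])
  if (length(vals) == 0) return(NA_character_)  # group contains only NAs
  if (length(vals) > 1) {
    stop("conflicting values in group: ", paste(vals, collapse = ", "))
  }
  as.character(vals)
}

key_cols   <- c("id", "session")
value_cols <- setdiff(names(df), key_cols)  # every non-key column

# A data frame is a list, so it can be passed directly as `by`.
result <- aggregate(df[value_cols], by = df[key_cols], FUN = collapse_group)
result
```

Returning a single value per group keeps every result column a plain vector, whereas a function that returns zero-length output for some groups would leave `aggregate` with a list column.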

Base R's merge is another, perhaps slightly easier to understand, option:

unique(na.omit(merge(df[c("id", "session", "string1")],
                     df[c("id", "session", "string2")],
                     by = c("id", "session")
                     )))

#### OUTPUT #### 

  id session       string1        string2
1  a      s1 first_string1 second_string2
2  b      s2 first_string3 second_string4
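To see why the merge approach pairs first_string3 with second_string4 even though they never share a row, the steps can be run one at a time. This is only an illustrative sketch; `left`, `right`, `joined`, and `complete` are names introduced here, and the `id` vector is assumed from the printed data frame.

```r
# Example data from the question (id is assumed from the printed output).
id      <- c("a", "a", "a", "b", "b", "b")
session <- c("s1", "s1", "s1", "s2", "s2", "s2")
string1 <- c("first_string1", NA, "first_string1", NA, "first_string3", NA)
string2 <- c(NA, "second_string2", "second_string2", "second_string4", NA, NA)
df <- data.frame(id, session, string1, string2, stringsAsFactors = FALSE)

left  <- df[c("id", "session", "string1")]
right <- df[c("id", "session", "string2")]

# Within each (id, session) group, every string1 row is paired with every
# string2 row: 3 x 3 = 9 rows per group, 18 rows in total.
joined <- merge(left, right, by = c("id", "session"))
nrow(joined)

# Drop pairings where either side is NA: 4 remain for a/s1, 1 for b/s2.
complete <- na.omit(joined)
nrow(complete)

# The remaining pairings within a group are identical, so unique() leaves
# one row per group.
result <- unique(complete)
result
```

Because the intermediate table grows with the square of the group size and each additional column needs another merge, this approach probably suits small data better than the 50-column case.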

edited Jul 18, 2019 at 12:18

answered Jul 18, 2019 at 11:40

user10191355

Humpelstielzchen · Accepted Answer · 2019-07-18 12:04:20Z

1

Another option is:

library(dplyr)

df %>%
  group_by(id, session) %>%
  summarise_at(vars(starts_with("string")), ~unique(na.omit(.)))

# A tibble: 2 x 4
# Groups:   id [2]
  id    session string1       string2       
  <chr> <chr>   <chr>         <chr>         
1 a     s1      first_string1 second_string2
2 b     s2      first_string3 second_string4
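In dplyr 1.0.0 and later, the scoped verbs such as `summarise_at()` are superseded by `across()`. A roughly equivalent sketch, assuming a recent dplyr is installed and taking the `id` vector from the printed data frame, might look like this; `.groups = "drop"` returns an ungrouped tibble instead of one still grouped by id.

```r
library(dplyr)

# Example data from the question (id is assumed from the printed output).
id      <- c("a", "a", "a", "b", "b", "b")
session <- c("s1", "s1", "s1", "s2", "s2", "s2")
string1 <- c("first_string1", NA, "first_string1", NA, "first_string3", NA)
string2 <- c(NA, "second_string2", "second_string2", "second_string4", NA, NA)
df <- data.frame(id, session, string1, string2, stringsAsFactors = FALSE)

result <- df %>%
  group_by(id, session) %>%
  # Apply the same "unique non-NA value" reduction to every string* column.
  summarise(across(starts_with("string"), ~ unique(na.omit(.x))),
            .groups = "drop")
result
```

This relies on the question's guarantee that each group has exactly one distinct non-NA value per column.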

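A base R solution using the formula interface (na.action = na.pass keeps the rows that contain NA, so the function can drop missing values column by column):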

aggregate(cbind(string1, string2) ~ id + session, data = df, function(x) unique(na.omit(x)), na.action = na.pass)

  id session       string1        string2
1  a      s1 first_string1 second_string2
2  b      s2 first_string3 second_string4

edited Jul 18, 2019 at 12:04

answered Jul 18, 2019 at 11:53

Humpelstielzchen

denisafonin · Accepted Answer · 2019-07-18 11:49:59Z

0

A bit clunky, but works:

library(tidyverse)

df %>%
  group_by(id, session) %>%
  summarise(string1 = paste(unique(string1[!is.na(string1)]), collapse = ""),
            string2 = paste(unique(string2[!is.na(string2)]), collapse = ""))

Output:

  id    session string1       string2       
  <fct> <fct>   <chr>         <chr>         
1 a     s1      first_string1 second_string2
2 b     s2      first_string3 second_string4

edited Jul 18, 2019 at 11:49

answered Jul 18, 2019 at 11:37

denisafonin
