I am trying to become more familiar with purrr rather than using for loops. I have some toy data (from a real problem) where several clinical visit values are stored in one cell. I need to split these into their own columns and reshape the resulting data frame into long format, binding the columns together in the process. The code below does this successfully using a for loop, but I can't get map_dfr to work. I presume I will need to amend the function, which I have tried to no avail. Any help would be appreciated.

library(tidyverse)
# Toy data
dat <-  structure(list(id = 1:6, 
                       var_1 = c("1,1,-11000", "1,1,1", "0,0,0", "1,1,0", "1,1,1", "1,1,1"), 
                       var_2 = c("0,0,-13000", "0,0,0", "-13000,-13000,-13000", "6,4,-13000", "0,0,0", "0,0,0"), 
                       var_3 = c("24,7,-13000", "0,0,0", "-13000,-13000,-13000", "0,0,-13000", "0,0,0", "0,0,0")), 
                  row.names = 1:6, class = "data.frame")

# Separate to wide and convert to long in one step using a function
split_to_long <-  function(col){
  i <-  substr(col, 5, 5)
  temp <-  dat |>
    select("id", {{col}}) |>
    separate_wider_delim({{col}}, ",", too_few = "align_start",
                         names = c(paste0({{col}},"_1"),
                                   paste0({{col}},"_2"),
                                   paste0({{col}},"_3"))) |>
    pivot_longer(2:4,
                 names_to = "visit",
                 values_to = paste0("var_", i),
                 names_prefix = paste0("var_", i, "_"))
  temp
}
# Assemble using loop (only need to keep id and visit data from the first variable)
dat_long <-  split_to_long("var_1")
for(i in 2:3) {
  dat_long <-  cbind(dat_long, split_to_long(paste0("var_", i))[3])
}

# How to assemble using purrr?
map_dfr(dat, split_to_long)
  • fyi, you can get the desired results without any loop, e.g. dat %>% mutate(visit = "1,2,3", .after=id) %>% separate_longer_delim(delim=",", -id) Commented Sep 27, 2024 at 1:51
  • dat %>% separate_longer_delim(delim=",", -id) %>% mutate(visit=row_number(), .by=id) might be better. Commented Sep 27, 2024 at 4:00
  • Thanks for this succinct solution - I didn't notice separate_longer_delim. I'm still interested, however, in a solution to my original question - for learning purposes. Commented Sep 28, 2024 at 0:14
  • The reason map isn't working is that you're using each iteration to update the result of the previous iteration, but each map iteration runs in its own world (i.e., its own function environment). If you wanted to iteratively update the data, you could send each map iteration's output to the parent environment (which is where a for statement does its assignments). For example map(2:3, \(i) { dat_long <<- cbind(dat_long, split_to_long(paste0("var_", i))[3])}) (Note the <<- to reach the parent environment.) Now you can just call dat_long and see that it has changed. Commented Sep 30, 2024 at 23:56
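
For the learning-purposes question, here is a minimal sketch of one way to assemble the result with purrr without `<<-`. Two things seem to trip up `map_dfr(dat, split_to_long)`: mapping over a data frame hands the function each column's values (e.g. `1:6` for `id`) rather than the column name, and `map_dfr` binds by rows, whereas the loop combines by columns. The sketch maps over the column names instead and then folds the pieces together with `reduce()`. The helper name `split_col_long` and the vector `vars` are hypothetical, introduced only for this illustration; the helper is a lightly adapted version of `split_to_long` that uses `all_of()` on a character column name, and the join on `id` and `visit` is my substitution for the `cbind()` in the loop.

```r
library(tidyverse)

# Toy data from the question
dat <- data.frame(
  id = 1:6,
  var_1 = c("1,1,-11000", "1,1,1", "0,0,0", "1,1,0", "1,1,1", "1,1,1"),
  var_2 = c("0,0,-13000", "0,0,0", "-13000,-13000,-13000", "6,4,-13000", "0,0,0", "0,0,0"),
  var_3 = c("24,7,-13000", "0,0,0", "-13000,-13000,-13000", "0,0,-13000", "0,0,0", "0,0,0")
)

# Hypothetical helper: takes a column NAME (a string), splits that column
# into three visit columns, then pivots them to long format
split_col_long <- function(col) {
  dat |>
    select(id, all_of(col)) |>
    separate_wider_delim(all_of(col), delim = ",",
                         names = paste0(col, "_", 1:3),
                         too_few = "align_start") |>
    pivot_longer(-id,
                 names_to = "visit",
                 names_prefix = paste0(col, "_"),
                 values_to = col)
}

# Map over the column names, not over the data frame itself
vars <- c("var_1", "var_2", "var_3")

dat_long <- vars |>
  map(split_col_long) |>                                       # list of three long tibbles
  reduce(\(x, y) left_join(x, y, by = c("id", "visit")))      # fold them into one

dat_long
```

With the toy data this should give 18 rows (6 ids by 3 visits) and the columns `id`, `visit`, `var_1`, `var_2`, `var_3`, with the values still stored as character. Joining on the keys rather than binding columns by position avoids relying on every piece having the same row order; if you prefer to stay closer to the loop, dropping `id` and `visit` from the later pieces and using `bind_cols()` would be another option.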
