I'm trying to become more familiar with purrr rather than using for loops. I have some toy data (from a real problem) where several clinical visit values are stored in one cell. I need to split these into their own columns and reshape the resulting data frame into long format, binding the columns together in the process. The code that does this successfully using a for loop is shown below, but I can't get map_df to work. I presume I will need to amend the function, which I have tried to do, to no avail. Any help would be appreciated.
library(tidyverse)
# Toy data
dat <- structure(list(id = 1:6,
var_1 = c("1,1,-11000", "1,1,1", "0,0,0", "1,1,0", "1,1,1", "1,1,1"),
var_2 = c("0,0,-13000", "0,0,0", "-13000,-13000,-13000", "6,4,-13000", "0,0,0", "0,0,0"),
var_3 = c("24,7,-13000", "0,0,0", "-13000,-13000,-13000", "0,0,-13000", "0,0,0", "0,0,0")),
row.names = 1:6, class = "data.frame")
# Separate to wide and convert to long in one step using a function
split_to_long <- function(col){
  i <- substr(col, 5, 5)
  temp <- dat |>
    select("id", {{col}}) |>
    separate_wider_delim({{col}}, ",", too_few = "align_start",
                         names = c(paste0({{col}},"_1"),
                                   paste0({{col}},"_2"),
                                   paste0({{col}},"_3"))) |>
    pivot_longer(2:4,
                 names_to = "visit",
                 values_to = paste0("var_", i),
                 names_prefix = paste0("var_", i, "_"))
  temp
}
# Assemble using loop (only need to keep id and visit data from the first variable)
dat_long <- split_to_long("var_1")
for(i in 2:3) {
  dat_long <- cbind(dat_long, split_to_long(paste0("var_", i))[3])
}
# How to assemble using purrr?
map_dfr(dat, split_to_long)
`dat %>% mutate(visit = "1,2,3", .after = id) %>% separate_longer_delim(delim = ",", -id)`

`dat %>% separate_longer_delim(delim = ",", -id) %>% mutate(visit = row_number(), .by = id)` might be better.

… `separate_longer_delim`. I'm still interested, however, in a solution to my original question - for learning purposes.

The reason `map` isn't working is because you're using each iteration to update the previous iteration, but `map` lives in its own world (i.e., R environment). If you wanted to iteratively update the data, you could send each map iteration's output to the parent environment (which is what a `for` statement uses). For example: `map(2:3, \(i) { dat_long <<- cbind(dat_long, split_to_long(paste0("var_", i))[3]) })`. (Note the `<<-` to access the parent environment.) Now you can just call `dat_long` and see it's changed.
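
For learning purposes, here is a rough sketch of one way to assemble the result with purrr without relying on `<<-`. One thing worth noting: `map_dfr(dat, split_to_long)` iterates over the columns of `dat` themselves (the values), not over the column names, so the function never receives a string like "var_1". The sketch below maps over the names instead and then joins the pieces with `reduce()`. The names `split_col_to_long()`, `vars` and `dat_long_purrr` are made up for this example and are not from the original post; it assumes tidyr >= 1.3.0 (for `separate_wider_delim()`) and R >= 4.1.

```r
library(tidyverse)

# Toy data from the question
dat <- data.frame(
  id = 1:6,
  var_1 = c("1,1,-11000", "1,1,1", "0,0,0", "1,1,0", "1,1,1", "1,1,1"),
  var_2 = c("0,0,-13000", "0,0,0", "-13000,-13000,-13000", "6,4,-13000", "0,0,0", "0,0,0"),
  var_3 = c("24,7,-13000", "0,0,0", "-13000,-13000,-13000", "0,0,-13000", "0,0,0", "0,0,0")
)

# Hypothetical variant of split_to_long(): takes the data frame and a column
# name (as a string) and returns id, visit and the split values in long format
split_col_to_long <- function(data, col) {
  data |>
    select(id, all_of(col)) |>
    separate_wider_delim(all_of(col), delim = ",",
                         names = paste0(col, "_", 1:3),
                         too_few = "align_start") |>
    pivot_longer(-id,
                 names_to = "visit",
                 names_prefix = paste0(col, "_"),
                 values_to = col)
}

# Map over the column NAMES, giving a list of three long data frames
vars <- c("var_1", "var_2", "var_3")
pieces <- map(vars, \(v) split_col_to_long(dat, v))

# Combine the pieces pairwise, matching rows on id and visit
dat_long_purrr <- reduce(pieces, \(x, y) left_join(x, y, by = c("id", "visit")))
```

Joining on `id` and `visit` rather than using `cbind()` means the pieces do not have to be in the same row order, at the cost of a slightly longer call. The value columns stay as character here, as in the original function.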