for loops in R are generally considered slow, since it is hard to avoid unintended memory reads and writes. But how can a nested for loop be replaced, and which approach is best?
Please note that this is a generic question: the f function below is just an example, and it could be much more complicated or return different objects. I just want to see the different approaches one can take in R to avoid nested for loops.
Consider this as an example:
al <- c(2, 3, 4)
bl <- c("foo", "bar")
f <- function(n, c) {
  # Just one simple example function; it could be much more complicated
  data.frame(n = n, c = c, val = n * nchar(c))
}
d <- data.frame()
for (a in al) {
  for (b in bl) {
    # one could undoubtedly do this a lot better,
    # even keeping to nested for loops
    d <- rbind(d, f(a, b))
  }
}
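As the comment in the loop concedes, the nested loops themselves can be kept while avoiding the repeated rbind() on a growing data frame. A minimal sketch, assuming every call to f returns a data frame with the same columns; the names res and k are hypothetical and only used for illustration:

```r
al <- c(2, 3, 4)
bl <- c("foo", "bar")
f <- function(n, c) {
  data.frame(n = n, c = c, val = n * nchar(c))
}

# Preallocate one list slot per (a, b) combination
res <- vector("list", length(al) * length(bl))
k <- 0L  # hypothetical running index into res
for (a in al) {
  for (b in bl) {
    k <- k + 1L
    res[[k]] <- f(a, b)  # store the piece instead of growing a data frame
  }
}

# Bind all pieces together in a single call
d <- do.call(rbind, res)
```

This keeps the same row order as the original loop and calls rbind() once instead of once per iteration.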
One could replace it in this absolutely horrible way (take this only as a crude example):
eg <- expand.grid(al, bl)
d <- do.call(rbind,
  lapply(seq_len(nrow(eg)),
    # expand.grid may return the strings as factors, hence as.character()
    function(i) f(as.numeric(eg[i, 1]), as.character(eg[i, 2]))
  )
)
or using library(purrr), which is slightly less inelegant:
d <- map_dfr(bl, function(b) map2_dfr(al, b, f))
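Another base R variant, offered only as a sketch and not as a recommendation, is to let Map() walk over the columns of the grid in parallel. It assumes f takes its two arguments positionally, as in the example above; the column names a and b are hypothetical:

```r
al <- c(2, 3, 4)
bl <- c("foo", "bar")
f <- function(n, c) {
  data.frame(n = n, c = c, val = n * nchar(c))
}

# Build every (a, b) combination; keep the strings as character vectors
eg <- expand.grid(a = al, b = bl, stringsAsFactors = FALSE)

# Call f once per row of the grid, then bind the pieces in one step
d <- do.call(rbind, Map(f, eg$a, eg$b))
```

Note that expand.grid() varies its first argument fastest, so the rows come out in a different order than in the nested loop. For this particular f, which happens to be vectorized, f(eg$a, eg$b) alone would give the same rows, but that does not hold for an arbitrary function.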
There are countless other methods. Which one is the simplest, and which one is the fastest?
Here is a very quick evaluation of the performance of the three previous methods on my laptop:
