I have a dataset with nine cities. I trained and tested four different machine learning models for each city. The results are in the tibble below:
set.seed(1)
result <-
  tibble::tibble(city = letters[1:9],
                 m1_train = runif(9),
                 m1_test = runif(9),
                 m2_train = runif(9),
                 m2_test = runif(9),
                 m3_train = runif(9),
                 m3_test = runif(9),
                 m4_train = runif(9),
                 m4_test = runif(9))
result
#> # A tibble: 9 × 9
#>   city  m1_train m1_test m2_train m2_test m3_train m3_test m4_train m4_test
#>   <chr>    <dbl>   <dbl>    <dbl>   <dbl>    <dbl>   <dbl>    <dbl>   <dbl>
#> 1 a        0.266  0.0618   0.380    0.382    0.794  0.789    0.0707  0.332
#> 2 b        0.372  0.206    0.777    0.870    0.108  0.0233   0.0995  0.651
#> 3 c        0.573  0.177    0.935    0.340    0.724  0.477    0.316   0.258
#> 4 d        0.908  0.687    0.212    0.482    0.411  0.732    0.519   0.479
#> 5 e        0.202  0.384    0.652    0.600    0.821  0.693    0.662   0.766
#> 6 f        0.898  0.770    0.126    0.494    0.647  0.478    0.407   0.0842
#> 7 g        0.945  0.498    0.267    0.186    0.783  0.861    0.913   0.875
#> 8 h        0.661  0.718    0.386    0.827    0.553  0.438    0.294   0.339
#> 9 i        0.629  0.992    0.0134   0.668    0.530  0.245    0.459   0.839
In this tibble, m1_train is the RMSE obtained by model 1 on the train set, m1_test is the RMSE obtained by model 1 on the test set, and so on.
I'd like to create two new columns in my tibble:
min_train is the minimum RMSE only for the columns that end with _train
min_test is the minimum RMSE only for the columns that end with _test
I've tried many different approaches (rowwise(), mutate(vars(ends_with("_train"))), and others) without success.
How can I approach this problem?
min_train = do.call(pmin, result[endsWith(names(result), 'train')])
min_test = do.call(pmin, result[endsWith(names(result), 'test')])
This task does not require any external libraries.
[...] dplyr (due to syntax) no matter the cost. I recommend never using rowwise(); when going with dplyr::mutate, I would opt for
mutate(min_train = Rfast::rowMins(as.matrix(across(ends_with('train'))), value=TRUE), ...)
The documentation for rowwise states: "[...] This is most useful when a vectorised function doesn't exist. [...]".
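For completeness, here is a minimal end-to-end sketch of the base R approach. It assumes a plain data.frame rebuilt with the same seed and the same order of runif() calls as in the question (the same indexing should work on the tibble, since a tibble is also a list of columns). The helper names train_cols and test_cols are only for illustration. The column selectors are computed before the new columns are added, because min_train and min_test would themselves match the _train/_test suffixes.

```r
# Rebuild the example data without external packages.
# Assumption: same seed and same order of runif() calls as in the question.
set.seed(1)
result <- data.frame(city = letters[1:9])
for (m in 1:4) {
  result[[paste0("m", m, "_train")]] <- runif(9)
  result[[paste0("m", m, "_test")]]  <- runif(9)
}

# Select the columns by suffix BEFORE adding the new columns, since
# "min_train" / "min_test" would also match these suffixes.
train_cols <- endsWith(names(result), "_train")
test_cols  <- endsWith(names(result), "_test")

# pmin() is vectorised: do.call() passes each selected column as one
# argument, so the result is the element-wise (row-wise) minimum.
min_train <- do.call(pmin, result[train_cols])
min_test  <- do.call(pmin, result[test_cols])

result$min_train <- min_train
result$min_test  <- min_test

result[c("city", "min_train", "min_test")]
```

Because pmin() works on whole columns at once, no per-row iteration is involved, which is the reason for preferring it over rowwise() here.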