Error in "across( )" when summing multiple columns in R

Question

Using the following data:

Distance <-  data.frame(
 DAY    = c("1", "2","3")
,TEMP   = c(25, 27, 26.5)
,C1Dist01    = c(1, 1, 1)
,C2Dist01    = c(1, 1, 0)
,C3Dist01   = c(1, 0,0)
,C4Dist01   = c(1, 0, 0)
)

I am trying to sum across the columns C1Dist01, C2Dist01, C3Dist01, and C4Dist01 and create a new column with that output. The code I am currently using is:

SumDistance<-Distance %>%
  rowwise() %>%
  mutate(CowSum = sum(across(select(ends_with("Dist01"))), na.rm = T))

Weirdly, this code used to work, but when I tried running it today, I kept getting the error message:

Error in `across()`:
! Must only be used inside data-masking verbs like `mutate()`,
  `filter()`, and `group_by()`.
Backtrace:
 1. Distance %>% rowwise() %>% ...
 2. plyr::mutate(...)
 3. base::eval(cols[[col]], .data, parent.frame())
 4. base::eval(cols[[col]], .data, parent.frame())
 5. dplyr::across(select(ends_with("Dist01")))

I can't figure out what is causing this error message, so any insight would be appreciated!

I've tried functions other than the across() function and the select() function, but I've gotten the same problem. It's especially confusing since the error message specifies that I need to be using across() within mutate() even though I already am.

rowwise() is very slow; consider using rowSums() or Rfast::rowsums(), as suggested by @Axeman.
– Onyambu, commented Jan 21 at 19:52

margusl · Accepted Answer · 2025-01-21 19:37:58Z

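The across() pattern generally looks like mutate(across(<tidy-select>, ~ ...)), where across() is a direct argument to the verb and applies a function to each selected column. To collect the selected columns of the current row into a single vector, you may have meant c_across() instead: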

library(dplyr)

Distance <-  data.frame(
  DAY    = c("1", "2","3")
  ,TEMP   = c(25, 27, 26.5)
  ,C1Dist01    = c(1, 1, 1)
  ,C2Dist01    = c(1, 1, 0)
  ,C3Dist01   = c(1, 0,0)
  ,C4Dist01   = c(1, 0, 0)
)

Distance %>%
  rowwise() %>%
  mutate(CowSum = sum(c_across(ends_with("Dist01")), na.rm = TRUE))
#> # A tibble: 3 × 7
#> # Rowwise: 
#>   DAY    TEMP C1Dist01 C2Dist01 C3Dist01 C4Dist01 CowSum
#>   <chr> <dbl>    <dbl>    <dbl>    <dbl>    <dbl>  <dbl>
#> 1 1      25          1        1        1        1      4
#> 2 2      27          1        1        0        0      2
#> 3 3      26.5        1        0        0        0      1

Though in this case you could use rowSums() with pick() to skip rowwise():

Distance %>%
  mutate(CowSum = rowSums(pick(ends_with("Dist01")), na.rm = TRUE))
#>   DAY TEMP C1Dist01 C2Dist01 C3Dist01 C4Dist01 CowSum
#> 1   1 25.0        1        1        1        1      4
#> 2   2 27.0        1        1        0        0      2
#> 3   3 26.5        1        0        0        0      1

Created on 2025-01-21 with reprex v2.1.1
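
A side note on the backtrace in the question: it lists plyr::mutate(...) rather than dplyr's mutate(), which suggests plyr was attached after dplyr and masked mutate(). That would explain why across() complains about not being inside a data-masking verb, and possibly why code that used to work stopped working. A minimal sketch, assuming that masking is what happened, is to namespace the verb explicitly:

```r
library(dplyr)

Distance <- data.frame(
  DAY      = c("1", "2", "3"),
  TEMP     = c(25, 27, 26.5),
  C1Dist01 = c(1, 1, 1),
  C2Dist01 = c(1, 1, 0),
  C3Dist01 = c(1, 0, 0),
  C4Dist01 = c(1, 0, 0)
)

# Assumption: another package (e.g. plyr) may have masked mutate().
# Calling dplyr::mutate() explicitly makes sure the dplyr verb is used.
SumDistance <- Distance %>%
  rowwise() %>%
  dplyr::mutate(CowSum = sum(c_across(ends_with("Dist01")), na.rm = TRUE)) %>%
  ungroup()  # drop the rowwise state so later steps are not run row by row

SumDistance
```

If plyr is needed at all, attaching it before dplyr (or calling its functions as plyr::...) should avoid the masking as well.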

edited Jan 21 at 19:37

answered Jan 21 at 19:27

margusl

4 Comments

shrimp Jan 21 at 19:42

Because I am running this on a very large dataset, rowSums() takes a long time to run. Do you know if there are any faster commands?

Axeman Jan 21 at 19:43

rowSums() should be much faster than using rowwise(). Make sure you don't have any grouping set. You can also use Rfast::rowsums() to improve performance.
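
To illustrate the point about leftover grouping, here is a rough sketch. The leftover object is hypothetical and stands for data that an earlier step left rowwise or grouped; ungroup() clears that state before the vectorised rowSums() call:

```r
library(dplyr)

Distance <- data.frame(
  DAY      = c("1", "2", "3"),
  TEMP     = c(25, 27, 26.5),
  C1Dist01 = c(1, 1, 1),
  C2Dist01 = c(1, 1, 0),
  C3Dist01 = c(1, 0, 0),
  C4Dist01 = c(1, 0, 0)
)

# Hypothetical: an earlier pipeline step left the data rowwise
leftover <- Distance %>% rowwise()

# ungroup() removes the rowwise/grouped state, so rowSums() is
# evaluated once over all rows instead of once per row
result <- leftover %>%
  ungroup() %>%
  mutate(CowSum = rowSums(pick(ends_with("Dist01")), na.rm = TRUE))

result
```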

margusl Jan 21 at 19:52

You could also try to convert relevant subset to matrix first, something like Distance$CowSum <- Distance[grep("Dist01$", names(Distance))] |> as.matrix() |> rowSums()

Limey Jan 23 at 12:44

"Do you know if there are any faster commands?": Converting your dataset to long format and then using the standard group_by() and summarise() approach might be quicker. You'll have to investigate performance in your specific use case.

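The long-format idea from the last comment could look roughly like this. It is only a sketch: it assumes tidyr is available, and the column names Cow and Dist are made up for illustration. Whether it is actually faster would need to be measured on the real data.

```r
library(dplyr)
library(tidyr)

Distance <- data.frame(
  DAY      = c("1", "2", "3"),
  TEMP     = c(25, 27, 26.5),
  C1Dist01 = c(1, 1, 1),
  C2Dist01 = c(1, 1, 0),
  C3Dist01 = c(1, 0, 0),
  C4Dist01 = c(1, 0, 0)
)

# Reshape to long format: one row per DAY / column combination.
# "Cow" and "Dist" are illustrative names, not from the original post.
long <- Distance %>%
  pivot_longer(ends_with("Dist01"), names_to = "Cow", values_to = "Dist")

# Sum within each original row, identified here by DAY and TEMP
CowSums <- long %>%
  group_by(DAY, TEMP) %>%
  summarise(CowSum = sum(Dist, na.rm = TRUE), .groups = "drop")

CowSums
```

This assumes DAY and TEMP together identify each original row; if they do not, a row id column would be needed before reshaping.
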