Using the following data:

Distance <-  data.frame(
 DAY    = c("1", "2","3")
,TEMP   = c(25, 27, 26.5)
,C1Dist01    = c(1, 1, 1)
,C2Dist01    = c(1, 1, 0)
,C3Dist01   = c(1, 0,0)
,C4Dist01   = c(1, 0, 0)
)

I am trying to sum across the columns C1Dist01, C2Dist01, C3Dist01, and C4Dist01 and create a new column with that output. The code I am currently using is:

SumDistance<-Distance %>%
  rowwise() %>%
  mutate(CowSum = sum(across(select(ends_with("Dist01"))), na.rm = T))

Weirdly, this code used to work, but when I tried running it today, I kept getting the error message:

Error in `across()`:
! Must only be used inside data-masking verbs like `mutate()`,
  `filter()`, and `group_by()`.
Backtrace:
 1. Distance %>% rowwise() %>% ...
 2. plyr::mutate(...)
 3. base::eval(cols[[col]], .data, parent.frame())
 4. base::eval(cols[[col]], .data, parent.frame())
 5. dplyr::across(select(ends_with("Dist01")))

I can't figure out what is causing this error message, so any insight would be appreciated!

I've tried functions other than the across() function and the select() function, but I've gotten the same problem. It's especially confusing since the error message specifies that I need to be using across() within mutate() even though I already am.

  • rowwise is very slow, consider using rowSums or Rfast::rowsums as suggested by @Axeman Commented Jan 21 at 19:52

1 Answer

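The usual across() pattern is mutate(across(<tidy-select>, ~ ...)): it takes a tidy-select expression such as ends_with("Dist01") directly, not a select() call. For a row-wise sum you may have meant c_across() instead: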

library(dplyr)

Distance <-  data.frame(
  DAY    = c("1", "2","3")
  ,TEMP   = c(25, 27, 26.5)
  ,C1Dist01    = c(1, 1, 1)
  ,C2Dist01    = c(1, 1, 0)
  ,C3Dist01   = c(1, 0,0)
  ,C4Dist01   = c(1, 0, 0)
)

Distance %>%
  rowwise() %>%
  mutate(CowSum = sum(c_across(ends_with("Dist01")), na.rm = TRUE))
#> # A tibble: 3 × 7
#> # Rowwise: 
#>   DAY    TEMP C1Dist01 C2Dist01 C3Dist01 C4Dist01 CowSum
#>   <chr> <dbl>    <dbl>    <dbl>    <dbl>    <dbl>  <dbl>
#> 1 1      25          1        1        1        1      4
#> 2 2      27          1        1        0        0      2
#> 3 3      26.5        1        0        0        0      1

In this case, though, you could skip rowwise() entirely by using rowSums() with pick() (pick() requires dplyr 1.1.0 or later):

Distance %>%
  mutate(CowSum = rowSums(pick(ends_with("Dist01")), na.rm = TRUE))
#>   DAY TEMP C1Dist01 C2Dist01 C3Dist01 C4Dist01 CowSum
#> 1   1 25.0        1        1        1        1      4
#> 2   2 27.0        1        1        0        0      2
#> 3   3 26.5        1        0        0        0      1

Created on 2025-01-21 with reprex v2.1.1
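
One more thing worth checking, based on the backtrace in the question: frame 2 is plyr::mutate(), not dplyr::mutate(). That usually means plyr was attached after dplyr, so its mutate() masks dplyr's. This would explain both the "used to work" behaviour and the complaint that across() is being used outside a data-masking verb. A minimal sketch of one workaround, assuming masking by plyr really is the cause, is to call the dplyr verb with an explicit namespace:

```r
library(dplyr)

Distance <- data.frame(
  DAY      = c("1", "2", "3"),
  TEMP     = c(25, 27, 26.5),
  C1Dist01 = c(1, 1, 1),
  C2Dist01 = c(1, 1, 0),
  C3Dist01 = c(1, 0, 0),
  C4Dist01 = c(1, 0, 0)
)

# dplyr::mutate() is written out explicitly so the call resolves to dplyr
# even if another attached package (assumed here: plyr) masks mutate()
SumDistance <- Distance %>%
  rowwise() %>%
  dplyr::mutate(CowSum = sum(c_across(ends_with("Dist01")), na.rm = TRUE)) %>%
  ungroup()  # drop the rowwise grouping once the sum is computed

# Reports which package currently supplies mutate() in this session
environmentName(environment(mutate))
```

Attaching plyr before dplyr, or not attaching plyr at all and calling its functions as plyr::fun(), should avoid the masking in the first place.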


4 Comments

Because I am running this on a very large dataset, rowSums() takes a long time to run. Do you know if there are any faster commands?
rowSums() should be much faster than using rowwise(). Make sure you don't have any grouping set. You can also use Rfast::rowsums() to improve performance.
You could also try to convert relevant subset to matrix first, something like Distance$CowSum <- Distance[grep("Dist01$", names(Distance))] |> as.matrix() |> rowSums()
"Do you know if there are any faster commands?": Converting your dataset to long format and then using the standard group_by() and summarise() approach might be quicker. You'll have to investigate performance in your specific use case.

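The matrix and long-format suggestions from the comments can be sketched roughly as follows. This is an illustration on the toy data from the question, not a benchmark. The names dist_cols and long_sums are made up for the example, and the long-format version assumes that DAY uniquely identifies each row and that tidyr is installed.

```r
library(dplyr)
library(tidyr)

Distance <- data.frame(
  DAY      = c("1", "2", "3"),
  TEMP     = c(25, 27, 26.5),
  C1Dist01 = c(1, 1, 1),
  C2Dist01 = c(1, 1, 0),
  C3Dist01 = c(1, 0, 0),
  C4Dist01 = c(1, 0, 0)
)

# Names of the columns to sum (hypothetical helper variable)
dist_cols <- grep("Dist01$", names(Distance), value = TRUE)

# Option 1: base R, summing a numeric matrix built from the matching columns
Distance$CowSum <- rowSums(as.matrix(Distance[dist_cols]), na.rm = TRUE)

# Option 2: reshape to long format, then compute one grouped sum per DAY
# (assumes DAY is a unique row identifier)
long_sums <- Distance %>%
  select(DAY, all_of(dist_cols)) %>%
  pivot_longer(all_of(dist_cols), names_to = "cow", values_to = "dist") %>%
  group_by(DAY) %>%
  summarise(CowSum = sum(dist, na.rm = TRUE), .groups = "drop")
```

Which option is faster will depend on the shape of the real data, so it is worth timing both on a representative subset before committing to one.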