
I have a data set like this:

Date    Facility    Meas_1  Meas_2  Meas_3
1/1/2021    C   0   0   0
1/1/2021    Ge  0   1   0
1/1/2021    A   0   0   1
1/1/2021    A   0   0   0
1/1/2021    P   1   0   0
1/1/2021    C   0   0   0
1/1/2021    Ge  0   0   0
1/1/2021    P   0   0   0
1/1/2021    R   1   1   0
1/1/2021    C   0   0   0
1/2/2021    Ga  0   1   0
1/2/2021    C   0   0   0
1/2/2021    C   0   1   0
1/2/2021    A   1   0   0
1/2/2021    E   0   0   0

I need to find the sum of Meas_1, Meas_2, and Meas_3 for each combination of Facility and Date. Facility is a factor and the measures are binary (1 is true, 0 is false), so each sum is a count of how often that measure was true at each facility on each date.

I've tried aggregate with no luck. Thank you!

  • Are you looking for df %>% group_by(Date, Facility) %>% summarise(across(starts_with("Meas"), sum), .groups = "drop")? Commented Apr 25, 2022 at 22:40
  • aggregate(.~Date + Facility, df, mean) is the way to use aggregate. Commented Apr 25, 2022 at 23:10

1 Answer

Base-R Solution

This is very similar to onyambu's comment, but here the target columns are named explicitly. To me, this makes the code easier to understand:

# Your data
dat <- structure(list(
  Date = c("1/1/2021", "1/1/2021", "1/1/2021", "1/1/2021", "1/1/2021",
           "1/1/2021", "1/1/2021", "1/1/2021", "1/1/2021", "1/1/2021",
           "1/2/2021", "1/2/2021", "1/2/2021", "1/2/2021", "1/2/2021"),
  Facility = c("C", "Ge", "A", "A", "P", "C", "Ge", "P", "R", "C",
               "Ga", "C", "C", "A", "E"),
  Meas_1 = c(0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 1L, 0L),
  Meas_2 = c(0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 0L),
  Meas_3 = c(0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L)
), class = "data.frame", row.names = c(NA, -15L))


aggregate(cbind(Meas_1, Meas_2, Meas_3) ~ Date + Facility, dat, sum)

      Date Facility Meas_1 Meas_2 Meas_3
1 1/1/2021        A      0      0      1
2 1/2/2021        A      1      0      0
3 1/1/2021        C      0      0      0
4 1/2/2021        C      0      1      0
5 1/2/2021        E      0      0      0
6 1/2/2021       Ga      0      1      0
7 1/1/2021       Ge      0      1      0
8 1/1/2021        P      1      0      0
9 1/1/2021        R      1      1      0
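
The dot shorthand from the comments can be used the same way once `sum` is the summary function. In a formula passed to `aggregate`, `.` stands for every column that does not appear on the right-hand side, so it is only a safe substitute when all remaining columns are measures. The sketch below rebuilds the data compactly (the `rep()` call is just a shorter way to write the same Date column) and compares the two forms; the names `explicit` and `shorthand` are illustrative only.

```r
# Rebuild the answer's data compactly so this block runs on its own
dat <- data.frame(
  Date = rep(c("1/1/2021", "1/2/2021"), times = c(10, 5)),
  Facility = c("C", "Ge", "A", "A", "P", "C", "Ge", "P", "R", "C",
               "Ga", "C", "C", "A", "E"),
  Meas_1 = c(0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 1L, 0L),
  Meas_2 = c(0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 0L),
  Meas_3 = c(0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L)
)

# Explicit form from the answer: name each measure column
explicit <- aggregate(cbind(Meas_1, Meas_2, Meas_3) ~ Date + Facility, dat, sum)

# Dot shorthand: "." expands to every column not used on the right-hand side
shorthand <- aggregate(. ~ Date + Facility, dat, sum)

# Compare the two results
all.equal(explicit, shorthand)
```

If the data frame later gains a non-measure column (an ID, say), the shorthand would try to sum that too, which is one reason to prefer the explicit `cbind()` form.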

data.table solution

To complement Martin Gal's dplyr solution, here is a data.table solution:

library(data.table)

dt.dat <- as.data.table(dat)
dt.dat[, lapply(.SD, sum), .SDcols = c("Meas_1", "Meas_2", "Meas_3"), by = .(Date, Facility)]
       Date Facility Meas_1 Meas_2 Meas_3
1: 1/1/2021        C      0      0      0
2: 1/1/2021       Ge      0      1      0
3: 1/1/2021        A      0      0      1
4: 1/1/2021        P      1      0      0
5: 1/1/2021        R      1      1      0
6: 1/2/2021       Ga      0      1      0
7: 1/2/2021        C      0      1      0
8: 1/2/2021        A      1      0      0
9: 1/2/2021        E      0      0      0
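
For comparison, the dplyr pipeline suggested in the comments can be written out as a runnable sketch. It assumes the dplyr package is installed (version 1.0.0 or later, since it relies on `across()`), and it rebuilds the same data compactly so the block runs on its own; `res` is just an illustrative name.

```r
library(dplyr)

# Same data as above, rebuilt compactly
dat <- data.frame(
  Date = rep(c("1/1/2021", "1/2/2021"), times = c(10, 5)),
  Facility = c("C", "Ge", "A", "A", "P", "C", "Ge", "P", "R", "C",
               "Ga", "C", "C", "A", "E"),
  Meas_1 = c(0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 1L, 0L),
  Meas_2 = c(0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 0L),
  Meas_3 = c(0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L)
)

# Sum every column whose name starts with "Meas" within each Date/Facility group
res <- dat %>%
  group_by(Date, Facility) %>%
  summarise(across(starts_with("Meas"), sum), .groups = "drop")

res
```

Unlike the other two approaches, this returns a tibble, and the rows come back sorted by the grouping columns rather than in order of first appearance.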