I want a simpler way to recode the same variable, in the same way, across multiple data frames. For example, I'm recoding an age variable in two state datasets, FL and GA, and I'm currently coding them separately. How can I condense this code?

FL <- FL %>% 
  mutate(
    # Create categories
    age_group = dplyr::case_when(
                              age >= 18 & age <= 29 ~ "18-29",
                              age >= 30 & age <= 39 ~ "30-39",
                              age >= 40 & age <= 49 ~ "40-49",
                              age >= 50 & age <= 64 ~ "50-64", 
                              age >= 65 ~ "65+"),
    # Convert to factor
    age_group = factor(
      age_group,
      levels = c("18-29", "30-39", "40-49", "50-64", "65+")
    )
  )

GA <- GA %>% 
  mutate(
    # Create categories
    age_group = dplyr::case_when(
                              age >= 18 & age <= 29 ~ "18-29",
                              age >= 30 & age <= 39 ~ "30-39",
                              age >= 40 & age <= 49 ~ "40-49",
                              age >= 50 & age <= 64 ~ "50-64", 
                              age >= 65 ~ "65+"),
    # Convert to factor
    age_group = factor(
      age_group,
      levels = c("18-29", "30-39", "40-49", "50-64", "65+")
    )
  )
  • You can use cut(age, breaks = c(17, 29, 39, 49, 64, Inf), labels = c("18-29", "30-39", "40-49", "50-64", "65+")). Better would be to create a function, i.e. age_grp_fn <- function(age) cut(age, breaks = c(17, 29, 39, 49, 64, Inf), labels = c("18-29", "30-39", "40-49", "50-64", "65+")), and reuse the function on each dataset. Commented Feb 27, 2023 at 17:05
  • To avoid rewriting the code, you can put it into a function, then call it or apply it to all your data.frames Commented Feb 27, 2023 at 17:08
  • Do you have sample code? I'm struggling Commented Feb 27, 2023 at 17:25

1 Answer

We can write the recoding once as a function and pass that function to a looping function such as lapply(), which applies it to every data frame in a list.

First, put all your data.frames in a list. There are several ways to do that, and it is hard to tell which is best without a reproducible example. For instance:

my_dfs <- list(FL, GA)
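
As a small sketch of two of those methods, the toy data frames below (made-up values, standing in for the real FL and GA data) are collected into a named list. Naming the elements is an assumption about what is convenient here, not something the question requires, but it makes it easier to tell the states apart after the loop.

```r
# Toy stand-ins for the real state data (hypothetical values)
FL <- data.frame(age = c(18, 35, 70))
GA <- data.frame(age = c(29, 50, 64))

# A named list records which element came from which state
my_dfs <- list(FL = FL, GA = GA)

# Same result, looking the objects up by name with base R's mget()
my_dfs_alt <- mget(c("FL", "GA"))

names(my_dfs)
```

With names in place, a single state can later be pulled out as my_dfs$FL or my_dfs[["GA"]].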

Then define your function:

library(dplyr)  # provides %>%, mutate() and case_when()

my_function <- function(x) {
  x %>%
    mutate(
      # Create categories (ages below 18 become NA)
      age_group = case_when(
        age >= 18 & age <= 29 ~ "18-29",
        age >= 30 & age <= 39 ~ "30-39",
        age >= 40 & age <= 49 ~ "40-49",
        age >= 50 & age <= 64 ~ "50-64",
        age >= 65 ~ "65+"
      ),
      # Convert to factor
      age_group = factor(
        age_group,
        levels = c("18-29", "30-39", "40-49", "50-64", "65+")
      )
    )
}
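
The comments under the question suggest cut() as a shorter way to build the same groups. A minimal base R sketch of that idea follows; the function name age_group_cut is hypothetical, and the breaks assume ages are whole numbers (cut() treats each interval as open on the left and closed on the right by default, so 17 is the lower break for the "18-29" group).

```r
# Hypothetical helper: bin whole-number ages into the groups from the question
age_group_cut <- function(age) {
  cut(
    age,
    breaks = c(17, 29, 39, 49, 64, Inf),  # (17,29], (29,39], ..., (64,Inf]
    labels = c("18-29", "30-39", "40-49", "50-64", "65+")
  )
}

# Ages of 17 or below fall outside every interval and become NA
age_group_cut(c(17, 18, 29, 30, 64, 65))
```

cut() already returns a factor with the levels in order, so no separate factor() call is needed. For non-integer ages the two approaches differ: an age of 29.5 is NA under the case_when() version but lands in "30-39" here.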

Finally, apply it to every data frame in the list with lapply(), storing the result:

my_dfs <- lapply(my_dfs, my_function)
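
Note that lapply() returns a new list rather than changing FL and GA in place, so the result has to be assigned somewhere. The following is a minimal end-to-end sketch, assuming toy data frames with made-up ages in place of the real state data; the last step, copying the results back over FL and GA with list2env(), is optional and only one possible way to finish.

```r
library(dplyr)

# Toy stand-ins for the real state data (hypothetical values)
FL <- data.frame(age = c(18, 35, 70))
GA <- data.frame(age = c(29, 50, 64))
my_dfs <- list(FL = FL, GA = GA)

# Same recoding as in the answer
my_function <- function(x) {
  x %>%
    mutate(
      age_group = case_when(
        age >= 18 & age <= 29 ~ "18-29",
        age >= 30 & age <= 39 ~ "30-39",
        age >= 40 & age <= 49 ~ "40-49",
        age >= 50 & age <= 64 ~ "50-64",
        age >= 65 ~ "65+"
      ),
      age_group = factor(
        age_group,
        levels = c("18-29", "30-39", "40-49", "50-64", "65+")
      )
    )
}

# Store the recoded data frames; the originals are untouched at this point
my_dfs <- lapply(my_dfs, my_function)

# Work with one state straight from the list
my_dfs$FL

# Optional: overwrite the global FL and GA with the recoded versions
invisible(list2env(my_dfs, envir = .GlobalEnv))
```

Keeping the data in the list is usually the simpler choice if more per-state steps follow, since each one can be another lapply() call.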