I have a three-column dataset df1 with this data:

ID  Age Games
1   36  10      
2   36  15      
3   36  20      
4   36  30      
1   37  40      
2   37  50      
3   37  35      
4   37  45      

Here is the dput for the df1 dataset:

structure(list(ID = c(1, 2, 3, 4, 1, 2, 3, 4), Age = c(36, 36, 
36, 36, 37, 37, 37, 37), Games = c(10, 15, 20, 30, 40, 50, 35, 
45)), class = "data.frame", row.names = c(NA, -8L))

I want the data to appear as shown in the table below, where the games for each ID's Age 36 and Age 37 values are in separate columns:

ID  Age36 Age37
1    10     0      
2    15     0      
3    20     0      
4    30     0      
1     0    40      
2     0    50      
3     0    35      
4     0    45

My goal is to create from the second table a line chart with two line graphs, one for the Age36 games for each ID and the other for the Age 37 games, using pivot_longer to prepare the data for input into ggplot2 (unless you can recommend a better way of doing that).

  • I think the first dataset is better suited to ggplot than the second. Commented Feb 23, 2020 at 17:42
  • Try df1 %>% mutate(Age = factor(str_c('Age', Age))) %>% ggplot(aes(x = ID, y = Games, group = Age)) + geom_line() Commented Feb 23, 2020 at 17:44
  • @akrun, how does the code in the parentheses work: mutate(Age = factor(str_c('Age', Age)))? Commented Feb 23, 2020 at 18:00
  • It just creates a factor column with 'Age' as a prefix. The packages used are library(dplyr); library(stringr) Commented Feb 23, 2020 at 18:00
  • Because it is already in the 'long' format. The second one is converted to a pseudo-long format, with the values spread across wide columns. Commented Feb 23, 2020 at 18:08
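
The suggestion in the comments above can be sketched roughly as follows. This is a minimal example, not the commenter's exact code: it uses base R's paste0() in place of stringr's str_c(), and the column name AgeGroup and the added colour mapping and points are assumptions made for illustration.

```r
library(ggplot2)

# Rebuild the example data from the question's dput
df1 <- data.frame(
  ID    = c(1, 2, 3, 4, 1, 2, 3, 4),
  Age   = c(36, 36, 36, 36, 37, 37, 37, 37),
  Games = c(10, 15, 20, 30, 40, 50, 35, 45)
)

# AgeGroup is a hypothetical column name: a factor with levels "Age36" and "Age37"
df1$AgeGroup <- factor(paste0("Age", df1$Age))

# One line per age group, drawn directly from the long data
p <- ggplot(df1, aes(x = ID, y = Games, colour = AgeGroup, group = AgeGroup)) +
  geom_line() +
  geom_point()

print(p)
```

Because df1 already has one row per ID and Age combination, no pivot_wider or pivot_longer step is needed before plotting.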

1 Answer


You can reshape the data with pivot_wider():

library(tidyverse)

df1 %>% mutate(row = row_number()) %>%
  pivot_wider(id_cols = c(row, ID), names_from = Age, values_from = Games, values_fill = 0L) %>%
  select(-row)

# A tibble: 8 x 3
     ID  `36`  `37`
  <dbl> <dbl> <dbl>
1     1    10     0
2     2    15     0
3     3    20     0
4     4    30     0
5     1     0    40
6     2     0    50
7     3     0    35
8     4     0    45

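Explanation: Because the ID column is not unique, each ID would otherwise collapse into a single row. Adding a dummy row identifier with row_number() keeps all eight rows separate, and that helper column is dropped at the end with select(-row).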
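
If the columns should be named Age36 and Age37 as in the question's target table, one possible variation is to pass names_prefix to pivot_wider. This is a sketch under the assumption that a reasonably recent tidyr is installed (one that accepts a scalar values_fill); the object name wide is made up for illustration.

```r
library(dplyr)
library(tidyr)

# Rebuild the example data from the question's dput
df1 <- data.frame(
  ID    = c(1, 2, 3, 4, 1, 2, 3, 4),
  Age   = c(36, 36, 36, 36, 37, 37, 37, 37),
  Games = c(10, 15, 20, 30, 40, 50, 35, 45)
)

# 'wide' is a hypothetical name for the reshaped result
wide <- df1 %>%
  mutate(row = row_number()) %>%   # dummy identifier so rows are not merged by ID
  pivot_wider(
    id_cols      = c(row, ID),
    names_from   = Age,
    names_prefix = "Age",          # yields columns Age36 and Age37
    values_from  = Games,
    values_fill  = 0               # fill the missing age/row combinations with 0
  ) %>%
  select(-row)                     # drop the dummy identifier

print(wide)
```

Note that the filled-in zeros are placeholders rather than observed values, so pivoting this table back to long form for plotting would draw them as real data points.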
