I'm trying to create a monthly time series in ggplot for time series analysis. This is my data:

rdata1 <- read_table2("date  sales_revenue_incl_credit
                                    2017-07 56037.46
                                    2017-08 38333.9
                                    2017-09 48716.92
                                    2017-10 65447.67
                                    2017-11 134752.57
                                    2017-12 116477.39
                                    2018-01 78167.25
                                    2018-02 75991.44
                                    2018-03 42520.93
                                    2018-04 70489.92
                                    2018-05 121063.35
                                    2018-06 76308.47
                                    2018-07 118085.7
                                    2018-08 96153.38
                                    2018-09 82827.1
                                    2018-10 109288.83
                                    2018-11 145774.52
                                    2018-12 141572.77
                                    2019-01 123055.83
                                    2019-02 104232.24
                                    2019-03 435086.33
                                    2019-04 74304.96
                                    2019-05 117237.82
                                    2019-06 82013.47
                                    2019-07 99382.67
                                    2019-08 138455.2
                                    2019-09 97301.99
                                    2019-10 137206.09
                                    2019-11 109862.44
                                    2019-12 118150.96
                                    2020-01 140717.9
                                    2020-02 127622.3
                                    2020-03 134126.09")

I then use the code below to convert date to the Date class, so that I can set the axis breaks and labels more easily with date_labels and date_breaks.

rdata1 %>%
  mutate(date = ymd(date)) %>%
  ggplot(aes(date, sales_revenue_incl_credit)) +
  geom_line() +
  scale_x_date(date_labels = "%b %Y", date_breaks = "1 month")+
  theme_bw()+
  theme(axis.text.x = element_text(angle = 90, vjust=0.5), 
        panel.grid.minor = element_blank())

I get the following error:

Error in seq.int(r1$mon, 12 * (to0$year - r1$year) + to0$mon, by) : 'from' must be a finite number

3
  • 2
    It seems the ymd() function didn't pick up your dates properly. Try mutate(date = ymd(paste0(date, "-01"))). Commented Jun 18, 2020 at 13:26
  • 1
    +1 @teunbrand. Test ymd(rdata1$date[1]) and you'll see you get NA as the result. Even if you specify the format via as.Date(rdata1$date[1], format="%Y-%m"), it fails, since a Date needs a day as well. The suggestion would be to just add "-01" to the end of each date in your column; then ymd() will work, and so will as.Date() if you specify format="%Y-%m-%d". Commented Jun 18, 2020 at 13:55
  • Just one last question, since I don't want to start another thread for it: how do I give row names to my monthly time series data? For example, with yearly data I could use rownames(data) <- seq(from=1927, to=2016). Any idea how to do this for months? Commented Jun 18, 2020 at 14:47
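
The parsing failure described in the comments can be checked with a minimal base R sketch. The vector name ym and the three sample values are illustrative only; the comments report that ymd() fails on the same strings in the same way.

```r
# Year-month strings, as in the question's date column
ym <- c("2017-07", "2017-08", "2017-09")

# Without a day component, as.Date() cannot build a Date and returns NA
bad <- as.Date(ym, format = "%Y-%m")

# Appending "-01" pins each value to the first day of its month
good <- as.Date(paste0(ym, "-01"), format = "%Y-%m-%d")

print(bad)
print(good)
```

Because every value in the date column becomes NA, scale_x_date() has no finite range to build breaks from, which is consistent with the 'from' must be a finite number error.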

2 Answers

1

Putting all these concerns together, I performed some data preparation to obtain your desired output. First, as noted in the comments, I appended the first day of the month to each "year-month" so you can work with a proper date variable in R. Next, I used the column_to_rownames() function on the month_year column. I appended the year to the month name because duplicate (non-unique) row names are not permitted. I should caution you against using row labels. Quoting from the documentation (see ?tibble::rownames_to_column):

While a tibble can have row names (e.g., when converting from a regular data frame), they are removed when subsetting with the [ operator. A warning will be raised when attempting to assign non-NULL row names to a tibble. Generally, it is best to avoid row names, because they are basically a character column with different semantics than every other column.

You can manipulate the row names below with different naming conventions. Just make sure the labels are unique! See the R code below:

# Loading the required libraries

library(tibble)
library(ggplot2)
library(dplyr)
library(lubridate)

df <- tribble( 
  ~date, ~sales_revenue_incl_credit,
  "2017-07", 56037.46,
  "2017-08", 38333.9,
  "2017-09", 48716.92,
  "2017-10", 65447.67,
  "2017-11", 134752.57,
  "2017-12", 116477.39,
  "2018-01", 78167.25,
  "2018-02", 75991.44,
  "2018-03", 42520.93,
  "2018-04", 70489.92,
  "2018-05", 121063.35,
  "2018-06", 76308.47,
  "2018-07", 118085.7,
  "2018-08", 96153.38,
  "2018-09", 82827.1,
  "2018-10", 109288.83,
  "2018-11", 145774.52,
  "2018-12", 141572.77,
  "2019-01", 123055.83,
  "2019-02", 104232.24,
  "2019-03", 435086.33,
  "2019-04", 74304.96,
  "2019-05", 117237.82,
  "2019-06", 82013.47,
  "2019-07", 99382.67,
  "2019-08", 138455.2,
  "2019-09", 97301.99,
  "2019-10", 137206.09,
  "2019-11", 109862.44,
  "2019-12", 118150.96,
  "2020-01", 140717.9,
  "2020-02", 127622.3,
  "2020-03", 134126.09
  )

# Data preparation

df %>%
  mutate(date = ymd(paste0(date, "-01")),
         month_year = paste(month(date, label = TRUE), year(date), sep = "-")
         ) %>%
  column_to_rownames("month_year") %>%  # moves the month_year labels into the row names
  head()

# Preview of the data frame with row names (e.g., Jul-2017, Aug-2017, Sep-2017, etc.)

               date sales_revenue_incl_credit
Jul-2017 2017-07-01                  56037.46
Aug-2017 2017-08-01                  38333.90
Sep-2017 2017-09-01                  48716.92
Oct-2017 2017-10-01                  65447.67
Nov-2017 2017-11-01                 134752.57
Dec-2017 2017-12-01                 116477.39

# Reproducing your plot

df %>%
  mutate(date = ymd(paste0(date, "-01"))) %>%  # df still holds "YYYY-MM" strings, so convert to Date first
  ggplot(aes(x = date, y = sales_revenue_incl_credit)) +
  geom_line() +
  scale_x_date(date_labels = "%b %Y", date_breaks = "1 month") +
  theme_bw() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5), 
        panel.grid.minor = element_blank())
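
For the follow-up question in the comments, a monthly analogue of rownames(data) <- seq(from=1927, to=2016) might look like the base R sketch below. The data frame sales and its value column are hypothetical placeholders, not the real revenue figures, and the labels use the locale-independent "%Y-%m" format rather than month abbreviations.

```r
# Monthly sequence covering the question's range (Jul 2017 to Mar 2020)
months <- seq(from = as.Date("2017-07-01"), to = as.Date("2020-03-01"), by = "month")

# Year-month labels such as "2017-07"; they are unique, so they can serve as row names
labels <- format(months, "%Y-%m")

# Placeholder values stand in for the real sales figures (assumption for illustration)
sales <- data.frame(value = seq_along(months))
rownames(sales) <- labels

head(sales, 3)
```

The same caution applies here as above: keeping labels as an ordinary column is usually safer than relying on row names.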

0

A simpler version of @Tom's answer is to use a tsibble object and the feasts package:

# Loading the required libraries

library(tibble)
library(dplyr)
library(ggplot2)
library(lubridate)
library(tsibble)
library(feasts)

# Data preparation

df <- tribble( 
    ~date, ~sales_revenue_incl_credit,
    "2017-07", 56037.46,
    "2017-08", 38333.9,
    "2017-09", 48716.92,
    "2017-10", 65447.67,
    "2017-11", 134752.57,
    "2017-12", 116477.39,
    "2018-01", 78167.25,
    "2018-02", 75991.44,
    "2018-03", 42520.93,
    "2018-04", 70489.92,
    "2018-05", 121063.35,
    "2018-06", 76308.47,
    "2018-07", 118085.7,
    "2018-08", 96153.38,
    "2018-09", 82827.1,
    "2018-10", 109288.83,
    "2018-11", 145774.52,
    "2018-12", 141572.77,
    "2019-01", 123055.83,
    "2019-02", 104232.24,
    "2019-03", 435086.33,
    "2019-04", 74304.96,
    "2019-05", 117237.82,
    "2019-06", 82013.47,
    "2019-07", 99382.67,
    "2019-08", 138455.2,
    "2019-09", 97301.99,
    "2019-10", 137206.09,
    "2019-11", 109862.44,
    "2019-12", 118150.96,
    "2020-01", 140717.9,
    "2020-02", 127622.3,
    "2020-03", 134126.09
  ) %>%
  mutate(date = yearmonth(date)) %>%
  as_tsibble(index=date)

# Reproducing your plot

df %>% autoplot(sales_revenue_incl_credit) +
  scale_x_yearmonth(breaks=seq(1e3)) +
  theme_bw() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5), 
        panel.grid.minor = element_blank())

Created on 2020-06-19 by the reprex package (v0.3.0)
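
As a small sketch of why this version needs no "-01" suffix: yearmonth() accepts the year-month strings directly, and as.Date() should convert the result back to the first day of each month if an ordinary Date axis is preferred. The names ym and first_days are illustrative, and the sketch assumes the tsibble package is installed.

```r
library(tsibble)

# yearmonth() parses "YYYY-MM" strings without a day component
ym <- yearmonth(c("2017-07", "2017-08", "2017-09"))

# Converting back to Date gives the first day of each month
first_days <- as.Date(ym)

print(ym)
print(first_days)
```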
