1

I am trying to use the lag function from the dplyr package. However, when I use a lag greater than 0, I want the resulting missing values to be replaced by the first value in x. How can I achieve this?

library(dplyr)
x<-c(1,2,3,4)
z<-lag(x,2)
z
## [1] NA NA  1  2
4
  • Maybe z[is.na(z)] <- x[1]? Commented Feb 18, 2015 at 8:39
  • @DavidArenburg - yes it does: > stats::lag(1:10, 2) [1] NA NA 1 2 3 4 5 6 7 8 Commented Feb 18, 2015 at 8:55
  • @DavidArenburg you are right... so isn't stats::lag calling the stats-lag?! Commented Feb 18, 2015 at 9:01
  • 3
    @Tim it seems like some bug in dplyr's lag; this seems related Commented Feb 18, 2015 at 9:07

3 Answers 3

5

Since you are using the lag function from dplyr, you can use its default argument, which sets the value used to fill the leading positions. Here you can specify x[1] as the default:

lag(x, 2, default=x[1])
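
As a minimal sketch of the same idea, the call can be written with an explicit dplyr:: prefix so that stats::lag (attached by default in base R) cannot be picked up by mistake. The vector x is the one from the question; spelling out the argument name n is just for readability.

```r
library(dplyr)

x <- c(1, 2, 3, 4)

# Pad the two leading positions with the first element of x instead of NA.
z <- dplyr::lag(x, n = 2, default = x[1])
z
## [1] 1 1 1 2
```

The prefix matters only when it is unclear which lag is on the search path; with dplyr loaded, plain lag(x, 2, default = x[1]) behaves the same way.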

3

Here's a modified function mylag:

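library(dplyr)  # lag() below must be dplyr's lag, not stats::lag

# Lag x by k and overwrite the first k positions with x[1].
# seq_len(k) (rather than seq(k)) also behaves correctly for k = 0.
mylag <- function(x, k = 1, ...)
  replace(lag(x, k, ...), seq_len(k), x[1])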

x <- 1:4
mylag(x, k = 2)
# [1] 1 1 1 2

2 Comments

It gives me [1] 1 1 3 4.
@Pascal This is because Sven is also using lag from the dplyr package. Welcome to the Hadleyverse...
0

May I suggest adapting the function so that it works both ways: for lag and lead (positive AND negative lags).

shift = function(x, lag, fill=FALSE) {
  require(dplyr)
  switch(sign(lag)/2+1.5, 
         lead( x, n=abs(lag), default=switch(fill+1, NA, tail(x, 1))  ), 
         lag(  x, n=abs(lag), default=switch(fill+1, NA, head(x, 1))  )
  )
}

It has a "fill" argument that, when TRUE, fills the gaps with the first or last value of x, depending on the sign of the lag.

> shift(1:10, -1)
#### [1]  2  3  4  5  6  7  8  9 10 NA
> shift(1:10, +1, fill=TRUE)
#### [1] 1 1 2 3 4 5 6 7 8 9
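
For readers who find the nested switch calls hard to follow, here is a sketch of the same logic written with plain if/else. The name shift2 and the argument name n are hypothetical, chosen only for this illustration; the behavior is meant to mirror shift above (negative values lead, non-negative values lag), not to replace it.

```r
library(dplyr)

# shift2 is a hypothetical name used only for this illustration.
# n >= 0 lags x (optionally filling with the first value);
# n <  0 leads x (optionally filling with the last value).
shift2 <- function(x, n, fill = FALSE) {
  if (n >= 0) {
    dplyr::lag(x, n = n, default = if (fill) x[1] else NA)
  } else {
    dplyr::lead(x, n = -n, default = if (fill) x[length(x)] else NA)
  }
}

shift2(1:10, -1)
shift2(1:10, 1, fill = TRUE)
```

Using n instead of lag as the argument name also avoids having a variable called lag inside a function that calls lag().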
