I wrapped the code from the previous answers in functions that take the same input (a one-dimensional time series) and return the same result (a vector of days since the last n-day maximum):
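
    library(zoo)  # provides rollapply()

    # Rolling window in chronological order; which.max() gives the position
    # of the window maximum, counted from the oldest value.
    daysSinceHigh1 <- function(x, n) {
      as.vector(n - rollapply(x, n, which.max))
    }

    # embed() builds one row per window with the newest value first,
    # so which.max() - 1 is the number of days back to the maximum.
    daysSinceHigh2 <- function(x, n) {
      apply(embed(x, n), 1, which.max) - 1
    }
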
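
For readers less familiar with `embed`, here is a small sketch of why the second function subtracts 1. The toy series `x` and the name `lag_to_high` are made up for illustration and are not part of the earlier answers.

```r
# Illustration only: a short toy series, not the S&P 500 data.
x <- c(10, 30, 20, 40, 15, 5)

# One row per 3-day window; column 1 is the newest value,
# column 2 the day before, and so on.
m <- embed(x, 3)
m

# Position of the row maximum, minus 1, is the number of days
# between the window's last day and its maximum.
lag_to_high <- apply(m, 1, which.max) - 1
lag_to_high  # 1 0 1 2
```

Because each row is ordered newest first, a maximum in column 1 means the high was set on the current day, which gives a lag of 0.
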
The second function is faster, but the two give slightly different results:

    > getSymbols("^GSPC",from='01-01-1900')
    [1] "GSPC"
    > system.time(x <- daysSinceHigh1(Cl(GSPC), 200))
       user  system elapsed
       0.42    0.00    0.42
    > system.time(y <- daysSinceHigh2(Cl(GSPC), 200))
       user  system elapsed
       0.24    0.00    0.24
    > all.equal(x,y)
    [1] "Mean relative difference: 0.005025126"

On closer examination, the first function mishandles ties, that is, windows in which the maximum value occurs more than once:

    data <- c(1,2,3,4,5,6,7,7,6,5,6,7,8,5,4,3,2,1)
    answer <- c(0,0,0,0,1,2,3,0,0,1,2,3,4,4)
    x <- daysSinceHigh1(data, 5)
    y <- daysSinceHigh2(data, 5)

    > x
     [1] 0 0 0 1 2 3 4 4 0 1 2 3 4 4
    > y
     [1] 0 0 0 0 1 2 3 0 0 1 2 3 4 4
    > answer
     [1] 0 0 0 0 1 2 3 0 0 1 2 3 4 4
    > all.equal(x,answer)
    [1] "Mean relative difference: 0.5714286"
    > all.equal(y,answer)
    [1] TRUE

Therefore, the second function (based on Andrie's code) appears to be the better choice.
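
The discrepancy comes from how `which.max` treats repeated values: it returns the position of the first maximum. In a chronologically ordered window that is the oldest occurrence, whereas in the newest-first rows produced by `embed` it is the most recent one. The sketch below shows this on one window of the test data and, as one possible fix, reverses the window before calling `which.max`. The names `lagToLastMax` and `daysSinceHigh3` are hypothetical and not from the earlier answers, and the rolling loop is written in base R only so that it runs without zoo.

```r
# Fourth window of the test data: the maximum 7 occurs twice.
w <- c(4, 5, 6, 7, 7)
n <- length(w)

n - which.max(w)        # 1: counts back to the first (older) 7
which.max(rev(w)) - 1   # 0: counts back to the most recent 7

# Hypothetical helper: days since the most recent maximum in a
# chronologically ordered window.
lagToLastMax <- function(w) which.max(rev(w)) - 1

# Base-R rolling version of the same idea (assumes length(x) >= n).
daysSinceHigh3 <- function(x, n) {
  x <- as.numeric(x)
  starts <- seq_len(length(x) - n + 1)
  vapply(starts, function(i) lagToLastMax(x[i:(i + n - 1)]), numeric(1))
}

data <- c(1,2,3,4,5,6,7,7,6,5,6,7,8,5,4,3,2,1)
daysSinceHigh3(data, 5)  # 0 0 0 0 1 2 3 0 0 1 2 3 4 4
```

The same helper could presumably be passed to `rollapply` in place of `which.max` to make the first function tie-safe, though I have not timed that variant.
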