This post compares the timings of various base R methods for this calculation. It was inspired by the comments on this post and by @josilber's comments on the fastest method, which was posted by Jake Burkhead.
Various methods are used below to calculate random walks. To do this, each function draws 1000 values of either 1 or -1, as defined in fnc below. The timing test uses microbenchmark with 1000 replications for each method.
    fnc <- function(n) sample(c(1L, -1L), n, replace=TRUE)

    library(microbenchmark)
    microbenchmark(all=cumsum(fnc(1000L)),
                   reduce=Reduce("+", fnc(1000L), accumulate=TRUE),
                   laplyRpCln=cumsum(unlist(lapply(rep.int(1L, 1000L), fnc))),
                   laplyRpAn=cumsum(unlist(lapply(rep.int(1L, 1000L), function(x) fnc(1L)))),
                   laplySqAn=cumsum(unlist(lapply(seq_len(1000L), function(x) fnc(1L)))),
                   saplyRpCln=cumsum(sapply(rep.int(1L, 1000L), fnc)),
                   saplyRpAn=cumsum(sapply(rep.int(1L, 1000L), function(x) fnc(1L))),
                   saplySqAn=cumsum(sapply(seq_len(1000L), function(x) fnc(1L))),
                   vaplyRpCln=cumsum(vapply(rep.int(1L, 1000L), fnc, FUN.VALUE=0)),
                   vaplyRpAn=cumsum(vapply(rep.int(1L, 1000L), function(x) fnc(1L), FUN.VALUE=0)),
                   vaplySqAn=cumsum(vapply(seq_len(1000L), function(x) fnc(1L), FUN.VALUE=0)),
                   replicate=cumsum(replicate(1000L, fnc(1L))),
                   forPre={vals <- numeric(1000L); for(i in seq_along(vals)) vals[i] <- fnc(1L); cumsum(vals)},
                   forNoPre={vals <- numeric(0L); for(i in seq_len(1000L)) vals <- c(vals, fnc(1L)); cumsum(vals)},
                   times=1000)
Here

- all uses Jake Burkhead's suggestion: it draws the whole sample at once and applies cumsum.
- reduce also draws the whole sample at once, but uses Reduce to perform the summation.
- laplyRpCln uses lapply and unlist to return a vector, iterating over 1000 instances of 1 and calling the function directly by name.
- laplyRpAn differs by using an anonymous function.
- laplySqAn uses an anonymous function and creates the iteration variable with seq_len rather than rep.int.
- saplyRpCln, saplyRpAn, and saplySqAn are the same as laplyRpCln, etc., except that sapply is called instead of lapply / unlist.
- vaplyRpCln, etc. are the same as laplyRpCln, etc., except that vapply is used instead of lapply / unlist.
- replicate is a call to replicate, which simplifies its result to a vector by default.
- forPre uses a for loop that preallocates a vector and fills it in.
- forNoPre uses a for loop that starts from an empty numeric(0) vector and then uses c to append to it.
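As a quick sanity check on the first two entries, here is a minimal sketch (not part of the original benchmark) confirming that cumsum and Reduce with accumulate=TRUE build the same walk from one vector of steps. The seed value and the names steps, walk_cumsum, and walk_reduce are assumptions chosen only for illustration.

```r
# Step generator as defined in the post.
fnc <- function(n) sample(c(1L, -1L), n, replace = TRUE)

set.seed(42)          # arbitrary seed, used only for reproducibility
steps <- fnc(1000L)   # draw all 1000 steps at once

walk_cumsum <- cumsum(steps)                          # vectorised running sum
walk_reduce <- Reduce(`+`, steps, accumulate = TRUE)  # same sum, one `+` call per element

# Both approaches should trace out the same path.
stopifnot(all(walk_cumsum == walk_reduce))
```

Since both calls return the same path, the gap between all and reduce in the timings presumably comes from how the summation is carried out rather than from the sampling.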
This returns
    Unit: microseconds
           expr      min         lq        mean     median         uq      max neval cld
            all   25.634    31.0705    85.66495    33.6890    35.3400 49240.30  1000 a
         reduce  542.073   646.7720   780.13592   696.4775   750.2025 51685.44  1000 b
     laplyRpCln 4349.384  5026.4015  6433.60754  5409.2485  7209.3405 58494.44  1000 ce
      laplyRpAn 4600.200  5281.6190  6513.58733  5682.0570  7488.0865 55239.04  1000 ce
      laplySqAn 4616.986  5251.4685  6514.09770  5634.9065  7488.1560 54263.04  1000 ce
     saplyRpCln 4362.324  5080.3970  6325.66531  5506.5330  7294.6225 59075.02  1000 cd
      saplyRpAn 4701.140  5386.1350  6781.95655  5786.6905  7587.8525 55429.02  1000 e
      saplySqAn 4651.682  5342.5390  6551.35939  5735.0610  7525.4725 55148.32  1000 ce
     vaplyRpCln 4366.322  5046.0625  6270.66501  5482.8565  7208.0680 63756.83  1000 c
      vaplyRpAn 4657.256  5347.2190  6724.35226  5818.5225  7580.3695 64513.37  1000 de
      vaplySqAn 4623.897  5325.6230  6475.97938  5769.8130  7541.3895 14614.67  1000 ce
      replicate 4722.540  5395.1420  6653.90306  5777.3045  7638.8085 59376.89  1000 ce
         forPre 5911.107  6823.3040  8172.41411  7226.7820  9038.9550 56119.11  1000 f
       forNoPre 8431.855 10584.6535 11401.64190 10910.0480 11267.5605 58566.27  1000 g
Note that the first method is by far the fastest. It is followed by drawing the complete sample at once and then using Reduce to perform the summation. Among the *apply functions, the "clean" versions that call the function by name appear to have a slight performance edge, and the lapply version appears to be on par with vapply, but given the range of values, this conclusion is not entirely clear-cut. sapply appears to be the slowest, although the method of function invocation matters more than the type of *apply function.
The two for loops performed the worst, and the preallocating for loop outperformed the for loop that grows the vector with c.
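The two loop variants can be looked at in isolation with a sketch like the one below. The helper names walk_prealloc and walk_grow are hypothetical and the seed is arbitrary; this illustrates the pattern and is not a rerun of the benchmark.

```r
# Step generator as defined in the post.
fnc <- function(n) sample(c(1L, -1L), n, replace = TRUE)

# Preallocate the full vector, then fill it in place (as in forPre).
walk_prealloc <- function(n) {
  vals <- numeric(n)
  for (i in seq_along(vals)) vals[i] <- fnc(1L)
  cumsum(vals)
}

# Start from an empty vector and append with c() on every iteration (as in forNoPre).
walk_grow <- function(n) {
  vals <- numeric(0L)
  for (i in seq_len(n)) vals <- c(vals, fnc(1L))
  cumsum(vals)
}

# With the same seed, both loops make the same sequence of draws.
set.seed(1); a <- walk_prealloc(1000L)
set.seed(1); b <- walk_grow(1000L)
stopifnot(identical(a, b))
```

Wrapping each call in system.time() with a larger n is one simple way to compare the two on your own machine; the size of the gap will likely depend on n and on the R version.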
These timings were run on a patched version of R 3.4.1 (patched around August 23, 2017) on openSUSE 42.1.
Please let me know if you see any errors and I will correct them as soon as I can. Thanks to Ben Bolker for pushing me to further investigate the final function, where I discovered a couple of errors.