How to vectorize triple nested loops?

I did a search for similar problems, and I have a vague idea of ​​what to do: to vectorize everything or use the apply() family. But I'm new to R programming, and both of these methods are pretty confusing.

Here is my source code:

 x<-rlnorm(100,0,1.6) j=0 k=0 i=0 h=0 lambda<-rep(0,200) sum1<-rep(0,200) constjk=0 wj=0 wk=0 for (h in 1:200) { lambda[h]=2+h/12.5 N=ceiling(lambda[h]*max(x)) for (j in 0:N) { wj=(sum(x<=(j+1)/lambda[h])-sum(x<=j/lambda[h]))/100 for (k in 0:N) { constjk=dbinom(k, j + k, 0.5) wk=(sum(x<=(k+1)/lambda[h])-sum(x<=k/lambda[h]))/100 sum1[h]=sum1[h]+(lambda[h]/2)*constjk*wk*wj } } } 

Let me clarify a bit. I want to collect 200 values ​​of sum1 (this is the first cycle), and for each value of sum1 this is a summation (lambda[h]/2)*constjk*wk*wj , so there are two other cycles. The most tedious thing is that N changes with h, so I have no idea how to vectorize the j-cycle and k-cycle. But of course, I can vectorize the h-loop with lambda<-seq() and N<-ceiling() , and this is the best I can do. Is there a way to simplify the code?

+7
source share
2 answers

Your code can be completely redesigned with 3 nested sapply calls. It may be a little difficult to read for the unprepared eye, but the gist of it is that instead of adding one value at a time to sum1[h] , we calculate all the terms created by the innermost loop at a time and sum them up.

Although this vectorized solution is faster than your triplex for loop, the improvement is not dramatic. If you plan to use it many times, I suggest you implement it in C or Fortran (with regular for loops), which greatly improves speed. Beware, however, that it has high temporal complexity and will deteriorate greatly with increased lambda values, ultimately reaching the point where it is impossible to calculate in a reasonable amount of time, regardless of implementation.

 lambda <- 2 + 1:200/12.5 sum1 <- sapply(lambda, function(l){ N <- ceiling(l*max(x)) sum(sapply(0:N, function(j){ wj <- (sum(x <= (j+1)/l) - sum(x <= j/l))/100 sum(sapply(0:N, function(k){ constjk <- dbinom(k, j + k, 0.5) wk <- (sum(x <= (k+1)/l) - sum(x <= k/l))/100 l/2*constjk*wk*wj })) })) }) 

Btw, you don’t need to predefine variables like h , j , k , wj and wk . Moreover, not during vectorization, since assigning them inside functions passed to sapply will create superimposed local variables with the same name (i.e., ignoring those that you predetermined).

+5
source

Let's finish your simulation in function and time:

 sim1 <- function(num=20){ set.seed(42) x<-rlnorm(100,0,1.6) j=0 k=0 i=0 h=0 lambda<-rep(0,num) sum1<-rep(0,num) constjk=0 wj=0 wk=0 for (h in 1:num) { lambda[h]=2+h/12.5 N=ceiling(lambda[h]*max(x)) for (j in 0:N) { wj=(sum(x<=(j+1)/lambda[h])-sum(x<=j/lambda[h]))/100 for (k in 0:N) { set.seed(42) constjk=dbinom(k, j + k, 0.5) wk=(sum(x<=(k+1)/lambda[h])-sum(x<=k/lambda[h]))/100 sum1[h]=sum1[h]+(lambda[h]/2)*constjk*wk*wj } } } sum1 } system.time(res1 <- sim1()) # user system elapsed # 5.4 0.0 5.4 

Now do it faster:

 sim2 <- function(num=20){ set.seed(42) #to make it reproducible x <- rlnorm(100,0,1.6) h <- 1:num sum1 <- numeric(num) lambda <- 2+1:num/12.5 N <- ceiling(lambda*max(x)) #functions for wj and wk wjfun <- function(x,j,lambda,h){ (sum(x<=(j+1)/lambda[h])-sum(x<=j/lambda[h]))/100 } wkfun <- function(x,k,lambda,h){ (sum(x<=(k+1)/lambda[h])-sum(x<=k/lambda[h]))/100 } #function to calculate values of sum1 fun1 <- function(N,h,x,lambda) { sum1 <- 0 set.seed(42) #to make it reproducible #calculate constants using outer const <- outer(0:N[h],0:N[h],FUN=function(j,k) dbinom(k, j + k, 0.5)) wk <- numeric(N[h]+1) #loop only once to calculate wk for (k in 0:N[h]){ wk[k+1] <- (sum(x<=(k+1)/lambda[h])-sum(x<=k/lambda[h]))/100 } for (j in 0:N[h]) { wj <- (sum(x<=(j+1)/lambda[h])-sum(x<=j/lambda[h]))/100 for (k in 0:N[h]) { sum1 <- sum1+(lambda[h]/2)*const[j+1,k+1]*wk[k+1]*wj } } sum1 } for (h in 1:num) { sum1[h] <- fun1(N,h,x,lambda) } sum1 } system.time(res2 <- sim2()) #user system elapsed #1.25 0.00 1.25 all.equal(res1,res2) #[1] TRUE 

Timing for @Backlin code (with 20 interations) to compare:

  user system elapsed 3.30 0.00 3.29 

If it is still too slow, and you cannot or do not want to use another language, there is also the possibility of parallelization. As far as I can see, the outer loop is awkwardly parallel. There are some nice and simple packages for parallelization.

+2
source

All Articles