Seed control with mclapply

Suppose I am running a series of steps and want to set a single seed once, at the beginning of the program. For example:

    mylist <- list( as.list(rep(NA,3)), as.list(rep(NA,3)) )

    foo <- function(x){
      for(i in 1:length(x)){
        x[[i]] <- sample(100,1)
      }
      return(x)
    }

    # start block
    set.seed(1)
    l1 <- lapply(mylist, foo)
    l2 <- lapply(mylist, foo)
    # end

Of course, within the block l1 and l2 are different, but if I run the block again, l1 will be the same as before and l2 will be the same as before.

Now suppose foo is time-consuming, so I want to use mclapply instead of lapply:

    library(parallel)

    # start block
    set.seed(1)
    mclapply(mylist, foo, mc.cores = 3)
    mclapply(mylist, foo, mc.cores = 3)
    # end

If I run this block again, I get different results. How can I get the same behavior as setting one common seed with lapply, but using mclapply? I looked at the mclapply documentation, but I'm not sure, because using:

    set.seed(1)
    l1 <- mclapply(mylist, foo, mc.cores = 3, mc.set.seed = FALSE)
    l2 <- mclapply(mylist, foo, mc.cores = 3, mc.set.seed = FALSE)

makes l1 and l2 identical, which is not what I want.

1 answer

The parallel package comes with special support for the L'Ecuyer-CMRG random number generator, which was introduced at the same time as parallel. You can read the documentation for this support using:

    library(parallel)
    ?mc.reset.stream

To use it, you first need to enable "L'Ecuyer-CMRG":

    RNGkind("L'Ecuyer-CMRG")

After doing that, code such as:

    set.seed(1)
    mclapply(mylist, foo, mc.cores=3)
    mclapply(mylist, foo, mc.cores=3)

will be reproducible, but the two calls to mclapply will return identical results. This is because calling mclapply does not change the state of the random number generator in the main process.
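
To see this for yourself, here is a minimal sketch that records .Random.seed in the main process before and after a call. It assumes a Unix-alike system where forking is available (mc.cores > 1 is not supported on Windows) and reuses mylist and foo from the question. The names seed_before and seed_after are only illustrative. If the explanation above holds, both comparisons at the end should come out TRUE.

```r
library(parallel)

RNGkind("L'Ecuyer-CMRG")

# Same toy inputs as in the question
mylist <- list(as.list(rep(NA, 3)), as.list(rep(NA, 3)))
foo <- function(x) {
  for (i in seq_along(x)) x[[i]] <- sample(100, 1)
  x
}

set.seed(1)
seed_before <- .Random.seed   # main-process RNG state before any parallel call
l1 <- mclapply(mylist, foo, mc.cores = 3)
seed_after <- .Random.seed    # main-process RNG state after the first call
l2 <- mclapply(mylist, foo, mc.cores = 3)

# Did the main-process state change, and do the two results coincide?
identical(seed_before, seed_after)
identical(l1, l2)
```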

I have used the following function to skip over the random number streams used by the mclapply workers:

    skip.streams <- function(n) {
      x <- .Random.seed
      for (i in seq_len(n))
        x <- nextRNGStream(x)
      assign('.Random.seed', x, pos=.GlobalEnv)
    }
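
The building block here is nextRNGStream from parallel, which takes an L'Ecuyer-CMRG seed vector and returns the seed of the following stream. The short sketch below needs no forking and just shows that each call yields a new seed and that the step is deterministic. The names s0, s1 and s2 are only illustrative.

```r
library(parallel)

RNGkind("L'Ecuyer-CMRG")
set.seed(1)

s0 <- .Random.seed        # current seed vector of the main process
s1 <- nextRNGStream(s0)   # seed of the next stream
s2 <- nextRNGStream(s1)   # seed of the stream after that

# Each step gives a different seed ...
identical(s0, s1)
identical(s1, s2)
# ... and repeating the same step gives the same seed again
identical(s1, nextRNGStream(s0))
```

skip.streams(n) simply applies this step n times and writes the result back to .Random.seed in the global environment.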

You can use this function to get the behavior that I think you need:

    set.seed(1)
    mclapply(mylist, foo, mc.cores=3)
    skip.streams(3)
    mclapply(mylist, foo, mc.cores=3)
    skip.streams(3)
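
As a rough way to check the result, the sketch below runs the whole seeded block twice and compares the outputs. It is self-contained, assumes a Unix-alike system where forking works, and wraps the block in a hypothetical helper named run_block that is not part of the original code. The first comparison tests reproducibility across runs, the second tests whether l1 and l2 differ within a run.

```r
library(parallel)

RNGkind("L'Ecuyer-CMRG")

mylist <- list(as.list(rep(NA, 3)), as.list(rep(NA, 3)))
foo <- function(x) {
  for (i in seq_along(x)) x[[i]] <- sample(100, 1)
  x
}

# Advance the main-process seed past n worker streams
skip.streams <- function(n) {
  x <- .Random.seed
  for (i in seq_len(n)) x <- nextRNGStream(x)
  assign(".Random.seed", x, pos = .GlobalEnv)
}

# Hypothetical helper: runs the seeded block once and returns both results
run_block <- function() {
  set.seed(1)
  l1 <- mclapply(mylist, foo, mc.cores = 3)
  skip.streams(3)
  l2 <- mclapply(mylist, foo, mc.cores = 3)
  skip.streams(3)
  list(l1 = l1, l2 = l2)
}

first  <- run_block()
second <- run_block()

identical(first, second)         # same results on every run?
identical(first$l1, first$l2)    # do the two calls within a run coincide?
```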