Using Multithreaded Packages in R

I need to multi-thread my R application: it takes about 5 minutes to run and uses only about 15% of the computer's available CPU.

An example of a step that takes a long time is computing the mean of each layer n in a very large raster stack:

mean = cellStats(raster_layers[[n]], stat='mean', na.rm=TRUE)

Using the parallel library, I can create a new cluster and pass a function to it:

cl <- makeCluster(8, type = "SOCK")
parLapply(cl, raster_layers, mean_function)
stopCluster(cl)

where mean_function is the averaging function:

mean_function <- function(raster_object)
{
result = cellStats(raster_object, stat='mean', na.rm=TRUE)
return(result)
}

This method almost works, except that the worker nodes cannot see the raster package that cellStats comes from, so each node fails with an error saying it cannot find the function cellStats. I tried loading the library inside the function, but that does not help.
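
One way to confirm what the workers actually have loaded is to ask each node for its search path. This is just a diagnostic sketch using the parallel package's default PSOCK cluster; on a freshly created cluster only the base packages are normally attached, so raster is missing:

library(parallel)

cl <- makeCluster(2)           # default PSOCK workers
clusterEvalQ(cl, search())     # typically only base packages, no "package:raster"
stopCluster(cl)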

The raster package is loaded in my main R session, but it does not seem to be available to the worker processes that parallel creates.

So how do I make the required libraries (and cellStats) available to each R node? Or is there something else I should be doing?

I figured it out: the library has to be loaded on each node in the cluster. Here is the working code:

mean_function <- function(variable)
{
result = cellStats(variable, stat='mean', na.rm=TRUE)
return(result)
}

cl <- makeCluster(procs, type = "SOCK")
clusterEvalQ(cl, library(raster))   # load the raster package on every node
result = parLapply(cl, a_list, mean_function)
stopCluster(cl)

Here, procs is the number of processors to use, which I choose based on the number of items to process (the length of a_list).

a_list is a list of raster objects, and each element is passed to cellStats on a worker node; parLapply splits the work in a_list across the procs nodes.
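
Putting it all together, here is a minimal self-contained sketch of the whole workflow. The input file name and the use of unstack() and detectCores() are illustrative assumptions on my part, not requirements:

library(parallel)
library(raster)

# Hypothetical input: split a large multi-layer stack into a list of layers
raster_layers <- unstack(stack("large_stack.tif"))   # file name is illustrative
a_list <- raster_layers

# At most one worker per layer, leaving a core free for the rest of the system
procs <- min(length(a_list), max(1, detectCores() - 1))

mean_function <- function(variable)
{
result = cellStats(variable, stat='mean', na.rm=TRUE)
return(result)
}

cl <- makeCluster(procs)              # default PSOCK cluster; type = "SOCK" also works if snow is installed
clusterEvalQ(cl, library(raster))     # load raster on every worker node
result = parLapply(cl, a_list, mean_function)
stopCluster(cl)

layer_means <- unlist(result)         # one mean value per layer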
