I have a list of fairly large objects that I want to apply a complicated function to in parallel, but my current method uses too much memory. I thought Reference Classes might help, but using mclapply to modify them does not seem to work.
The function modifies the object itself, so I overwrite the original object with the new one. Since the object is a list and I only modify a small part of it, I was hoping that R's copy-on-modify semantics would avoid making multiple copies; however, when I run it, that does not seem to be the case for what I am doing. Here is a small example of the base R methods I have been using. It correctly resets the balance to zero.
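A minimal sketch of this base R approach, assuming each object is a plain list with a `balance` element (the names `foo` and `reset` are illustrative, and the real objects are much larger):

```r
# Hypothetical stand-in for the large objects: plain lists with a balance element.
foo <- lapply(1:5, function(x) list(balance = x))

# The function returns a modified copy, which then replaces the original.
reset <- function(acct) {
  acct$balance <- 0
  acct
}
foo <- lapply(foo, reset)

foo[[4]]$balance
```

Because `reset` returns the modified list, the result has to be assigned back to `foo` for the change to persist.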
It seems that using Reference Classes might help, as they are mutable, and when using lapply it does what I expect; the balance is reset to zero.
    Account <- setRefClass("Account",
      fields = list(balance = "numeric"),
      methods = list(reset = function() { balance <<- 0 }))
    foo <- lapply(1:5, function(x) Account$new(balance = x))
    invisible(lapply(foo, function(x) x$reset()))  # reset every account in place
    foo[[4]]$balance
But when I use mclapply, the balance is not reset. Note that if you are on Windows or have mc.cores=1, lapply will be called instead.
    library(parallel)
    foo <- lapply(1:5, function(x) Account$new(balance = x))
    invisible(mclapply(foo, function(x) x$reset()))
    foo[[4]]$balance
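As a quick diagnostic, here is a sketch that compares the balance each worker sees right after calling `reset` with the balance left in the parent session. It assumes a Unix-alike where mclapply forks; the variable names `cores`, `seen_in_worker` and `seen_in_parent` are only illustrative.

```r
library(parallel)
library(methods)

Account <- setRefClass("Account",
  fields = list(balance = "numeric"),
  methods = list(reset = function() { balance <<- 0 }))

foo <- lapply(1:5, function(x) Account$new(balance = x))

# Forking is not available on Windows, so use a single core there.
cores <- if (.Platform$OS.type == "windows") 1L else 2L

# Each worker resets its object and reports the balance it sees afterwards.
seen_in_worker <- unlist(mclapply(foo, function(x) {
  x$reset()
  x$balance
}, mc.cores = cores))

# Balances of the same objects as seen from the parent session.
seen_in_parent <- vapply(foo, function(x) x$balance, numeric(1))

print(seen_in_worker)
print(seen_in_parent)
```

With more than one core, the workers report a balance of zero while the parent's objects keep their original balances, so the reset appears to happen only on the copy held by each forked process.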
What's happening? How can I work with Reference Classes in parallel? Is there a better way to avoid unnecessary copying of objects?
Aaron