Passing a function that takes class member functions as arguments to Python multiprocessing Pool.map()

Hi, I have struggled with this for most of the morning and hope that someone can point me in the right direction.

This is the code that I have at the moment:

    from multiprocessing import Pool

    def f(tup):
        # Unpack the argument tuple and forward it to the real function
        return some_complex_function(*tup)

    def main():
        pool = Pool(processes=4)
        # import and process data omitted
        _args = [(x.some_func1, .05, x.some_func2) for x in list_of_some_class]
        results = pool.map(f, _args)
        print results

The first error I get:

    Exception in thread Thread-2:
    Traceback (most recent call last):
      File "/usr/lib/python2.7/threading.py", line 551, in __bootstrap_inner
        self.run()
      File "/usr/lib/python2.7/threading.py", line 504, in run
        self.__target(*self.__args, **self.__kwargs)
      File "/usr/lib/python2.7/multiprocessing/pool.py", line 319, in _handle_tasks
        put(task)
    PicklingError: Can't pickle <type 'instancemethod'>: attribute lookup __builtin__.instancemethod failed

Any help would be greatly appreciated.

2 answers

The multiprocessing module uses the pickle module to serialize the arguments passed to the function (f), which is executed in another process.

Many of the built-in types can be pickled, but in Python 2.7 instance methods cannot. So .05 is fine, but x.some_func1 is not. See "What can be pickled and unpickled?" in the pickle documentation for more details.
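To find out which element of an argument tuple is the problem, each item can be passed to pickle.dumps on its own. The following is only a minimal sketch; the helper names can_pickle and unpicklable_items are made up for illustration and are not part of the pickle or multiprocessing modules.

```python
import pickle


def can_pickle(obj):
    """Return True if pickle can serialize obj, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, TypeError, AttributeError):
        # pickle reports unsupported objects with different exception
        # types depending on the object and the Python version.
        return False


def unpicklable_items(tup):
    """Return the positions in an argument tuple that pickle rejects."""
    return [i for i, item in enumerate(tup) if not can_pickle(item)]
```

Given the traceback in the question, calling unpicklable_items on one of the _args tuples should point at the two method entries under Python 2.7. Results depend on the Python version: Python 3 can generally pickle bound methods of picklable instances.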

There is no easy solution. You will need to restructure your program so that instance methods are not passed as arguments (or avoid using multiprocessing).
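One possible restructuring, sketched under assumptions: send the instance and the method names (plain strings) to the worker, and look the methods up there with getattr. The class Sample and the body of some_complex_function below are hypothetical stand-ins, since the question shows neither, and the approach only works if the instances themselves can be pickled.

```python
from multiprocessing import Pool


class Sample(object):
    """Hypothetical stand-in for the classes in list_of_some_class."""

    def __init__(self, scale):
        self.scale = scale

    def some_func1(self, value):
        return self.scale * value

    def some_func2(self, value):
        return self.scale + value


def some_complex_function(func1, value, func2):
    # Placeholder body; the real function is not shown in the question.
    return func1(value) + func2(value)


def f(tup):
    # Runs in the worker process: rebuild the bound methods from the
    # instance and the method names, then call the original function.
    obj, name1, value, name2 = tup
    return some_complex_function(getattr(obj, name1), value, getattr(obj, name2))


def run_in_pool(objects, processes=4):
    # Only picklable items are sent: an instance, two strings, a float.
    _args = [(x, "some_func1", .05, "some_func2") for x in objects]
    pool = Pool(processes=processes)
    try:
        return pool.map(f, _args)
    finally:
        pool.close()
        pool.join()
```

run_in_pool would be called from main(), ideally under an `if __name__ == "__main__":` guard so that worker processes do not re-run it on platforms that start workers by importing the main module.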


If you use a fork of multiprocessing called pathos.multiprocessing, you can use classes and class methods directly in multiprocessing's map functions. This is because dill is used instead of pickle or cPickle, and dill can serialize almost anything in Python.

pathos.multiprocessing also provides an asynchronous map function, and it can map functions with several arguments (for example, map(math.pow, [1,2,3], [4,5,6])).

See: What can multiprocessing and dill do?

and: http://matthewrocklin.com/blog/work/2013/12/05/Parallelism-and-Serialization/

    >>> from pathos.multiprocessing import ProcessingPool as Pool
    >>>
    >>> p = Pool(4)
    >>>
    >>> def add(x,y):
    ...   return x+y
    ...
    >>> x = [0,1,2,3]
    >>> y = [4,5,6,7]
    >>>
    >>> p.map(add, x, y)
    [4, 6, 8, 10]
    >>>
    >>> class Test(object):
    ...   def plus(self, x, y):
    ...     return x+y
    ...
    >>> t = Test()
    >>>
    >>> p.map(Test.plus, [t]*4, x, y)
    [4, 6, 8, 10]
    >>>
    >>> p.map(t.plus, x, y)
    [4, 6, 8, 10]

Get the code here: https://github.com/uqfoundation/pathos
