Python how to multiprocess inside a class?

I have a code structure that looks like this:

    class A:
        def __init__(self):
            processes = []
            for i in range(1000):
                p = Process(target=self.RunProcess, args=i)
                processes.append[p]
            # Start all processes
            [x.start() for x in processes]

        def RunProcess(self, i):
            # do something with i...

Main script:

 myA = A() 

I can't get this to work. I get a runtime error: "An attempt was made to start a new process before the current process has completed its bootstrapping phase."

How do I get multiple processes to work? If I use threading, it works fine, but it's slower than running sequentially... I also fear that multiprocessing will be slow, since creating a process takes longer than creating a thread.

Any good tips? Thank you very much in advance.

+5
3 answers

There are a couple of syntax issues that I see in your code:

  • args in Process expects a tuple, but you are passing an integer. Change line 5 to:

    p = Process(target=self.RunProcess, args=(i,))

  • list.append is a method, so the argument passed to it should be enclosed in (), not []. Change line 6 to:

    processes.append(p)

As @qarma points out, it's not good practice to start processes in the class constructor. I would structure the code as follows (adapting your example):

    import multiprocessing as mp
    from time import sleep

    class A(object):
        def __init__(self, *args, **kwargs):
            # do other stuff
            pass

        def do_something(self, i):
            sleep(0.2)
            print('%s * %s = %s' % (i, i, i*i))

        def run(self):
            processes = []
            for i in range(1000):
                p = mp.Process(target=self.do_something, args=(i,))
                processes.append(p)
            [x.start() for x in processes]

    if __name__ == '__main__':
        a = A()
        a.run()
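One detail worth adding to the sketch above (my addition, not part of the original answer): without join(), run() returns while the workers are still going. A hedged variant that waits for all workers, using a module-level function and a smaller job count for illustration:

```python
import multiprocessing as mp

def do_something(i):
    # placeholder for real work on i
    return i

def run(n=8):
    procs = [mp.Process(target=do_something, args=(i,)) for i in range(n)]
    [p.start() for p in procs]
    [p.join() for p in procs]      # block until every worker has exited
    # exitcode 0 means the worker finished without an exception
    return all(p.exitcode == 0 for p in procs)

if __name__ == '__main__':
    print(run())  # prints True
```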
+3

It should be easier for you to use a Pool for this. As for speed: starting processes does take time. However, using a Pool, as opposed to starting njobs independent Process instances, should be about as fast as you can get working with processes. By default (as used below), a Pool uses the maximum number of available processes (i.e. the number of CPUs you have), and feeds a new job to a worker as soon as its previous task completes. You won't get njobs-way parallelism, but you'll get as much parallelism as your CPUs can handle without oversubscribing them. I use pathos, which has a fork of multiprocessing, because it's a bit more robust than standard multiprocessing... and, well, I'm also the author. But you could use multiprocessing for this.

    >>> from pathos.multiprocessing import ProcessingPool as Pool
    >>> class A(object):
    ...     def __init__(self, njobs=1000):
    ...         self.map = Pool().map
    ...         self.njobs = njobs
    ...         self.start()
    ...     def start(self):
    ...         self.result = self.map(self.RunProcess, range(self.njobs))
    ...         return self.result
    ...     def RunProcess(self, i):
    ...         return i*i
    ...
    >>> myA = A()
    >>> myA.result[:11]
    [0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
    >>> myA.njobs = 3
    >>> myA.start()
    [0, 1, 4]

It's a bit of an odd construct to run a Pool inside __init__. But if you want to do it, you need to retrieve results from something like self.result, and you can use self.start for subsequent calls.
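Since the answer notes that standard multiprocessing could also be used, here is a rough stdlib sketch of the same idea (my adaptation, not the author's code; the class and method names mirror the pathos session above, and the smaller default njobs is mine). Note this relies on Python 3, where bound methods are picklable, and on the class being defined at module top level:

```python
from multiprocessing import Pool

class A(object):
    def __init__(self, njobs=10):
        self.njobs = njobs

    def start(self):
        # Pool() defaults to one worker per CPU; the context manager
        # cleans up the worker processes afterwards.
        with Pool() as pool:
            self.result = pool.map(self.RunProcess, range(self.njobs))
        return self.result

    def RunProcess(self, i):
        return i * i

if __name__ == '__main__':
    myA = A(njobs=5)
    print(myA.start())  # prints [0, 1, 4, 9, 16]
```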

Get pathos here: https://github.com/uqfoundation

+1

A practical approach is to break up your class, e.g.:

    class A:
        def __init__(self, ...):
            pass

        def compute(self):
            procs = [Process(target=self.run, ...) for ... in ...]
            [p.start() for p in procs]
            [p.join() for p in procs]

        def run(self, ...):
            pass

    pool = A(...)
    pool.compute()

When you fork a process inside __init__, the instance self may not be fully initialized yet, so it's odd to ask the subprocess to execute self.run, although technically, yes, it's possible.

If that's not your problem, then it looks like an instance of this issue:

http://bugs.python.org/issue11240
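For reference, the bootstrap error in the question is what you typically see on platforms that spawn rather than fork (e.g. Windows): the module is re-imported in each child, so process creation must sit behind an if __name__ == '__main__' guard. A minimal sketch of that pattern (the run/compute names and the Queue-based result collection are mine, for illustration):

```python
from multiprocessing import Process, Queue

def run(i, q):
    # each worker reports its result back through the shared queue
    q.put(i * i)

def compute(n):
    q = Queue()
    procs = [Process(target=run, args=(i, q)) for i in range(n)]
    [p.start() for p in procs]
    # drain the queue before joining, to avoid blocking on a full pipe
    results = sorted(q.get() for _ in procs)
    [p.join() for p in procs]
    return results

if __name__ == '__main__':
    # guarded, so the spawned children don't re-run this on import
    print(compute(4))  # prints [0, 1, 4, 9]
```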

0

Source: https://habr.com/ru/post/1215222/
