How multiprocessing works, in a nutshell:
- Process() spawns (via fork or similar on Unix-like systems) a copy of the original program. (On Windows, which lacks a real fork, this is tricky and requires the special care described in the module documentation.)
- The copy communicates with the original to figure out that (a) it is a copy, and (b) it should go off and invoke the target= function (see below).
- At this point, the original and the copy are different and independent, and can run simultaneously.
Since they are independent processes, they have independent Global Interpreter Locks (in CPython), so both can use up to 100% of a CPU on a multi-CPU machine, as long as they do not contend for other lower-level (OS) resources. That is the "multiprocessing" part.
Of course, at some point you have to send data back and forth between these supposedly independent processes, for example to send results from one (or many) worker processes back to a "main" process. (There is the occasional exception where everyone is completely independent, but it is rare ... plus there is the whole start-up sequence itself, kicked off by p.start().) So each created Process instance (p, in the above example) has a communication channel to its parent creator and vice versa (it is a symmetric connection). The multiprocessing module uses the pickle module to turn data into byte strings, the same ones you can store in files with pickle.dump, and sends the data across the channel: "downwards" to workers to pass arguments and such, and "upwards" from workers to send back results.
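The channel described above is internal to the module, so the following sketch uses an explicit multiprocessing.Pipe as a stand-in to make the traffic visible. The names roundtrip and worker are hypothetical. Connection.send() pickles the object it is given, and recv() unpickles it on the other side.

```python
import pickle
from multiprocessing import Pipe, Process


def roundtrip(obj):
    """Pickle obj to bytes and back, as happens when it crosses processes."""
    return pickle.loads(pickle.dumps(obj))


def worker(conn):
    # "Downwards": receive the arguments the parent sent.
    numbers = conn.recv()
    # "Upwards": send the result back; send() pickles it for us.
    conn.send(sum(numbers))
    conn.close()


if __name__ == "__main__":
    parent_end, child_end = Pipe()  # two-way (duplex) by default
    p = Process(target=worker, args=(child_end,))
    p.start()
    parent_end.send([1, 2, 3])
    print("result from worker:", parent_end.recv())  # result from worker: 6
    p.join()
```

Anything sent this way has to be picklable, which is why objects such as lambdas or open sockets generally cannot be passed to a worker as-is.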
Eventually, once you are done getting results, the worker finishes (by returning from the target= function) and tells the parent it is done. To make sure everything gets closed and cleaned up, the parent should call p.join() to wait for the worker's "I am done" message (actually an OS-level exit on Unix-ish systems).
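A small sketch of that shutdown sequence, assuming a hypothetical target f that just sleeps. After join() returns, the Process object reports that the child is gone and exposes its exit status.

```python
import time
from multiprocessing import Process


def f(seconds):
    """Pretend to work for a while; returning ends the child process."""
    time.sleep(seconds)
    # The return value of a target is discarded by Process; results have
    # to travel back over a channel such as a Pipe or Queue instead.
    return seconds


if __name__ == "__main__":
    p = Process(target=f, args=(0.2,))
    p.start()
    p.join()  # blocks until the child has exited
    print("alive after join:", p.is_alive())  # alive after join: False
    print("exit code:", p.exitcode)           # exit code: 0
```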
The example is a little silly, since the two printed messages take basically no time at all, so running them "at the same time" has no measurable gain. But suppose that instead of just printing hello, f were to calculate the first 100,000 digits of π (3.14159...). You could then spawn another Process, p2, with a different target g that calculates the first 100,000 digits of e (2.71828...). These would run independently. The parent could then call p.join() and p2.join() to wait for both to complete (or spawn yet more workers to do more work and occupy more CPUs, or even go off and do its own work for a while first).
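That scenario might look roughly like the sketch below. The digit routines are illustrative choices, not something the answer prescribes: π uses Machin's formula and e uses the series of 1/k!, both in plain integer arithmetic. DIGITS is a hypothetical constant, kept far below 100,000 so the sketch runs quickly.

```python
from multiprocessing import Process

DIGITS = 1000   # hypothetical; far fewer than 100,000 to keep this fast
GUARD = 10      # extra digits to absorb integer truncation error


def arctan_inv(x, scale):
    """Return arctan(1/x) * scale via the Taylor series, using integers."""
    power = scale // x              # scale / x**(2k+1), starting at k = 0
    total, k, sign = 0, 0, 1
    while power:
        total += sign * (power // (2 * k + 1))
        power //= x * x
        k += 1
        sign = -sign
    return total


def pi_digits(n):
    """Return pi truncated to n decimal places, as a string."""
    scale = 10 ** (n + GUARD)
    # Machin's formula: pi = 16*arctan(1/5) - 4*arctan(1/239)
    pi = 16 * arctan_inv(5, scale) - 4 * arctan_inv(239, scale)
    s = str(pi // 10 ** GUARD)
    return s[0] + "." + s[1:]


def e_digits(n):
    """Return e truncated to n decimal places, as a string."""
    scale = 10 ** (n + GUARD)
    total, term, k = 0, scale, 0
    while term:
        total += term               # adds scale / k!
        k += 1
        term //= k
    s = str(total // 10 ** GUARD)
    return s[0] + "." + s[1:]


def f():
    print("pi starts with", pi_digits(DIGITS)[:12])


def g():
    print("e starts with", e_digits(DIGITS)[:12])


if __name__ == "__main__":
    p = Process(target=f)
    p2 = Process(target=g)
    p.start()
    p2.start()      # both workers are now running independently
    p.join()
    p2.join()       # wait for both to finish
```

The order of the two printed lines can vary from run to run, since neither worker waits for the other. Pushing DIGITS well past a few thousand may also require lifting CPython's integer-to-string conversion limit (sys.set_int_max_str_digits in recent versions).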
torek