I do some heavy calculations on three different two-dimensional NumPy arrays, one after another (in series). The arrays are huge, 25000x25000 each. Each calculation takes considerable time, so I decided to run the three of them in parallel on three processor cores of the server. I follow the standard multiprocessing approach: I create two processes and a worker function. Two of the calculations are performed in the two processes, and the third is performed locally in the main process, without a separate process. I pass the huge arrays to the processes as arguments, for example:
p1 = Process(target=Worker, args=(queue1, array1, ...))
The worker function sends back two NumPy vectors (1-D arrays) by putting them on the queue as a list, for example:
queue.put([v1, v2])
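To make the structure concrete, here is a minimal sketch of the setup described above, assuming hypothetical names (Worker, array1, array2, array3) and a placeholder calculation, since the real code is not shown:

    import numpy as np
    from multiprocessing import Process, Queue

    def Worker(queue, array):
        # placeholder calculation; the real one produces two 1-D vectors
        v1 = array.sum(axis=0)
        v2 = array.sum(axis=1)
        queue.put([v1, v2])

    if __name__ == '__main__':
        # each 25000x25000 float64 array is roughly 5 GB
        array1 = np.random.rand(25000, 25000)
        array2 = np.random.rand(25000, 25000)
        array3 = np.random.rand(25000, 25000)

        queue1, queue2 = Queue(), Queue()
        p1 = Process(target=Worker, args=(queue1, array1))
        p2 = Process(target=Worker, args=(queue2, array2))
        p1.start()
        p2.start()

        # the third calculation runs locally in the main process
        v5, v6 = array3.sum(axis=0), array3.sum(axis=1)

        # read the results before joining so the queues can drain
        v1, v2 = queue1.get()
        v3, v4 = queue2.get()
        p1.join()
        p2.join()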
I do not use multiprocessing.Pool.
But surprisingly, I get no speed-up; it actually runs 3 times slower. Is passing the large arrays what takes the time? I cannot understand what is happening. Should I use shared-memory objects instead of passing the arrays?
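For reference, a minimal sketch of what the shared-memory alternative I am asking about could look like, using multiprocessing.shared_memory (Python 3.8+). All names are illustrative, not taken from my real code: the worker receives only the block name, shape, and dtype instead of a copy of the array.

    import numpy as np
    from multiprocessing import Process, Queue
    from multiprocessing import shared_memory

    def Worker(queue, shm_name, shape, dtype):
        # attach to the existing shared block instead of receiving a pickled copy
        shm = shared_memory.SharedMemory(name=shm_name)
        array = np.ndarray(shape, dtype=dtype, buffer=shm.buf)
        # placeholder calculation; sums return new arrays, not views into shm
        queue.put([array.sum(axis=0), array.sum(axis=1)])
        shm.close()

    if __name__ == '__main__':
        array1 = np.random.rand(25000, 25000)

        # allocate a shared block and copy the array into it once
        shm = shared_memory.SharedMemory(create=True, size=array1.nbytes)
        shared = np.ndarray(array1.shape, dtype=array1.dtype, buffer=shm.buf)
        shared[:] = array1

        queue1 = Queue()
        p1 = Process(target=Worker, args=(queue1, shm.name, array1.shape, array1.dtype))
        p1.start()
        v1, v2 = queue1.get()
        p1.join()

        shm.close()
        shm.unlink()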
I would be grateful if anyone could help.
Thanks.
Sayantan