High memory usage using Python multiprocessing

I have seen a couple of posts about memory usage with the Python multiprocessing module. However, those questions do not seem to answer the problem I have here. I am posting my analysis in the hope that someone can help me.

Problem

I am using multiprocessing to perform tasks in parallel, and I noticed that the memory consumption of the worker processes grows indefinitely. I have a small standalone example that should replicate what I see.

import multiprocessing as mp
import time

def calculate(num):
    l = [num*num for num in range(num)]
    s = sum(l)
    del l       # optionally delete the list explicitly
    return s

if __name__ == "__main__":
    pool = mp.Pool(processes=2)
    time.sleep(5)
    print "launching calculation"
    num_tasks = 1000
    tasks =  [pool.apply_async(calculate,(i,)) for i in range(num_tasks)]
    for f in tasks:    
        print f.get(5)
    print "calculation finished"
    time.sleep(10)
    print "closing  pool"
    pool.close()
    print "closed pool"
    print "joining pool"
    pool.join()
    print "joined pool"
    time.sleep(5)

System

I am running Windows and use Task Manager to monitor memory usage. I am running Python 2.7.6.

Observation

I have summarized the memory consumption of the two worker processes below.

+---------------+----------------------+----------------------+
|  num_tasks    |  memory with del     | memory without del   |
|               | proc_1   | proc_2    | proc_1   | proc_2    |
+---------------+----------------------+----------------------+
| 1000          | 4884     | 4694      | 4892     | 4952      |
| 5000          | 5588     | 5596      | 6140     | 6268      |
| 10000         | 6528     | 6580      | 6640     | 6644      |
+---------------+----------------------+----------------------+

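In the table above, I varied the number of tasks and recorded the memory consumed once all calculations had finished and before join-ing the pool. The "del" and "without del" columns correspond to leaving in or commenting out the del l line inside calculate(num), respectively. Before the calculation starts, memory consumption is around 4400.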
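
To make the growth easier to compare, the readings can be expressed relative to the figure of about 4400 quoted above, assuming that figure is the baseline before any tasks run. This is only a small sketch: the numbers are copied from the table, the units are whatever Task Manager reported, and the names BASELINE, readings and growth are made up for illustration.

```python
# Memory readings copied from the table above, per worker process.
# Key: num_tasks
# Value: ((with del: proc_1, proc_2), (without del: proc_1, proc_2))
BASELINE = 4400  # assumed reading before the calculation starts

readings = {
    1000: ((4884, 4694), (4892, 4952)),
    5000: ((5588, 5596), (6140, 6268)),
    10000: ((6528, 6580), (6640, 6644)),
}

def growth(values, baseline=BASELINE):
    """Return how far each reading sits above the baseline."""
    return tuple(v - baseline for v in values)

for num_tasks in sorted(readings):
    with_del, without_del = readings[num_tasks]
    print("%6d  with del: %s  without del: %s"
          % (num_tasks, growth(with_del), growth(without_del)))
```

Measured this way, each worker sits at least four times further above the baseline at 10000 tasks than at 1000 tasks, with or without del.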

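  • It looks like manually deleting the list results in lower memory usage for the worker processes. I thought the garbage collector would take care of this. Is there a way to force garbage collection?
  • It is puzzling that memory usage keeps growing with the number of tasks in both cases. Is there a way to limit the memory usage?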
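
For reference, a collection can be forced from inside a worker with gc.collect() from the standard library. The sketch below is a hedged variant of the example above, not a confirmed fix: in CPython the list is already released by reference counting once its last reference goes away, so an explicit collection mainly affects reference cycles and may not change the figure Task Manager shows. It uses parenthesized print so that it also runs on Python 3.

```python
import gc
import multiprocessing as mp

def calculate(num):
    # Same work as in the example above: sum of squares below num.
    l = [i * i for i in range(num)]
    s = sum(l)
    del l
    # Force a full collection in this worker before returning.
    # gc.collect() returns the number of unreachable objects it found.
    gc.collect()
    return s

if __name__ == "__main__":
    pool = mp.Pool(processes=2)
    results = [pool.apply_async(calculate, (i,)) for i in range(100)]
    total = sum(r.get(5) for r in results)
    pool.close()
    pool.join()
    print(total)
```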

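I have a process based on this example that is meant to run long term. I observe that the worker processes hog a lot of memory (~4 GB) after an overnight run. Doing a join to release the memory is not an option, and I am trying to find a way to do this without join-ing.

This seems a little mysterious. Has anyone encountered something similar? How can I fix this issue?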

Answer

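I did a lot of research and could not find a solution that fixes the problem per se. But there is a decent workaround that prevents the memory blowout for a small cost, which is especially worthwhile for long-running server-side code.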

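The solution was essentially to restart individual worker processes after a fixed number of tasks. The Pool class in python takes maxtasksperchild as an argument. You can specify maxtasksperchild=1000, which limits each child process to 1000 tasks. After reaching the maxtasksperchild number, the pool replaces its child processes with fresh ones. By choosing a prudent maximum number of tasks, one can balance the maximum memory consumed against the start-up cost of restarting the worker processes. The Pool is constructed as: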

pool = mp.Pool(processes=2,maxtasksperchild=1000)
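
One way to see the recycling happen is to have each task report the process ID of the worker that ran it and count the distinct IDs. This is a minimal sketch, not part of the original solution: the limit of 10 tasks per child is chosen only to make the effect visible quickly, and count_workers is a hypothetical helper name.

```python
import multiprocessing as mp
import os

def calculate(num):
    # Same work as before, but also report which process did it.
    s = sum(i * i for i in range(num))
    return os.getpid(), s

def count_workers(results):
    """Count the distinct worker PIDs in a list of (pid, value) pairs."""
    return len(set(pid for pid, _ in results))

if __name__ == "__main__":
    # Small limit chosen only to make the recycling visible quickly.
    pool = mp.Pool(processes=2, maxtasksperchild=10)
    tasks = [pool.apply_async(calculate, (i,)) for i in range(100)]
    results = [t.get(5) for t in tasks]
    pool.close()
    pool.join()
    # Each worker runs at most 10 of the 100 tasks, so at least
    # 10 different worker processes must have been used.
    print("distinct worker processes: %d" % count_workers(results))
```

Without maxtasksperchild the same script would be expected to report only the 2 original workers, since nothing replaces them.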

I am posting my full solution here so that it can be of use to others!

import multiprocessing as mp
import time

def calculate(num):
    l = [num*num for num in range(num)]
    s = sum(l)
    del l       # optionally delete the list explicitly
    return s

if __name__ == "__main__":

    # fix is in the following line #
    pool = mp.Pool(processes=2,maxtasksperchild=1000)

    time.sleep(5)
    print "launching calculation"
    num_tasks = 1000
    tasks =  [pool.apply_async(calculate,(i,)) for i in range(num_tasks)]
    for f in tasks:    
        print f.get(5)
    print "calculation finished"
    time.sleep(10)
    print "closing  pool"
    pool.close()
    print "closed pool"
    print "joining pool"
    pool.join()
    print "joined pool"
    time.sleep(5)