I have some multiprocessing Python code that looks something like this:
    import time
    from multiprocessing import Pool
    import numpy as np

    class MyClass(object):
        def __init__(self):
            self.myAttribute = np.zeros(100000000)
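The rest of the script is along these lines (simplified; analyze and my_multiprocess_analysis are stand-ins for my real functions, which likewise only read the attribute):

    def analyze(my_instance):
        # each worker only reads the attribute; nothing is mutated
        time.sleep(5)
        return my_instance.myAttribute.sum()

    def my_multiprocess_analysis():
        my_instance = MyClass()
        pool = Pool(processes=10)
        # map() hands the same instance to all ten workers
        results = pool.map(analyze, [my_instance] * 10)
        pool.close()
        pool.join()
        print(results)

    if __name__ == '__main__':
        my_multiprocess_analysis()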
After reading about how memory works in other StackOverflow answers, such as Using Python multiprocessing memory, I expected that memory usage would not scale with the number of processes I used in the pool, since fork is copy-on-write and I do not modify any of my_instance's attributes. However, when I run top I see high memory usage for all of the processes; the output below (from OSX, though I can reproduce it on Linux) indicates that most of my processes are using a lot of memory.
My question is basically: am I interpreting this correctly, and is my instance of MyClass actually being duplicated across the pool? If so, how can I prevent this? Should I just not use a design like this? My goal is to reduce memory usage for a computational analysis.
    PID   COMMAND  %CPU  TIME      #TH  #WQ  #PORT  MEM    PURG  CMPRS  PGRP  PPID  STATE
    2494  Python   0.0   00:01.75  1    0    7      765M   0B    0B     2484  2484  sleeping
    2493  Python   0.0   00:01.85  1    0    7      765M   0B    0B     2484  2484  sleeping
    2492  Python   0.0   00:01.86  1    0    7      765M   0B    0B     2484  2484  sleeping
    2491  Python   0.0   00:01.83  1    0    7      765M   0B    0B     2484  2484  sleeping
    2490  Python   0.0   00:01.87  1    0    7      765M   0B    0B     2484  2484  sleeping
    2489  Python   0.0   00:01.79  1    0    7      167M   0B    597M   2484  2484  sleeping
    2488  Python   0.0   00:01.77  1    0    7      10M    0B    755M   2484  2484  sleeping
    2487  Python   0.0   00:01.75  1    0    7      8724K  0B    756M   2484  2484  sleeping
    2486  Python   0.0   00:01.78  1    0    7      9968K  0B    755M   2484  2484  sleeping
    2485  Python   0.0   00:01.74  1    0    7      171M   0B    594M   2484  2484  sleeping
    2484  Python   0.1   00:16.43  4    0    18     775M   0B    12K    2484  2235  sleeping
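If the duplication is happening because map() pickles the instance for every worker, would restructuring things like this avoid it? A rough sketch of what I have in mind, assuming the fork start method (the default on Linux), so that the children inherit the parent's memory copy-on-write rather than receiving serialized copies:

    import numpy as np
    from multiprocessing import Pool

    class MyClass(object):
        def __init__(self):
            self.myAttribute = np.zeros(100000000)

    # created once in the parent; with the fork start method the children
    # inherit this memory copy-on-write instead of receiving pickled copies
    my_instance = MyClass()

    def analyze(_):
        # read the inherited global directly; nothing large crosses map()
        return my_instance.myAttribute.sum()

    if __name__ == '__main__':
        with Pool(processes=10) as pool:
            results = pool.map(analyze, range(10))
        print(results)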