The code below best illustrates my problem:
The console output (it takes about 8 minutes even to reach the first test) shows the allocation of the 512x512x512 int16 arrays consuming no more than expected (256 MB each), and watching it in "top", the process as a whole stays under 600 MB, as expected.
However, while the vectorized version of the function is being called, the process balloons to a huge size (more than 7 GB!). Even the most obvious explanation I can think of (vectorize converting the input and output to float64 internally) would only account for a couple of gigabytes, and that ignores the facts that the vectorized function returns int16 and the returned array is certainly int16. Is there some way to avoid this? Am I misusing or misunderstanding the otypes argument of vectorize?
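To put a number on "a couple of gigabytes": a 512x512x512 array is 256 MB as int16 and 1 GB as float64, so even full float64 copies of both input and output should only add about 2 GB. A quick sanity-check sketch of that arithmetic (assuming one full-size copy per dtype, nothing more):

# Size arithmetic sketch: bytes for a 512^3 array as int16 vs float64.
import numpy as np
n = 512 * 512 * 512
print n * np.dtype(np.int16).itemsize / 2.0**20, "MB as int16"      # 256.0
print n * np.dtype(np.float64).itemsize / 2.0**20, "MB as float64"  # 1024.0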
import numpy as np
import subprocess

def logmem():
    # Log free system memory at this point in the run.
    subprocess.call('cat /proc/meminfo | grep MemFree', shell=True)

def fn(x):
    return np.int16(x*x)

def test_plain(v):
    # Baseline: apply fn element by element with explicit Python loops.
    print "Explicit looping:"
    logmem()
    r = np.zeros(v.shape, dtype=np.int16)
    for z in xrange(v.shape[0]):
        for y in xrange(v.shape[1]):
            for x in xrange(v.shape[2]):
                r[z, y, x] = fn(v[z, y, x])
    print type(r[0, 0, 0])
    logmem()
    return r

vecfn = np.vectorize(fn, otypes=[np.int16])

def test_vectorize(v):
    # Same computation via np.vectorize; this is where memory blows up.
    print "Vectorize:"
    logmem()
    r = vecfn(v)
    print type(r[0, 0, 0])
    logmem()
    return r

logmem()
s = (512, 512, 512)
v = np.ones(s, dtype=np.int16)
logmem()
test_plain(v)
test_vectorize(v)
v = None
logmem()
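One thing I've considered (just a sketch of the idea; I don't know whether it genuinely bounds the peak usage) is calling the vectorized function one z-slice at a time, so that whatever temporaries vectorize builds internally are at most 512x512 rather than 512x512x512:

# Sketch: apply vecfn slice-by-slice so any internal temporaries cover
# only one 512x512 plane at a time instead of the whole volume.
def test_vectorize_chunked(v):
    print "Vectorize (per-slice):"
    logmem()
    r = np.empty(v.shape, dtype=np.int16)
    for z in xrange(v.shape[0]):
        r[z] = vecfn(v[z])
    print type(r[0, 0, 0])
    logmem()
    return r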
I'm using whatever Python / numpy versions ship with amd64 Debian Squeeze (Python 2.6.6, numpy 1.4.1).
timday