Today I was profiling a function and found a bottleneck that seemed strange (at least to me): creating a masked array with mask=None or mask=0, to initialize the mask to all zeros with the same shape as data, is very slow:
>>> import numpy as np
>>> data = np.ones((100, 100, 100))
>>> %timeit ma_array = np.ma.array(data, mask=None, copy=False)
1 loop, best of 3: 803 ms per loop
>>> %timeit ma_array = np.ma.array(data, mask=0, copy=False)
1 loop, best of 3: 807 ms per loop
On the other hand, using mask=False or creating the mask manually is much faster:
>>> %timeit ma_array = np.ma.array(data, mask=False, copy=False)
1000 loops, best of 3: 438 µs per loop
>>> %timeit ma_array = np.ma.array(data, mask=np.zeros(data.shape, dtype=bool), copy=False)
1000 loops, best of 3: 453 µs per loop
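For anyone who wants to reproduce this outside IPython, here is a rough sketch of the same comparison using the standard timeit module. The names mask_variants and time_variant are just helpers I made up for this script, and the absolute numbers will depend on the NumPy version and machine, so the gap may not show up everywhere:

```python
import timeit

import numpy as np

data = np.ones((100, 100, 100))

# The four mask arguments compared above. Each entry is a factory so that
# the np.zeros variant allocates a fresh mask on every call, as in %timeit.
mask_variants = {
    "None": lambda: None,
    "0": lambda: 0,
    "False": lambda: False,
    "np.zeros": lambda: np.zeros(data.shape, dtype=bool),
}


def time_variant(make_mask, number=3, repeat=3):
    """Return the best observed time per call, in seconds."""
    timer = timeit.Timer(
        lambda: np.ma.array(data, mask=make_mask(), copy=False)
    )
    return min(timer.repeat(repeat=repeat, number=number)) / number


timings = {name: time_variant(make) for name, make in mask_variants.items()}

for name, seconds in timings.items():
    print("mask={:<9} {:12.3f} ms per call".format(name, seconds * 1e3))
```

The loop counts are kept small on purpose because the slow variants take close to a second per call on my setup.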
Why is None or 0 almost 2000 times slower than False or np.zeros(data.shape) as the mask parameter? The documentation for the function only says that the mask:
Must be convertible to an array of booleans with the same shape as data. True indicates a masked (i.e. invalid) data.
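As far as I can tell, all four variants end up with an equivalent mask, so the difference seems to be only in speed. A small check, assuming np.ma.getmaskarray is a fair way to compare them (it returns a full boolean array even when the mask is stored compactly); the small array here is only to keep the check quick:

```python
import numpy as np

small = np.ones((4, 5, 6))

# Build one masked array per mask argument from the question.
candidates = {
    "None": np.ma.array(small, mask=None, copy=False),
    "0": np.ma.array(small, mask=0, copy=False),
    "False": np.ma.array(small, mask=False, copy=False),
    "np.zeros": np.ma.array(
        small, mask=np.zeros(small.shape, dtype=bool), copy=False
    ),
}

# Expand every mask to a full boolean array so they can be compared directly.
full_masks = {
    name: np.ma.getmaskarray(arr) for name, arr in candidates.items()
}

for name, mask in full_masks.items():
    print(name, mask.shape, mask.dtype, bool(mask.any()))
```

Each line should report the shape of the data, a boolean dtype, and no masked elements.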
I am using Python 3.5 and NumPy 1.11.0 on Windows 10.