NumPy max is slow when applied to a list of arrays

I run some calculations that produce a list of numpy arrays. I then want to find the largest values along the first axis. My current implementation (see below) is very slow, and I would like to find alternatives.

Original

    pending = [<list of items>]
    matrix = [compute(item) for item in pending if <some condition on item>]
    dominant = np.max(matrix, axis = 0)

Version 1: This implementation is faster (~10x, presumably because numpy does not need to determine the shape of the array)

    pending = [<list of items>]
    matrix = [compute(item) for item in pending if <some condition on item>]
    matrix = np.vstack(matrix)
    dominant = np.max(matrix, axis = 0)

I ran a couple of tests, and the slowdown seems to come from the internal conversion of the list of arrays to a numpy array:

    Timer unit: 1e-06 s

    Total time: 1.21389 s

    Line #      Hits         Time  Per Hit   % Time  Line Contents
    ==============================================================
         4                                           def direct_max(list_of_arrays):
         5      1000      1213886   1213.9    100.0      np.max(list_of_arrays, axis = 0)

    Total time: 1.20766 s

    Line #      Hits         Time  Per Hit   % Time  Line Contents
    ==============================================================
         8                                           def numpy_max(list_of_arrays):
         9      1000      1151281   1151.3     95.3      list_of_arrays = np.array(list_of_arrays)
        10      1000        56384     56.4      4.7      np.max(list_of_arrays, axis = 0)

    Total time: 0.15437 s

    Line #      Hits         Time  Per Hit   % Time  Line Contents
    ==============================================================
        12                                           @profile
        13                                           def stack_max(list_of_arrays):
        14      1000       102205    102.2     66.2      list_of_arrays = np.vstack(list_of_arrays)
        15      1000        52165     52.2     33.8      np.max(list_of_arrays, axis = 0)

Is there a way to speed up the max function, or can I efficiently populate a numpy array with the results of my calculation so that max is fast?

1 answer

You can use reduce(np.maximum, matrix). Here is a test:

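    from functools import reduce  # a builtin in Python 2; needs this import in Python 3

    import numpy as np

    np.random.seed(0)
    N, M = 1000, 1000
    # M arrays of length N (range replaces the Python 2-only xrange)
    matrix = [np.random.rand(N) for _ in range(M)]

    %timeit np.max(matrix, axis = 0)
    %timeit np.max(np.vstack(matrix), axis = 0)
    %timeit reduce(np.maximum, matrix)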

Result:

    10 loops, best of 3: 116 ms per loop
    10 loops, best of 3: 10.6 ms per loop
    100 loops, best of 3: 3.66 ms per loop
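
The same pairwise np.maximum fold can also be applied while the arrays are being produced, so the list never has to be built or converted. The following is only a sketch under assumptions: `compute`, `keep`, and `running_max` are hypothetical stand-ins for the question's `compute(item)` and `<some condition on item>`, and every result is assumed to be a 1-D array of the same length.

```python
import numpy as np


def compute(item):
    # Hypothetical stand-in for the question's compute(): returns a 1-D array.
    rng = np.random.default_rng(item)
    return rng.random(1000)


def keep(item):
    # Hypothetical stand-in for "<some condition on item>".
    return item % 2 == 0


def running_max(items):
    """Fold np.maximum over compute(item) without storing a list of arrays."""
    acc = None
    for item in items:
        if not keep(item):
            continue
        row = compute(item)
        if acc is None:
            acc = row.copy()  # copy so in-place updates never alias row
        else:
            np.maximum(acc, row, out=acc)  # elementwise max written into acc
    return acc  # None if no item passed the filter


pending = range(200)
dominant = running_max(pending)

# Cross-check against the stacked version from the question.
expected = np.max(np.vstack([compute(i) for i in pending if keep(i)]), axis=0)
assert np.array_equal(dominant, expected)
```

Whether this beats reduce(np.maximum, matrix) in practice depends on how expensive `compute` is relative to the maximum itself, so it is worth timing on real data.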

Edit

argmax() is harder, but you can use a for loop:

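    def argmax_list(matrix):
        m = matrix[0].copy()               # running elementwise maximum
        # builtin int: the np.int alias was removed in NumPy 1.24
        idx = np.zeros(len(m), dtype=int)  # index of the array holding each maximum
        for i, a in enumerate(matrix[1:], 1):
            mask = m < a                   # positions where array i is strictly larger
            m[mask] = a[mask]
            idx[mask] = i
        return idx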

It is still faster than argmax():

    %timeit np.argmax(matrix, axis=0)
    %timeit np.argmax(np.vstack(matrix), axis=0)
    %timeit argmax_list(matrix)

Result:

    10 loops, best of 3: 131 ms per loop
    10 loops, best of 3: 21 ms per loop
    100 loops, best of 3: 13.1 ms per loop
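
Because the loop only overwrites an index when a later array is strictly larger, ties should resolve to the first occurrence, which is also what np.argmax returns. The check below is a small sketch of that claim, assuming the arrays contain no NaN values (comparisons with NaN are always false, so the loop would treat them differently). The test data is made up for illustration.

```python
import numpy as np


def argmax_list(matrix):
    m = matrix[0].copy()
    # builtin int: the np.int alias was removed in NumPy 1.24
    idx = np.zeros(len(m), dtype=int)
    for i, a in enumerate(matrix[1:], 1):
        mask = m < a  # strict comparison keeps the earliest index on ties
        m[mask] = a[mask]
        idx[mask] = i
    return idx


rng = np.random.default_rng(0)
# Small integers force many ties between rows.
matrix = [rng.integers(0, 5, size=50) for _ in range(20)]

result = argmax_list(matrix)
reference = np.argmax(np.vstack(matrix), axis=0)
assert np.array_equal(result, reference)
```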