How to get indices of k maximum values from a multidimensional numpy array

Question

How to get indices of k maximum values from a multidimensional numpy array

I looked through a few questions on StackOverflow, but could not find an appropriate answer. I want to get the indices of the k maximum values from a numpy ndarray. This link discusses the same thing for a 1D array. For a 2D array, np.argsort sorts the elements within each row, i.e.

 Note: array elements are not unique.

input:

    >>> import numpy as np
    >>> n = np.arange(9).reshape(3,3)
    >>> n
    array([[0, 1, 2],
           [3, 4, 5],
           [6, 7, 8]])
    >>> s = n.argsort()
    >>> s
    array([[0, 1, 2],
           [0, 1, 2],
           [0, 1, 2]], dtype=int32)

Also,

    >>> import numpy as np
    >>> n = np.arange(9).reshape(3,3)
    >>> s = n.argsort(axis=None)
    >>> s
    array([0, 1, 2, 3, 4, 5, 6, 7, 8], dtype=int32)

but here I lose the structure of the array and cannot recover the original indices of the elements.

Any help is appreciated.

+7

python numpy

Rashmi singh Apr 13 '17 at 7:46

1 answer

Divakar · Accepted Answer · 2017-04-13T07:54:20+0000

A couple of approaches with np.argpartition and np.argsort for ndarrays:

    import numpy as np

    def k_largest_index_argpartition_v1(a, k):
        # Negate, then take the k smallest of the negated values
        idx = np.argpartition(-a.ravel(),k)[:k]
        return np.column_stack(np.unravel_index(idx, a.shape))

    def k_largest_index_argpartition_v2(a, k):
        # Partition at a.size-k, then take the last k positions
        idx = np.argpartition(a.ravel(),a.size-k)[-k:]
        return np.column_stack(np.unravel_index(idx, a.shape))

    def k_largest_index_argsort(a, k):
        # Full sort, then take the last k indices in reverse order
        idx = np.argsort(a.ravel())[:-k-1:-1]
        return np.column_stack(np.unravel_index(idx, a.shape))
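All three functions end with the same step: flat indices into the raveled array are converted back to one coordinate row per element. A minimal sketch of that step on its own, assuming a shape of (3, 4) and two hand-picked flat indices (both chosen only for illustration):

```python
import numpy as np

shape = (3, 4)                  # assumed shape, for illustration only
flat_idx = np.array([11, 2])    # hand-picked flat indices into a (3, 4) array

# unravel_index returns one index array per dimension
coords = np.unravel_index(flat_idx, shape)

# column_stack turns those per-dimension arrays into one row per element
pairs = np.column_stack(coords)
print(pairs)
```

Each row of pairs can be used directly as a coordinate, e.g. a[tuple(pairs[0])].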

Discussion of two versions with argpartition

The difference between k_largest_index_argpartition_v1 and k_largest_index_argpartition_v2 is how we use argpartition. In the first version, we negate the input array and then use argpartition to get the indices of the k smallest elements of the negated array, which are effectively the indices of the k largest elements. In the second version, we partition so that the a.size-k smallest elements come first, and then select the remaining k indices, which belong to the k largest elements.
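A small check that the two selections agree, written out inline rather than through the functions. It reuses the 2D array from the run examples in this answer; with unique values the two index sets must match, though their order may not:

```python
import numpy as np

a = np.array([[38, 14, 81, 50],
              [17, 65, 60, 24],
              [64, 73, 25, 95]])
k = 4
flat = a.ravel()

# v1 idea: negate, then take the k smallest of the negated values
idx_v1 = np.argpartition(-flat, k)[:k]

# v2 idea: partition at size-k, then take the last k positions
idx_v2 = np.argpartition(flat, flat.size - k)[-k:]

# The order within each result is unspecified, so compare as sets
same = set(idx_v1.tolist()) == set(idx_v2.tolist())
print(same)
```

With repeated values at the cut-off, the two versions may pick different but equally valid indices.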

Also, it is worth mentioning that argpartition does not return the indices in sorted order. If sorted order is needed, we need to pass a range array as the kth argument to np.argpartition, as indicated in this post.
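A minimal sketch of the range-array variant, applied to the negated input so that the result runs from largest to smallest. The function name k_largest_index_argpartition_sorted is hypothetical and not part of the original answer; it assumes a signed or floating dtype, since negating an unsigned array wraps around:

```python
import numpy as np

def k_largest_index_argpartition_sorted(a, k):
    # Hypothetical helper: passing np.arange(k) as kth puts the first k
    # positions of the partition into their final sorted places
    idx = np.argpartition(-a.ravel(), np.arange(k))[:k]
    return np.column_stack(np.unravel_index(idx, a.shape))

a = np.array([[38, 14, 81, 50],
              [17, 65, 60, 24],
              [64, 73, 25, 95]])

result = k_largest_index_argpartition_sorted(a, k=4)
print(result)
```

The values at the returned coordinates are in descending order; among equal values, which index comes first is not guaranteed.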

Run Examples -

1) 2D case:

    In [42]: a    # 2D array
    Out[42]:
    array([[38, 14, 81, 50],
           [17, 65, 60, 24],
           [64, 73, 25, 95]])

    In [43]: k_largest_index_argsort(a, k=2)
    Out[43]:
    array([[2, 3],
           [0, 2]])

    In [44]: k_largest_index_argsort(a, k=4)
    Out[44]:
    array([[2, 3],
           [0, 2],
           [2, 1],
           [1, 1]])

    In [66]: k_largest_index_argpartition_v1(a, k=4)
    Out[66]:
    array([[2, 1],    # Notice the order is different
           [2, 3],
           [0, 2],
           [1, 1]])

2) 3D case:

    In [46]: a    # 3D array
    Out[46]:
    array([[[20, 98, 27, 73],
            [33, 78, 48, 59],
            [28, 91, 64, 70]],

           [[47, 34, 51, 19],
            [73, 38, 63, 94],
            [95, 25, 93, 64]]])

    In [47]: k_largest_index_argsort(a, k=2)
    Out[47]:
    array([[0, 0, 1],
           [1, 2, 0]])

Runtime Test -

    In [56]: a = np.random.randint(0,99999999999999,(3000,4000))

    In [57]: %timeit k_largest_index_argsort(a, k=10)
    1 loops, best of 3: 2.18 s per loop

    In [58]: %timeit k_largest_index_argpartition_v1(a, k=10)
    10 loops, best of 3: 178 ms per loop

    In [59]: %timeit k_largest_index_argpartition_v2(a, k=10)
    10 loops, best of 3: 128 ms per loop
