Sort numpy array based on data from another array

Question

Sort numpy array based on data from another array

I have two sets of array dataand result. resultcontains the same elements in data, but with an extra column and in unsorted order. I want to change the array resultso that it is in the same order as the rows in data, while associating the value in the last column with the rest of the row when sorting.

data = np.array([[0,1,0,0],[1,0,0,0],[0,1,1,0],[0,1,0,1]])
result = np.array([[0,1,1,0,1],[1,0,0,0,0],[0,1,0,0,1],[0,1,0,1,0]])

# this is what the final sorted array should look like:
'''
array([[0, 1, 0, 0, 1],
       [1, 0, 0, 0, 0],
       [0, 1, 1, 0, 1],
       [0, 1, 0, 1, 0]])
 '''

I tried to do argsortto cancel datain a sorted order, and then apply this to result, but argsortit seems to sort the order of the array based on each element, whereas I want to sort to process each row data[:,4]as a whole.

ind = np.argsort(data)
indind =np.argsort(ind)
ind
array([[0, 2, 3, 1],
   [1, 2, 3, 0],
   [0, 3, 1, 2],
   [0, 2, 1, 3]])

What is a good way to sort by rows?

+4

python sorting numpy

ROBOTPWNS 10 . '16 21:18

3

hpaulj · Answer 1 · 2016-04-10T21:47:10+0000

, . [2,1,0,3] result :

In [37]: result[[2,1,0,3],:]
Out[37]: 
array([[0, 1, 0, 0, 1],
       [1, 0, 0, 0, 0],
       [0, 1, 1, 0, 1],
       [0, 1, 0, 1, 0]])

In [38]: result[[2,1,0,3],:4]==data
Out[38]: 
array([[ True,  True,  True,  True],
       [ True,  True,  True,  True],
       [ True,  True,  True,  True],
       [ True,  True,  True,  True]], dtype=bool)

, argsort sort .

np.lexsort :

In [54]: data[np.lexsort(data.T),:]
Out[54]: 
array([[1, 0, 0, 0],
       [0, 1, 0, 0],
       [0, 1, 1, 0],
       [0, 1, 0, 1]])

In [55]: result[np.lexsort(result[:,:-1].T),:]
Out[55]: 
array([[1, 0, 0, 0, 0],
       [0, 1, 0, 0, 1],
       [0, 1, 1, 0, 1],
       [0, 1, 0, 1, 0]])

, . lexsort, , .

:

In [66]: i=np.lexsort(data.T)
In [67]: j=np.lexsort(result[:,:-1].T)
In [68]: j[i]
Out[68]: array([2, 1, 0, 3], dtype=int64)

In [69]: result[j[i],:]
Out[69]: 
array([[0, 1, 0, 0, 1],
       [1, 0, 0, 0, 0],
       [0, 1, 1, 0, 1],
       [0, 1, 0, 1, 0]])

. . .

Divakar · Answer 2 · 2016-04-10T22:04:01+0000

№ 1

, , data result, . , . :

# Slice out from result everything except the last column       
r = result[:,:-1]       

# Get linear indices equivalent of each row from r and data
ID1 = np.ravel_multi_index(r.T,r.max(0)+1)
ID2 = np.ravel_multi_index(data.T,r.max(0)+1)

# Search for ID2 in ID1 and use those indices index into result
out = result[np.where(ID1[:,None] == ID2)[1]]

# 2

data result, , argsort, :

# Slice out from result everything except the last column       
r = result[:,:-1]       

# Get linear indices equivalent of each row from r and data
ID1 = np.ravel_multi_index(r.T,r.max(0)+1)
ID2 = np.ravel_multi_index(data.T,r.max(0)+1)   

sortidx_ID1 = ID1.argsort()
sortidx_ID2 = ID2.argsort()
out = result[sortidx_ID1[sortidx_ID2]]

-

In [37]: data
Out[37]: 
array([[ 3,  2,  1,  5],
       [ 4,  9,  2,  4],
       [ 7,  3,  9, 11],
       [ 5,  9,  4,  4]])

In [38]: result
Out[38]: 
array([[ 7,  3,  9, 11, 55],
       [ 4,  9,  2,  4,  8],
       [ 3,  2,  1,  5,  7],
       [ 5,  9,  4,  4, 88]])

In [39]: r = result[:,:-1]
    ...: ID1 = np.ravel_multi_index(r.T,r.max(0)+1)
    ...: ID2 = np.ravel_multi_index(data.T,r.max(0)+1)
    ...: 

In [40]: result[np.where(ID1[:,None] == ID2)[1]] # Approach 1
Out[40]: 
array([[ 3,  2,  1,  5,  7],
       [ 4,  9,  2,  4,  8],
       [ 7,  3,  9, 11, 55],
       [ 5,  9,  4,  4, 88]])

In [41]: sortidx_ID1 = ID1.argsort()  # Approach 2
    ...: sortidx_ID2 = ID2.argsort()
    ...: 

In [42]: result[sortidx_ID1[sortidx_ID2]]
Out[42]: 
array([[ 3,  2,  1,  5,  7],
       [ 4,  9,  2,  4,  8],
       [ 7,  3,  9, 11, 55],
       [ 5,  9,  4,  4, 88]])

Eelco hoogendoorn · Answer 3 · 2016-04-11T06:17:12+0000

numpy_indexed ( : ) :

import numpy_indexed as npi
result[npi.indices(result[:, :-1], data)]

npi.indices list.index; () , , .

Note that this solution works for any number of columns and is fully vectorized (i.e. no pythons anywhere).

Sort numpy array based on data from another array

More articles: