NumPy: unique 2D subarrays

Question

I have a 3D NumPy array and I want to keep only its unique 2D subarrays.

Input:

    [[[ 1  2]
      [ 3  4]]
     [[ 5  6]
      [ 7  8]]
     [[ 9 10]
      [11 12]]
     [[ 5  6]
      [ 7  8]]]

Output:

    [[[ 1  2]
      [ 3  4]]
     [[ 5  6]
      [ 7  8]]
     [[ 9 10]
      [11 12]]]

I tried converting the subarrays to byte strings with the tostring() method and then using np.unique, but once the byte strings are put into a NumPy array, the trailing \x00 bytes are stripped, so I cannot convert them back with np.fromstring().

Example:

    import numpy as np
    a = np.array([[[1,2],[3,4]],[[5,6],[7,8]],[[9,10],[11,12]],[[5,6],[7,8]]])
    b = [x.tostring() for x in a]
    print(b)
    c = np.array(b)
    print(c)
    print(np.array([np.fromstring(x) for x in c]))

Output:

    [b'\x01\x00\x00\x00\x02\x00\x00\x00\x03\x00\x00\x00\x04\x00\x00\x00',
     b'\x05\x00\x00\x00\x06\x00\x00\x00\x07\x00\x00\x00\x08\x00\x00\x00',
     b'\t\x00\x00\x00\n\x00\x00\x00\x0b\x00\x00\x00\x0c\x00\x00\x00',
     b'\x05\x00\x00\x00\x06\x00\x00\x00\x07\x00\x00\x00\x08\x00\x00\x00']
    [b'\x01\x00\x00\x00\x02\x00\x00\x00\x03\x00\x00\x00\x04'
     b'\x05\x00\x00\x00\x06\x00\x00\x00\x07\x00\x00\x00\x08'
     b'\t\x00\x00\x00\n\x00\x00\x00\x0b\x00\x00\x00\x0c'
     b'\x05\x00\x00\x00\x06\x00\x00\x00\x07\x00\x00\x00\x08']
    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    <ipython-input-86-6772b096689f> in <module>()
          5 c = np.array(b)
          6 print(c)
    ----> 7 print(np.array([np.fromstring(x) for x in c]))

    <ipython-input-86-6772b096689f> in <listcomp>(.0)
          5 c = np.array(b)
          6 print(c)
    ----> 7 print(np.array([np.fromstring(x) for x in c]))

    ValueError: string size must be a multiple of element size

I also tried using view, but I really don't know how to use it. Can you help me?

+7

python numpy unique sub-array

Peťan Nov 18 '16 at 10:28

3 answers

One solution would be to use a set to keep track of which subarrays you have already seen:

    seen = set()
    new_a = []
    for j in a:
        f = tuple(j.flatten())  # hashable key for this subarray
        if f not in seen:
            new_a.append(j)
            seen.add(f)
    print(np.array(new_a))
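A variant of the same idea, offered as a sketch rather than as part of the original answer, uses each subarray's raw bytes as the key. Because the bytes stay plain Python objects and are never placed in a NumPy string array, the trailing \x00 bytes mentioned in the question are not stripped. The helper name unique_subarrays_by_bytes is hypothetical.

```python
import numpy as np

def unique_subarrays_by_bytes(a):
    """Return the unique subarrays along axis 0, in first-seen order."""
    seen = set()
    keep = []  # indices of the first occurrence of each subarray
    for i, sub in enumerate(a):
        key = sub.tobytes()  # plain Python bytes; trailing NULs are preserved
        if key not in seen:
            seen.add(key)
            keep.append(i)
    return a[keep]

a = np.array([[[1, 2], [3, 4]],
              [[5, 6], [7, 8]],
              [[9, 10], [11, 12]],
              [[5, 6], [7, 8]]])
result = unique_subarrays_by_bytes(a)
print(result)

# Round trip of a single key: np.frombuffer needs the dtype spelled out,
# otherwise the bytes would be interpreted as float64.
restored = np.frombuffer(a[0].tobytes(), dtype=a.dtype).reshape(a.shape[1:])
```

Comparing raw bytes treats two subarrays as equal only when they are bit-for-bit identical, which is the intended behaviour for integer data like this.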

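Or using only NumPy. Note that np.unique(a) flattens the array and returns its sorted unique elements, so the reshape only reproduces the subarrays when no value is shared between different subarrays and the values are already in sorted order, as in this example:

    u = np.unique(a)
    print(u.reshape((len(u) // 4, 2, 2)))

Output:

    [[[ 1  2]
      [ 3  4]]
     [[ 5  6]
      [ 7  8]]
     [[ 9 10]
      [11 12]]]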
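On NumPy 1.13 or later (an assumption about the installed version, since this answer predates that release), np.unique accepts an axis argument that compares whole subarrays instead of individual elements. A minimal sketch:

```python
import numpy as np

a = np.array([[[1, 2], [3, 4]],
              [[5, 6], [7, 8]],
              [[9, 10], [11, 12]],
              [[5, 6], [7, 8]]])

# axis=0 treats each a[i] as one item, so duplicates are whole 2x2 blocks.
unique_blocks, first_idx = np.unique(a, axis=0, return_index=True)
print(unique_blocks)

# np.unique returns the blocks sorted; index with the sorted first-occurrence
# positions to keep the original input order instead.
in_input_order = a[np.sort(first_idx)]
print(in_input_order)
```

Unlike the flattened np.unique(a) call, this also works when different subarrays share some of their values.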

+1

kezzos Nov 18 '16 at 10:45

The numpy_indexed package (disclaimer: I am its author) is designed to perform operations like this efficiently and in a vectorized way:

    import numpy_indexed as npi
    npi.unique(a)

+1

Eelco hoogendoorn Nov 18 '16 at 10:53

Divakar · Accepted Answer · 2016-11-18T14:11:52+0000

Building on @Jaime's post, I adapted it to our case of finding unique 2D subarrays. The solution basically adds a reshape before the view step:

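    import numpy as np

    def unique2D_subarray(a):
        # View each flattened subarray as one opaque void item so that
        # np.unique compares whole subarrays instead of single elements.
        dtype1 = np.dtype((np.void, a.dtype.itemsize * np.prod(a.shape[1:])))
        b = np.ascontiguousarray(a.reshape(a.shape[0], -1)).view(dtype1)
        return a[np.unique(b, return_index=True)[1]]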

Sample run:

    In [62]: a
    Out[62]:
    array([[[ 1,  2],
            [ 3,  4]],

           [[ 5,  6],
            [ 7,  8]],

           [[ 9, 10],
            [11, 12]],

           [[ 5,  6],
            [ 7,  8]]])

    In [63]: unique2D_subarray(a)
    Out[63]:
    array([[[ 1,  2],
            [ 3,  4]],

           [[ 5,  6],
            [ 7,  8]],

           [[ 9, 10],
            [11, 12]]])
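To make the individual steps easier to follow, here is the same technique unrolled into separate statements. This is only an illustrative sketch: the intermediate names rows, row_bytes and idx are not from the answer, and the final np.sort is an optional addition that keeps the subarrays in their original input order.

```python
import numpy as np

a = np.array([[[1, 2], [3, 4]],
              [[5, 6], [7, 8]],
              [[9, 10], [11, 12]],
              [[5, 6], [7, 8]]])

# Step 1: flatten every 2x2 block into one row -> shape (4, 4).
rows = np.ascontiguousarray(a.reshape(a.shape[0], -1))

# Step 2: reinterpret each row as a single opaque item of raw bytes.
row_bytes = rows.dtype.itemsize * rows.shape[1]
b = rows.view(np.dtype((np.void, row_bytes)))
print(b.shape, b.dtype)  # one void item per block

# Step 3: np.unique now compares whole rows; keep the first-occurrence indices.
_, idx = np.unique(b, return_index=True)

# Optional: sorting the indices keeps the blocks in their input order.
result = a[np.sort(idx)]
print(result)
```

The view requires a contiguous buffer, which is why np.ascontiguousarray is applied before calling view.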
