Numpy: quick calculations that take into account the neighbors of objects and their position inside the array

Question

Numpy: quick calculations that take into account the neighbors of objects and their position inside the array

I have 4 2D numpy arrays called a, b, c, d , each of which consists of columns n and columns m . I need to give each element b and d value calculated as follows (pseudocode):

 min_coords = min_of_neighbors_coords(x, y) b[x,y] = a[x,y] * a[min_coords]; d[x,y] = c[min_coords];

Where min_of_neighbors_coords is a function that, given the coordinates of an array element, returns the coordinates of the neighbor element, which has a lower value. Ie considering an array:

 1, 2, 5 3, 7, 2 2, 3, 6

min_of_neighbors_coords(1, 1) will refer to the central element with a value of 7 and will return a tuple (0, 0) : the coordinates of the number 1 .

I managed to do this using for loops (element by element), but the algorithm is VERY slow, and I'm looking for a way to improve it by avoiding loops and requiring numpy computation.

Is it possible?

+6

python numpy

diegogb Apr 3 '13 at 22:19

source share

3 answers

Detection a[min_coords] - rolling operation. A few smart solutions that we talked about in this post . You will want to make creating an array with [[min_coords] a side effect of any solution you choose.

Hope this helps. I can post some code examples later when I have time.

+2

Paul Apr 3 '13 at 23:03

source share

I am interested in helping you, and I believe that there may be more effective solutions that go beyond the scope of your question, but in order to put my own time in writing code, I should get some feedback from you, because that I’m not 100% sure that I understand what you need.

One thing to keep in mind: if you are a C # developer, perhaps implementing C # “brute force” can outperform Numpy's smart implementation, so you might want to consider testing your fairly simple operations implemented in C #. Geotiff (which I suppose you are reading) has a relatively friendly specification, and I think there might be .NET GeoTiff libraries there.

But suppose you want to give Numpy a try (and I think that you should), let's see what you are trying to achieve:

If you intend to run min_coords(array) on each element of arrays a and c , you might think of a “stack” of nine copies of the same array, with each copy being copied using some offset using numpy.dstack() and numpy.roll() . Then you use numpy.argmin(stacked_array, axis=2) and get an array containing values from 0 to 8, where each of these values is mapped to a tuple containing offset indices.

Then, using this principle, your min_coords() function will be vectorized, working in the entire array at once and issuing an array that will give you the offset, which will be the index of the lookup table containing the offsets.

If you have an interest in developing this, leave a comment.

Hope this helps!

+2

heltonbiker Apr 3 '13 at 23:39

source share

Jaime · Accepted Answer · 2013-04-04T00:11:21+0000

EDIT I kept my original answer below. As Paul points out in the comments, the original answer didn’t really answer the OP question, and it would be easier to achieve using the ndimage filter. The next much more cumbersome function should do the right thing. It takes two arrays a and c and returns the minimum window size a and the values in c at the positions of the window minima in a :

 def neighbor_min(a, c): ac = np.concatenate((a[None], c[None])) rows, cols = ac.shape[1:] ret = np.empty_like(ac) # Fill in the center win_ac = as_strided(ac, shape=(2, rows-2, cols, 3), strides=ac.strides+ac.strides[1:2]) win_ac = win_ac[np.ogrid[:2, :rows-2, :cols] + [np.argmin(win_ac[0], axis=2)]] win_ac = as_strided(win_ac, shape=(2, rows-2, cols-2, 3), strides=win_ac.strides+win_ac.strides[2:3]) ret[:, 1:-1, 1:-1] = win_ac[np.ogrid[:2, :rows-2, :cols-2] + [np.argmin(win_ac[0], axis=2)]] # Fill the top, bottom, left and right borders win_ac = as_strided(ac[:, :2, :], shape=(2, 2, cols-2, 3), strides=ac.strides+ac.strides[2:3]) win_ac = win_ac[np.ogrid[:2, :2, :cols-2] + [np.argmin(win_ac[0], axis=2)]] ret[:, 0, 1:-1] = win_ac[:, np.argmin(win_ac[0], axis=0), np.ogrid[:cols-2]] win_ac = as_strided(ac[:, -2:, :], shape=(2, 2, cols-2, 3), strides=ac.strides+ac.strides[2:3]) win_ac = win_ac[np.ogrid[:2, :2, :cols-2] + [np.argmin(win_ac[0], axis=2)]] ret[:, -1, 1:-1] = win_ac[:, np.argmin(win_ac[0], axis=0), np.ogrid[:cols-2]] win_ac = as_strided(ac[:, :, :2], shape=(2, rows-2, 2, 3), strides=ac.strides+ac.strides[1:2]) win_ac = win_ac[np.ogrid[:2, :rows-2, :2] + [np.argmin(win_ac[0], axis=2)]] ret[:, 1:-1, 0] = win_ac[:, np.ogrid[:rows-2], np.argmin(win_ac[0], axis=1)] win_ac = as_strided(ac[:, :, -2:], shape=(2, rows-2, 2, 3), strides=ac.strides+ac.strides[1:2]) win_ac = win_ac[np.ogrid[:2, :rows-2, :2] + [np.argmin(win_ac[0], axis=2)]] ret[:, 1:-1, -1] = win_ac[:, np.ogrid[:rows-2], np.argmin(win_ac[0], axis=1)] # Fill the corners win_ac = ac[:, :2, :2] win_ac = win_ac[:, np.ogrid[:2], np.argmin(win_ac[0], axis=-1)] ret[:, 0, 0] = win_ac[:, np.argmin(win_ac[0], axis=-1)] win_ac = ac[:, :2, -2:] win_ac = win_ac[:, np.ogrid[:2], np.argmin(win_ac[0], axis=-1)] ret[:, 0, -1] = win_ac[:, np.argmin(win_ac[0], axis=-1)] win_ac = ac[:, -2:, -2:] win_ac = win_ac[:, np.ogrid[:2], np.argmin(win_ac[0], axis=-1)] ret[:, -1, -1] = win_ac[:, np.argmin(win_ac[0], axis=-1)] win_ac = ac[:, -2:, :2] win_ac = win_ac[:, np.ogrid[:2], np.argmin(win_ac[0], axis=-1)] ret[:, -1, 0] = win_ac[:, np.argmin(win_ac[0], axis=-1)] return ret

Return is an array (2, rows, cols) that can be unpacked into two arrays:

 >>> a = np.random.randint(100, size=(5,5)) >>> c = np.random.randint(100, size=(5,5)) >>> a array([[42, 54, 18, 88, 26], [80, 65, 83, 31, 4], [51, 52, 18, 88, 52], [ 1, 70, 5, 0, 89], [47, 34, 27, 67, 68]]) >>> c array([[94, 94, 29, 6, 76], [81, 47, 67, 21, 26], [44, 92, 20, 32, 90], [81, 25, 32, 68, 25], [49, 43, 71, 79, 77]]) >>> neighbor_min(a, c) array([[[42, 18, 18, 4, 4], [42, 18, 18, 4, 4], [ 1, 1, 0, 0, 0], [ 1, 1, 0, 0, 0], [ 1, 1, 0, 0, 0]], [[94, 29, 29, 26, 26], [94, 29, 29, 26, 26], [81, 81, 68, 68, 68], [81, 81, 68, 68, 68], [81, 81, 68, 68, 68]]])

The OP case could be resolved as:

 def bd_from_ac(a, c): b,d = neighbor_min(a, c) return a*b, d

And while there is a serious performance hit, it's pretty fast:

 In [3]: a = np.random.rand(1000, 1000) In [4]: c = np.random.rand(1000, 1000) In [5]: %timeit bd_from_ac(a, c) 1 loops, best of 3: 570 ms per loop

In fact, you are not using the coordinates of the minimum neighboring element for anything else, except that you extract it, so you can skip this part and create the min_neighbor function. If you do not want to resort to using cython for a fast loop, you will have to go with viewing the transitions, for example, in the Paul link. This usually converts your array (m, n) into a representation (m-2, n-2, 3, 3) the same data, and then you apply np.min to the last two axes.

Unfortunately, you have to apply it on one axis at a time, so you have to create a copy of the data (m-2, n-2, 3) . Fortunately, you can calculate at least two steps, first finish and minimize along one axis, and then along the other and get the same result. Thus, in most cases, you will have an intermediate storage the size of your input. If necessary, you can reuse the output array as an intermediate storage and avoid memory allocation, but this remains as an exercise ...

This function performs the following function. This is quite long because it has to deal not only with the central area, but also with special cases of four edges and four corners. In addition, this is a fairly compact implementation:

 def neighbor_min(a): rows, cols = a.shape ret = np.empty_like(a) # Fill in the center win_a = as_strided(a, shape=(m-2, n, 3), strides=a.strides+a.strides[:1]) win_a = win_a.min(axis=2) win_a = as_strided(win_a, shape=(m-2, n-2, 3), strides=win_a.strides+win_a.strides[1:]) ret[1:-1, 1:-1] = win_a.min(axis=2) # Fill the top, bottom, left and right borders win_a = as_strided(a[:2, :], shape=(2, cols-2, 3), strides=a.strides+a.strides[1:]) ret[0, 1:-1] = win_a.min(axis=2).min(axis=0) win_a = as_strided(a[-2:, :], shape=(2, cols-2, 3), strides=a.strides+a.strides[1:]) ret[-1, 1:-1] = win_a.min(axis=2).min(axis=0) win_a = as_strided(a[:, :2], shape=(rows-2, 2, 3), strides=a.strides+a.strides[:1]) ret[1:-1, 0] = win_a.min(axis=2).min(axis=1) win_a = as_strided(a[:, -2:], shape=(rows-2, 2, 3), strides=a.strides+a.strides[:1]) ret[1:-1, -1] = win_a.min(axis=2).min(axis=1) # Fill the corners ret[0, 0] = a[:2, :2].min() ret[0, -1] = a[:2, -2:].min() ret[-1, -1] = a[-2:, -2:].min() ret[-1, 0] = a[-2:, :2].min() return ret

Now you can do things like:

 >>> a = np.random.randint(10, size=(5, 5)) >>> a array([[0, 3, 1, 8, 9], [7, 2, 7, 5, 7], [4, 2, 6, 1, 9], [2, 8, 1, 2, 3], [7, 7, 6, 8, 0]]) >>> neighbor_min(a) array([[0, 0, 1, 1, 5], [0, 0, 1, 1, 1], [2, 1, 1, 1, 1], [2, 1, 1, 0, 0], [2, 1, 1, 0, 0]])

And your original question can be resolved as:

 def bd_from_ac(a, c): return a*neighbor_min(a), neighbor_min(c)

As a benchmark for performance:

 In [2]: m, n = 1000, 1000 In [3]: a = np.random.rand(m, n) In [4]: c = np.random.rand(m, n) In [5]: %timeit bd_from_ac(a, c) 1 loops, best of 3: 123 ms per loop

Numpy: quick calculations that take into account the neighbors of objects and their position inside the array

More articles: