The proposed method has a Python loop, so with large values of n_i it will be slow. However, unless n_i is large, you should not worry too much.
I created a sample input with the following code:
```python
import numpy as np

def sample_data(n_i, n_j, z_shape):
    # x: each row is sorted and pinned to the interval [0, 1000]
    x = np.random.rand(n_i, n_j) * 1000
    x.sort()
    x[:, 0] = 0
    x[:, -1] = 1000
    y = np.random.rand(n_i, n_j)
    # z: the points to interpolate at, also in [0, 1000]
    z = np.random.rand(*z_shape) * 1000
    return x, y, z
```
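For instance, calling it like this (the sizes are arbitrary, picked only to show the shapes involved) gives row-wise sorted x values between 0 and 1000, matching y values, and a handful of z points to interpolate at:

```python
x, y, z = sample_data(4, 6, (5,))
print(x.shape, y.shape, z.shape)   # (4, 6) (4, 6) (5,)
print(x[0])                        # sorted, starts at 0, ends at 1000
```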
And I tested two versions of linear interpolation with it:
```python
from scipy.interpolate import interp1d

def interp_1(x, y, z):
    # Loop over the rows, building an interp1d object for each one
    rows, cols = x.shape
    out = np.empty((rows,) + z.shape, dtype=y.dtype)
    for j in range(rows):
        out[j] = interp1d(x[j], y[j], kind='linear', copy=False)(z)
    return out

def interp_2(x, y, z):
    # Fully vectorized: locate the bin holding each z in every row, then
    # apply the two-point linear interpolation formula with fancy indexing
    rows, cols = x.shape
    row_idx = np.arange(rows).reshape((rows,) + (1,) * z.ndim)
    col_idx = np.argmax(x.reshape(x.shape + (1,) * z.ndim) > z, axis=1) - 1
    ret = y[row_idx, col_idx + 1] - y[row_idx, col_idx]
    ret /= x[row_idx, col_idx + 1] - x[row_idx, col_idx]
    ret *= z - x[row_idx, col_idx]
    ret += y[row_idx, col_idx]
    return ret
```
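As a quick sanity check (not part of the original timings, and assuming the definitions above are in scope), both versions should agree to floating-point precision on any sample input:

```python
x, y, z = sample_data(10, 50, (5,))
print(np.allclose(interp_1(x, y, z), interp_2(x, y, z)))  # True
```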
interp_1 is an optimized version of your code, following Dave's answer. interp_2 is a fully vectorized implementation of linear interpolation that avoids any Python loop. Writing code like this requires a good understanding of broadcasting and fancy indexing in numpy, and some parts will be less optimized than what interp1d does. A telling example is the search for the bin in which to interpolate a value: interp1d can probably break out of its loop as soon as it finds the bin, whereas the vectorized function above compares every value against all the bin edges.
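To make that bin-search step concrete, here is a tiny standalone illustration (purely for exposition, not part of the benchmark) of what the argmax trick does on a single row, and how it relates to np.searchsorted, which performs the same lookup with a binary search per value:

```python
import numpy as np

x_row = np.array([0., 10., 20., 50., 100.])
z = np.array([5., 60., 99.])

# Broadcast comparison: every z is compared against every breakpoint;
# argmax picks the first breakpoint strictly greater than z, and the
# -1 turns that into the index of the left edge of the containing bin.
col_idx = np.argmax(x_row[:, None] > z, axis=0) - 1
print(col_idx)                                        # [0 3 3]

# The same indices via binary search
print(np.searchsorted(x_row, z, side='right') - 1)    # [0 3 3]
```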
So the outcome will depend very much on what n_i and n_j are, and even on the length of the array z of values to interpolate. If n_j is small and n_i is large, you should expect an advantage for interp_2, and for interp_1 if it is the other way around. A smaller z should favor interp_2, a longer one interp_1.
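If you just want a quick read on your own sizes before running a full sweep, a one-off comparison like this is enough (the sizes here are arbitrary, and it assumes sample_data, interp_1 and interp_2 from above):

```python
import timeit

x, y, z = sample_data(1000, 100, (5,))
t1 = min(timeit.repeat(lambda: interp_1(x, y, z), repeat=5, number=1))
t2 = min(timeit.repeat(lambda: interp_2(x, y, z), repeat=5, number=1))
print('interp_1: %.4fs  interp_2: %.4fs' % (t1, t2))
```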
I actually ran both approaches for varying n_i and n_j, with z of shapes (5,) and (50,); here are the graphs:
[contour plots of the timing difference between interp_1 and interp_2 over n_i and n_j, for z.shape = (5,) and z.shape = (50,)]
So for z of shape (5,) you should go with interp_2 whenever n_j < 1000, and with interp_1 elsewhere. Not surprisingly, the threshold is different for z of shape (50,): now it is around n_j < 100. It seems tempting to conclude that you should stick with your code if n_j * len(z) > 5000 and switch to something like interp_2 above if not, but there is a good deal of extrapolation in that statement! If you want to experiment further, here is the code I used to produce the graphs.
```python
import timeit
import numpy as np
import matplotlib.pyplot as plt

# Sweep n_i and n_j on a log grid, time both versions, and
# contour-plot the difference between the two timings
n_s = np.logspace(1, 3.3, 25)
int_1 = np.empty((len(n_s),) * 2)
int_2 = np.empty((len(n_s),) * 2)
z_shape = (5,)

for i, n_i in enumerate(n_s):
    print(int(n_i))
    for j, n_j in enumerate(n_s):
        x, y, z = sample_data(int(n_i), int(n_j), z_shape)
        int_1[i, j] = min(timeit.repeat('interp_1(x, y, z)',
                                        'from __main__ import interp_1, x, y, z',
                                        repeat=10, number=1))
        int_2[i, j] = min(timeit.repeat('interp_2(x, y, z)',
                                        'from __main__ import interp_2, x, y, z',
                                        repeat=10, number=1))

cs = plt.contour(n_s, n_s, np.transpose(int_1 - int_2))
plt.clabel(cs, inline=1, fontsize=10)
plt.xlabel('n_i')
plt.ylabel('n_j')
plt.title('timeit(interp_2) - timeit(interp_1), z.shape=' + str(z_shape))
plt.show()
```
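And if you do decide to act on the rough n_j * len(z) > 5000 rule of thumb, a hypothetical dispatcher could be as simple as the sketch below; the name interp_auto and the default threshold are my own invention, extrapolated only from these particular runs, so measure on your own data before trusting it:

```python
def interp_auto(x, y, z, threshold=5000):
    # Crude heuristic: loop over rows with interp1d when each row does
    # a lot of work, otherwise use the fully vectorized version.
    if x.shape[1] * z.size > threshold:
        return interp_1(x, y, z)
    return interp_2(x, y, z)
```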