How to make efficient two-dimensional convolution on large arrays

I have a problem where I need to convolve one very large 2D array (a file on disk) with a smaller array that fits into memory. scipy.signal.fftconvolve is great when the arrays fit in memory, but it doesn't help when they don't. Is there any reasonable approach besides looping over every point of both arrays and computing the convolution manually? I am not good at math, but I wonder if it is possible to split fftconvolve into parts and combine them with a little overlap? Something else?

+4
3 answers

I can offer you two different approaches (although I will not risk providing example code, I hope you don't mind working it out yourself):

1) Use numpy.memmap: "Memory-mapped files are used for accessing small segments of large files on disk, without reading the entire file into memory. (...) The memmap object can be used anywhere an ndarray is accepted."

2) Divide the large array into tiles, convolve each tile with mode='full', and overlay the results. Each tile's result gets a "border" around it about as wide as your convolution kernel, and those borders are what overlap between neighbouring tiles (a sketch follows after this list).

You can combine both approaches (for example, read tiles from a memmapped file and overlay the results into another memmapped file that holds the final result).
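For illustration, a minimal sketch of approach 2 with the array split along rows (the function name overlap_add_rows is just a placeholder, and scipy.signal.fftconvolve is assumed for the per-tile convolution):

import numpy as np
from scipy.signal import fftconvolve

def overlap_add_rows(big, kernel, chunk_rows):
    # Convolve `big` with `kernel` by splitting `big` into row blocks,
    # convolving each block with mode='full' and summing the overlapping
    # borders back together (overlap-add). Because convolution is linear,
    # this equals convolving the whole array at once.
    kr, kc = kernel.shape
    out = np.zeros((big.shape[0] + kr - 1, big.shape[1] + kc - 1),
                   dtype=np.result_type(big, kernel))
    for start in range(0, big.shape[0], chunk_rows):
        block = big[start:start + chunk_rows]
        # the 'full' result of each block spills kr - 1 rows past the block;
        # that spill is the "border" that gets overlaid onto the next block
        out[start:start + block.shape[0] + kr - 1] += fftconvolve(block, kernel, mode='full')
    return out

# sanity check on arrays small enough to fit in memory:
# a = np.random.rand(1000, 200); k = np.random.rand(15, 15)
# np.allclose(overlap_add_rows(a, k, 256), fftconvolve(a, k, mode='full'))  -> True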

+1

Furthering heltonbiker's answer: first write your on-disk data into a memmapped file, then read that memmapped file back in chunks, convolve each chunk, and write the result into a second memmapped file.

Something like this:

import numpy as np

#you need to know how big your data matrix is
#todo: assign nrows and ncols based on your data.

fp = np.memmap('your_new_memmap_filename.dat', dtype='float32', mode='w+', shape=(nrows, ncols))  #create file with appropriate dimensions

with open('yourdatafile.txt', 'r') as data:
    for i, line in enumerate(data):
        #comma delimited? header row? adjust as needed
        fp[i, :] = [float(x) for x in line.strip().split(',')]

del fp  #see the documentation: deleting the memmap flushes the output and closes the file

Then, in a second script, convolve the memmapped data in chunks:

import numpy as np
from scipy.signal import fftconvolve

convolve_matrix = somenumpyarray  #your small, in-memory kernel
fp_result = np.memmap('your_results_memmap_filename.dat', dtype='float32', mode='w+', shape=(nrows, ncols))

#open big file read only
fp = np.memmap('your_new_memmap_filename.dat', dtype='float32', mode='r', shape=(nrows, ncols))

chunksize = 10000 #?
for i in range(nrows // chunksize):
    chunk = fp[i * chunksize: (i + 1) * chunksize, :]
    #mode='same' keeps the output the same shape as the chunk; rows near chunk
    #edges will show boundary artifacts unless you overlap the chunks (see below)
    res = fftconvolve(chunk, convolve_matrix, mode='same')
    fp_result[i * chunksize: (i + 1) * chunksize, :] = res

#deal with the remainder rows that don't fill a whole chunk
done = (nrows // chunksize) * chunksize
if done < nrows:
    fp_result[done:, :] = fftconvolve(fp[done:, :], convolve_matrix, mode='same')

del fp_result
del fp

This works, but done serially it is slow. If you need more speed, you can parallelize the per-chunk convolutions with Joblib (https://pythonhosted.org/joblib/parallel.html), or write a proper 2D tiler/reassembler like the ones used for GIS rasters. The tiler hands you, for each tile, a source slice (the tile plus enough overlap to cover the kernel), a result slice (where the tile's output goes in the result file), and a mini slice (which part of the convolved tile to keep, i.e. everything except the contaminated border).
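A minimal sketch of such a tiler, reusing nrows, chunksize and convolve_matrix from the code above (the name make_slices, the (start, stop) tuple format, and the pad = kernel_rows // 2 overlap are my choices; it assumes mode='same' on each tile):

def make_slices(nrows, chunksize, pad):
    # pad = kernel_rows // 2: rows within `pad` of a tile edge are contaminated
    # by the artificial boundary, so each tile is read with `pad` extra rows on
    # both sides and only the clean interior is written back.
    source_slices, result_slices, mini_slices = [], [], []
    for start in range(0, nrows, chunksize):
        stop = min(start + chunksize, nrows)
        src_start = max(0, start - pad)
        src_stop = min(nrows, stop + pad)
        source_slices.append((src_start, src_stop))
        result_slices.append((start, stop))
        # rows of the convolved tile that correspond to [start, stop)
        mini_slices.append((start - src_start, start - src_start + (stop - start)))
    return source_slices, result_slices, mini_slices

source_slices, result_slices, mini_slices = make_slices(nrows, chunksize, convolve_matrix.shape[0] // 2)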

#big_fp is the read-only source memmap, big_result_fp the result memmap
for source_slice, result_slice, mini_slice in zip(source_slices, result_slices, mini_slices):
    matrix2convolve = big_fp[source_slice[0]:source_slice[1], :]
    convolve_result = fftconvolve(matrix2convolve, convolve_matrix, mode='same')
    big_result_fp[result_slice[0]:result_slice[1], :] = convolve_result[mini_slice[0]:mini_slice[1], :]
+1

I'm not 100% sure this will help in your case, but here is another way to look at it.

The value of the convolution at any output point depends only on the input values inside the kernel's footprint around that point (the output at 0,0 needs only the neighbourhood of 0,0, the output at 0,1 only the neighbourhood of 0,1, and so on). So you never need the whole array in memory at once: you can stream the big array through a small buffer a few rows at a time, write each finished output row to disk, and discard input rows as soon as nothing left to compute needs them.

Also, if you are applying several kernels one after another, you can combine them into a single kernel first (by convolving the kernels with each other), so you only make one pass over the big data: http://godsnotwheregodsnot.blogspot.com/2015/02/combining-convolution-kernels.html

Some example code: http://pastebin.com/bk0A2Z5D

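To make the kernel-combining idea concrete, a small sketch (this is just the associativity of convolution, not code from the linked post):

import numpy as np
from scipy.signal import fftconvolve

a = np.random.rand(500, 500)   # stand-in for (a tile of) the big array
k1 = np.random.rand(5, 5)
k2 = np.random.rand(7, 7)

# one pass with the combined kernel ...
combined = fftconvolve(k1, k2, mode='full')
one_pass = fftconvolve(a, combined, mode='full')

# ... gives the same result as two passes over the data
two_passes = fftconvolve(fftconvolve(a, k1, mode='full'), k2, mode='full')
print(np.allclose(one_pass, two_passes))  # True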
-1
