Efficient 2D array reduction in CUDA?

Question

The CUDA SDK includes sample code and presentation slides on efficient one-dimensional reduction. I have also seen several papers on implementing one-dimensional reductions and prefix scans in CUDA.

Is there efficient CUDA code available for reducing a dense two-dimensional array? Pointers to code or relevant papers would be appreciated.

matrix reduce cuda

Bradford larsen Aug 4 '10 at 0:52

2 answers

KoppeKTop · Answer 1 · 2010-08-05T19:20:51+0000

I don't know exactly what problem you are trying to solve, but you could simply treat the 2D array as one long 1D array and use the SDK code to do the reduction. Plain arrays in CUDA are just 1D blocks of memory with special addressing rules, so why not take advantage of that?
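As a rough sketch of that idea (not the SDK's own code), the kernel below sums a dense row-major matrix by viewing it as rows * cols contiguous floats. The names reduce_sum_kernel and sum_matrix, the launch sizes, and the final host-side sum over block partials are all assumptions made for illustration. Error checking is omitted for brevity. This only applies when the rows are stored contiguously; memory from cudaMallocPitch has padding between rows and cannot be flattened this way.

```cuda
#include <cuda_runtime.h>
#include <cstddef>
#include <vector>

// Each block accumulates a partial sum of the flattened array into block_sums.
// Assumes blockDim.x is a power of two.
__global__ void reduce_sum_kernel(const float* in, float* block_sums, size_t n)
{
    extern __shared__ float sdata[];
    unsigned tid = threadIdx.x;

    // Grid-stride loop: each thread sums a strided subset of the n elements.
    float acc = 0.0f;
    for (size_t i = blockIdx.x * (size_t)blockDim.x + tid; i < n;
         i += (size_t)blockDim.x * gridDim.x)
        acc += in[i];
    sdata[tid] = acc;
    __syncthreads();

    // Tree reduction in shared memory.
    for (unsigned s = blockDim.x / 2; s > 0; s >>= 1) {
        if (tid < s) sdata[tid] += sdata[tid + s];
        __syncthreads();
    }
    if (tid == 0) block_sums[blockIdx.x] = sdata[0];
}

// Hypothetical host helper: sums a dense row-major rows x cols matrix.
float sum_matrix(const std::vector<float>& m, int rows, int cols)
{
    const size_t n = (size_t)rows * cols;   // the 2D array seen as 1D
    const int threads = 256, blocks = 64;   // illustrative launch sizes

    float *d_in = nullptr, *d_part = nullptr;
    cudaMalloc(&d_in, n * sizeof(float));
    cudaMalloc(&d_part, blocks * sizeof(float));
    cudaMemcpy(d_in, m.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    reduce_sum_kernel<<<blocks, threads, threads * sizeof(float)>>>(d_in, d_part, n);

    // Finish on the host: add up the per-block partial sums.
    std::vector<float> part(blocks);
    cudaMemcpy(part.data(), d_part, blocks * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_part);

    float total = 0.0f;
    for (float p : part) total += p;
    return total;
}
```

Calling sum_matrix(m, rows, cols) with a std::vector<float> of rows * cols elements returns the sum of all entries. A tuned version would follow the optimizations in the SDK reduction sample rather than this simple two-step scheme.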

Anycorn · Answer 2 · 2010-08-04T05:32:35+0000

Matrix reduction can be somewhat easier to implement, since the reduction of each row or column vector can be performed independently. You can have each thread process one column or row (depending on whether the matrix is row-major or column-major) so that memory reads are coalesced. I doubt you can gain much more performance without resorting to the texture or constant cache, where locality can become important.
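A minimal sketch of the one-thread-per-column approach, assuming a dense row-major matrix and a sum reduction: neighbouring threads then read neighbouring addresses on every loop iteration, which is the coalesced pattern described above. The names column_sums_kernel and column_sums and the block size are hypothetical, and error checking is omitted.

```cuda
#include <cuda_runtime.h>
#include <cstddef>
#include <vector>

// One thread per column of a row-major rows x cols matrix.
// At each step r, threads c, c+1, ... read consecutive floats of row r.
__global__ void column_sums_kernel(const float* m, float* out, int rows, int cols)
{
    int c = blockIdx.x * blockDim.x + threadIdx.x;
    if (c >= cols) return;

    float acc = 0.0f;
    for (int r = 0; r < rows; ++r)
        acc += m[(size_t)r * cols + c];
    out[c] = acc;
}

// Hypothetical host helper: returns the sum of each column.
std::vector<float> column_sums(const std::vector<float>& m, int rows, int cols)
{
    float *d_m = nullptr, *d_out = nullptr;
    cudaMalloc(&d_m, (size_t)rows * cols * sizeof(float));
    cudaMalloc(&d_out, cols * sizeof(float));
    cudaMemcpy(d_m, m.data(), (size_t)rows * cols * sizeof(float),
               cudaMemcpyHostToDevice);

    const int threads = 128;                          // illustrative block size
    const int blocks = (cols + threads - 1) / threads;
    column_sums_kernel<<<blocks, threads>>>(d_m, d_out, rows, cols);

    std::vector<float> out(cols);
    cudaMemcpy(out.data(), d_out, cols * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_m);
    cudaFree(d_out);
    return out;
}
```

For a column-major matrix the same pattern would give one thread per row instead. With few columns this layout leaves most of the GPU idle, so it is likely to pay off only when the reduced dimension is short relative to the independent one.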
