Efficient way to calculate three-dimensional indices from a 1D array view

Question

I have 3D data that is stored in a 1D array. I compute 1D indices as follows:

index = i + j * WIDTH + k * WIDTH * HEIGHT

Then I need to recover the original indices i, j, k from index. The obvious way to do this is something like:

 k = index / (WIDTH * HEIGHT)
 j = (index % (WIDTH * HEIGHT)) / WIDTH
 i = index - j * WIDTH - k * WIDTH * HEIGHT

But I wonder if there is a more efficient way to do this, ideally without the modulo...

The context of this question: I have a CUDA kernel where I access the data and compute the indices i, j, k (index corresponds to the unique thread ID). So maybe there is some CUDA-specific way to do this? I think this is a fairly common problem, but I could not find a better way to do it...

Thanks for your ideas!

+4

c++ math cuda

Jaa-c Dec 15 '12 at 16:33

3 answers

Try rounding your dimensions up to the next power of two. Then you can use bit shifts and masks instead of multiplications, divisions and modulo.

 index = i | (j | k << HEIGHT_BITS) << WIDTH_BITS;
 k = index >> (WIDTH_BITS + HEIGHT_BITS);
 j = (index >> WIDTH_BITS) & ((1 << HEIGHT_BITS) - 1);
 i = index & ((1 << WIDTH_BITS) - 1);
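For illustration, here is a minimal host-side C++ sketch of the shift-and-mask scheme above. The bit widths (7 bits each, i.e. dimensions padded to 128), the `Unpacked` struct and the function names `pack`/`unpack` are assumptions made for the example, not part of the original answer.

```cpp
#include <cassert>
#include <cstdint>

// Assumed example sizes: WIDTH and HEIGHT both padded to 1 << 7 = 128.
constexpr unsigned WIDTH_BITS = 7;
constexpr unsigned HEIGHT_BITS = 7;

// Hypothetical helper type holding the three recovered indices.
struct Unpacked {
    std::uint32_t i, j, k;
};

// Combine (i, j, k) into one index; i occupies the lowest bits, k the highest.
inline std::uint32_t pack(std::uint32_t i, std::uint32_t j, std::uint32_t k) {
    assert(i < (1u << WIDTH_BITS));
    assert(j < (1u << HEIGHT_BITS));
    assert(k < (1u << (32u - WIDTH_BITS - HEIGHT_BITS)));
    return i | ((j | (k << HEIGHT_BITS)) << WIDTH_BITS);
}

// Recover (i, j, k) using only shifts and masks.
inline Unpacked unpack(std::uint32_t index) {
    Unpacked r;
    r.i = index & ((1u << WIDTH_BITS) - 1u);
    r.j = (index >> WIDTH_BITS) & ((1u << HEIGHT_BITS) - 1u);
    r.k = index >> (WIDTH_BITS + HEIGHT_BITS);
    return r;
}
```

With these widths, `pack(i, j, k)` gives the same value as `i + j * 128 + k * 128 * 128`, so the array layout is unchanged apart from the padding. In a CUDA build the two functions would presumably be marked `__host__ __device__`.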

+5

Mvg Dec 15 '12 at 17:17

Only for cases where the dimensions are powers of 2: use a bitmask. For example, if the first index can take 4 values (0 to 3), it should occupy the lowest 2 bits of the index.

+1

Leonid Volnitsky Dec 15 '12 at 17:14

Jonathan dursi · Accepted Answer · 2012-12-17T01:02:38+0000

What you have is fine; if you want to avoid the modulo (since it is very expensive on GPUs), you can just do with j what you did with i:

 j = (index - (k*WIDTH*HEIGHT))/WIDTH

If you want the logic to be a little clearer and don't need to keep the original index, you can do

 k = index/(WIDTH*HEIGHT);
 index -= k*WIDTH*HEIGHT;
 j = index/WIDTH;
 index -= j*WIDTH;
 i = index/1;

which then extends fairly easily to arbitrary dimensions. You could try tweaking the above, say by precomputing WIDTH*HEIGHT, but I would just turn on optimization and trust the compiler to do that for you.
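As a rough sketch, the divide-and-subtract approach above could be wrapped up like this in plain C++. The `Index3` struct and the names `flatten`/`unflatten` are hypothetical, introduced only for the example, and width and height are passed as parameters rather than taken from the WIDTH and HEIGHT constants.

```cpp
#include <cassert>

// Hypothetical helper type; not from the original posts.
struct Index3 {
    int i, j, k;
};

// Flatten (i, j, k) using the layout from the question.
inline int flatten(int i, int j, int k, int width, int height) {
    return i + j * width + k * width * height;
}

// Recover (i, j, k) with two divisions and no modulo, peeling off the
// slowest-varying dimension first.
inline Index3 unflatten(int index, int width, int height) {
    assert(index >= 0 && width > 0 && height > 0);
    const int slice = width * height;  // elements per k-slice
    const int k = index / slice;
    index -= k * slice;                // what is left lies within one slice
    const int j = index / width;
    const int i = index - j * width;   // what is left lies within one row
    return Index3{i, j, k};
}
```

For example, with width 10 and height 20, `flatten(3, 7, 5, 10, 20)` is 1073 and `unflatten(1073, 10, 20)` gives back (3, 7, 5). To call this from a kernel, the functions would presumably need to be marked `__host__ __device__`.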

The suggestions to round up to a power of 2 are correct in the sense that this will speed up the index calculation, but at a fairly significant cost. Even in the (not so bad) case of WIDTH=HEIGHT=100, padding to WIDTH=HEIGHT=128 increases the memory requirements of your 3D array by over 60%, and memory on the GPU is usually already tight; and making the array dimensions powers of two can lead to problems with bank conflicts, depending on your access patterns.
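As a quick check of the padding cost, here is a small C++ sketch that computes how much larger the array gets when width and height are rounded up to powers of two. The function names `next_pow2` and `padding_ratio` are made up for the example. It assumes only the two inner dimensions are padded, since the outermost index k needs no mask.

```cpp
#include <cassert>

// Round n up to the next power of two (returns n if it already is one).
inline unsigned next_pow2(unsigned n) {
    assert(n >= 1u && n <= (1u << 31));
    unsigned p = 1u;
    while (p < n) {
        p <<= 1;
    }
    return p;
}

// Ratio of padded to unpadded element count when width and height
// (but not depth) are rounded up to powers of two.
inline double padding_ratio(unsigned width, unsigned height) {
    const double padded = static_cast<double>(next_pow2(width)) *
                          static_cast<double>(next_pow2(height));
    const double original = static_cast<double>(width) *
                            static_cast<double>(height);
    return padded / original;
}
```

For WIDTH=HEIGHT=100 this gives 128 * 128 / (100 * 100) = 1.6384, i.e. roughly 64% more memory.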
