Why can't member variables be declared __shared__?

I would like to create an instance of a class in CUDA code that shares some of its members with the other threads in the same block.

However, when I try to compile the following code, I get the error: attribute "__shared__" does not apply here (nvcc version 4.2).

    class SharedSomething {
    public:
        __shared__ int i; // this is not allowed
    };

    __global__ void run() {
        SharedSomething something;
    }

What is the reason for this? Is there a way to achieve the desired behavior (class members shared by all threads of one block)?

2 answers

The other answer explains the rationale for the restriction. To answer the second part of the question: a simple workaround is to have the kernel declare the shared memory and to initialize a pointer to it that is held by the class, for example in the class constructor:

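    #include <cstdio>

    #define BLOCKDIM 128 // assumed block size; must match the launch configuration

    class Foo {
    public:
        // sPtr: shared-memory buffer declared by the kernel; gPtr: global input
        __device__ Foo(int *sPtr, const int *gPtr) : sharedPointer(sPtr) {
            sharedPointer[threadIdx.x] = gPtr[blockIdx.x * blockDim.x + threadIdx.x];
            __syncthreads(); // every thread in the block must construct a Foo
        }
        __device__ void useSharedData() {
            printf("my data: %d\n", sharedPointer[threadIdx.x]);
        }
    private:
        int *sharedPointer; // points into the block's shared memory
    };

    __global__ void example(int *gData) {
        __shared__ int sData[BLOCKDIM];
        Foo f(sData, gData);
        f.useSharedData();
    }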

Caveat: this code was written in the browser and has not been compiled or tested. It is a trivial example, but the concept carries over to real code; I have used this technique myself.
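For readers who want something they can compile, here is a slightly fuller sketch of the same pointer-to-shared-memory pattern. It is not the answer author's code: the names BlockBuffer, reverseInBlock and runReverseDemo are hypothetical, the block size of 128 is an assumption, and it assumes a toolkit recent enough to accept C++11 (much newer than nvcc 4.2). Each thread reads the element loaded by a different thread of its block, which only works because the buffer really is shared.

```cuda
#include <cassert>
#include <vector>
#include <cuda_runtime.h>

constexpr int BLOCKDIM = 128;  // assumed block size; must match the launch

// Hypothetical helper class: it does not own shared memory, it only
// holds a pointer to a buffer that the kernel declared.
class BlockBuffer {
public:
    __device__ BlockBuffer(int *sPtr, const int *gPtr) : shared_(sPtr) {
        shared_[threadIdx.x] = gPtr[blockIdx.x * blockDim.x + threadIdx.x];
        __syncthreads();  // every thread of the block must reach this point
    }
    // Read the element loaded by the "mirror" thread of the same block.
    __device__ int mirrored() const {
        return shared_[blockDim.x - 1 - threadIdx.x];
    }
private:
    int *shared_;
};

__global__ void reverseInBlock(const int *in, int *out) {
    __shared__ int sData[BLOCKDIM];  // one copy per thread block
    BlockBuffer buf(sData, in);
    out[blockIdx.x * blockDim.x + threadIdx.x] = buf.mirrored();
}

// Host driver: returns true if every block reversed its own segment.
bool runReverseDemo(int numBlocks) {
    assert(numBlocks > 0);
    const int n = numBlocks * BLOCKDIM;
    std::vector<int> h(n);
    for (int i = 0; i < n; ++i) h[i] = i;

    int *dIn = nullptr, *dOut = nullptr;
    if (cudaMalloc(&dIn, n * sizeof(int)) != cudaSuccess) return false;
    if (cudaMalloc(&dOut, n * sizeof(int)) != cudaSuccess) {
        cudaFree(dIn);
        return false;
    }
    cudaMemcpy(dIn, h.data(), n * sizeof(int), cudaMemcpyHostToDevice);
    reverseInBlock<<<numBlocks, BLOCKDIM>>>(dIn, dOut);
    // The copy back waits for the kernel and reports any earlier error.
    cudaError_t err =
        cudaMemcpy(h.data(), dOut, n * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(dIn);
    cudaFree(dOut);
    if (err != cudaSuccess) return false;

    for (int i = 0; i < n; ++i) {
        const int block = i / BLOCKDIM, lane = i % BLOCKDIM;
        if (h[i] != block * BLOCKDIM + (BLOCKDIM - 1 - lane)) return false;
    }
    return true;
}
```

Calling runReverseDemo(4) from main and checking the returned flag is enough to try it. The class stays an ordinary per-thread object; only the buffer it points to lives in shared memory, which is what sidesteps the restriction.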


Objects marked __shared__ reside in shared memory, which is allocated per thread block. Shared memory is limited in size and has the same lifetime as the thread block.

This is why class members cannot be declared __shared__: their lifetime would be governed by the thread block rather than by the class instance. Static class members might be allowed to be __shared__, but I have not checked.

See Section B.2.3 of the CUDA Programming Guide for details.
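The per-block lifetime described above can be seen in a small sketch. The kernel and host helper names (perBlockValue, runPerBlockDemo) are hypothetical, and the launch sizes in the usage note are arbitrary. Thread 0 of each block writes a __shared__ variable, and after a barrier every thread of that block reads the same value, while other blocks see their own separate copy.

```cuda
#include <cassert>
#include <vector>
#include <cuda_runtime.h>

__global__ void perBlockValue(int *out) {
    // One instance per thread block; it exists for as long as the block runs
    // and is not tied to any object created by an individual thread.
    __shared__ int blockValue;
    if (threadIdx.x == 0) blockValue = 100 * blockIdx.x;
    __syncthreads();  // make thread 0's write visible to the whole block
    out[blockIdx.x * blockDim.x + threadIdx.x] = blockValue;
}

// Host helper: returns true if every thread saw its own block's value.
bool runPerBlockDemo(int numBlocks, int threadsPerBlock) {
    assert(numBlocks > 0 && threadsPerBlock > 0);
    const int n = numBlocks * threadsPerBlock;
    std::vector<int> h(n, -1);

    int *dOut = nullptr;
    if (cudaMalloc(&dOut, n * sizeof(int)) != cudaSuccess) return false;
    perBlockValue<<<numBlocks, threadsPerBlock>>>(dOut);
    // The copy back waits for the kernel and reports any earlier error.
    cudaError_t err =
        cudaMemcpy(h.data(), dOut, n * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(dOut);
    if (err != cudaSuccess) return false;

    for (int i = 0; i < n; ++i) {
        if (h[i] != 100 * (i / threadsPerBlock)) return false;
    }
    return true;
}
```

For example, runPerBlockDemo(3, 64) launches three blocks of 64 threads. A per-thread object with a member of this kind would have no sensible meaning, since each thread's object is created and destroyed independently of the block-wide storage.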


Source: https://habr.com/ru/post/926836/

