Different kernels for different architectures

I am wondering if there is an easy way to have different versions of a kernel for different architectures. Is there an easy way, or is the only option to define independent kernels in separate files and ask nvcc to compile each file for a different architecture?

cuda
2 answers

You can do this with preprocessor directives that test the __CUDA_ARCH__ macro. Something like:

    __global__ void kernel(...)
    {
    #if __CUDA_ARCH__ >= 350
        // do something
    #else
        // do something else
    #endif
    }
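To make the idea concrete, here is a minimal sketch of how it could look in a complete source file. The kernel name whichPath, the helper queryCompiledPath and the values written are made up for illustration and are not from the answer above. The one fact it relies on is that nvcc defines __CUDA_ARCH__ only during device compilation, once per target architecture, so each compiled copy of the kernel keeps a single branch.

```cuda
#include <cuda_runtime.h>

// nvcc compiles this kernel once per target architecture; the preprocessor
// keeps only the branch that matches the architecture of that pass.
__global__ void whichPath(int *out)
{
#if __CUDA_ARCH__ >= 350
    *out = 350;  // branch kept when compiling for compute capability 3.5+
#else
    *out = 0;    // branch kept for older architectures
#endif
}

// Hypothetical host helper: runs the kernel on the current device and
// returns the value it wrote, or -1 if any CUDA call fails.
int queryCompiledPath()
{
    int *d_out = nullptr;
    int h_out = -1;

    if (cudaMalloc(&d_out, sizeof(int)) != cudaSuccess) {
        return -1;
    }

    whichPath<<<1, 1>>>(d_out);

    // Check the launch itself first, then copy the result back
    // (cudaMemcpy waits for the kernel to finish).
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess) {
        err = cudaMemcpy(&h_out, d_out, sizeof(int), cudaMemcpyDeviceToHost);
    }
    cudaFree(d_out);

    return (err == cudaSuccess) ? h_out : -1;
}
```

For both branches to end up in the binary, the file has to be built for more than one target, for example by passing nvcc several -gencode arch=compute_XX,code=sm_XX options for targets your toolkit still supports. The kernel's signature should not depend on __CUDA_ARCH__, because the host compilation pass does not see that macro.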

With a little more C++, JackOLantern's answer can be modified slightly:

    template <unsigned int ARCH>
    __global__ void kernel(...)
    {
        switch (ARCH) {
        case 35:
            // do something
            break;
        case 30:
            // do something else
            break;
        case 20:
            // do something else
            break;
        default:
            // do something for all other ARCH
            break;
        }
    }

EDIT: To fix the error that @sgar91 pointed out:

You can launch the kernel according to the compute capability of your CUDA device, queried through

    cudaDeviceProp props;
    cudaGetDeviceProperties(&props, devId);
    unsigned int cc = props.major * 10 + props.minor;
    switch (cc) {
    case 35:
        kernel<35><<<1, 1>>>(/* args */);
        break;
    ...
    }
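Putting the two pieces together, a rough sketch of the template kernel plus the run-time dispatch might look like the following. The kernel body, the single int* argument, and the helpers capabilityCode and launchForDevice are assumptions made for illustration; only cudaGetDeviceProperties and the major * 10 + minor encoding come from the answer.

```cuda
#include <cuda_runtime.h>

// ARCH is a compile-time constant, so in each instantiation the compiler
// can discard the cases that are not taken.
template <unsigned int ARCH>
__global__ void kernel(int *out)
{
    switch (ARCH) {
    case 35:
        *out = 35;  // placeholder for the compute capability 3.5 variant
        break;
    case 30:
        *out = 30;  // placeholder for the compute capability 3.0 variant
        break;
    default:
        *out = 0;   // placeholder for every other architecture
        break;
    }
}

// Encode compute capability major.minor as major * 10 + minor (3.5 -> 35).
unsigned int capabilityCode(int major, int minor)
{
    return static_cast<unsigned int>(major * 10 + minor);
}

// Hypothetical helper: selects the instantiation at run time from the
// device's compute capability. d_out must be device memory on devId.
cudaError_t launchForDevice(int devId, int *d_out)
{
    cudaDeviceProp props;
    cudaError_t err = cudaGetDeviceProperties(&props, devId);
    if (err != cudaSuccess) {
        return err;
    }

    // Make sure the launch goes to the device that was queried.
    err = cudaSetDevice(devId);
    if (err != cudaSuccess) {
        return err;
    }

    switch (capabilityCode(props.major, props.minor)) {
    case 35:
        kernel<35><<<1, 1>>>(d_out);
        break;
    case 30:
        kernel<30><<<1, 1>>>(d_out);
        break;
    default:
        kernel<0><<<1, 1>>>(d_out);
        break;
    }
    return cudaGetLastError();  // reports launch failures
}
```

One thing to keep in mind with this design: every instantiation is still compiled for every architecture nvcc targets, so code that only exists on newer hardware would still need a __CUDA_ARCH__ guard inside the corresponding case.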