What cards and compute capabilities are needed to fully utilize the features of CUDA 5?

A stable version of CUDA 5 has just been released. There are several new terms, such as Kepler, along with the possibility of using MPI with better performance and of running up to 32 applications simultaneously on the same card. I am a little confused, so I am looking for answers to the following questions:

  • What cards and compute capabilities are needed to take full advantage of the features of CUDA 5?
  • Are the new features, such as GPUDirect, Dynamic Parallelism, and Hyper-Q, only available on the Kepler architecture?
  • If we have Fermi cards, are there any benefits to using CUDA 5? Does it bring benefits other than being able to use Nsight on Linux and Eclipse? I think the most important feature is the ability to build libraries.
  • Did you see any performance improvements just by switching from CUDA 4 to CUDA 5? (I got some speedups on Linux machines.)

I found some documents on the topic, but a shorter, clearer description would help.

PS: Please do not limit the answer to the questions above. Perhaps I am missing some similar questions.

1 answer

Compute capability 3.5 (e.g. GK110) is required for dynamic parallelism because earlier GPUs do not have the hardware needed for threads to launch kernels or to inject other API calls directly into the hardware command queue.

Hyper-Q also requires compute capability 3.5.

The SHFL intrinsics require compute capability 3.0 (GK104).

Device code linking, Nsight EE, nvprof, performance improvements, and bug fixes in CUDA 5 also benefit Fermi and earlier GPUs.
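The requirements above can be written down as a small host-side lookup. This is only a sketch, not part of the CUDA API: the struct and function names are made up for illustration, and it assumes that "requires CC x.y" means "x.y or newer". In a real program the major and minor numbers would come from the device properties reported by the CUDA runtime.

```cpp
#include <string>
#include <vector>

// Hypothetical helper type: compute capability of a device,
// e.g. {3, 5} for GK110 or {3, 0} for GK104.
struct ComputeCapability {
    int major_ver;
    int minor_ver;
};

// True when `cc` is at least `req_major.req_minor`.
// Assumption: a stated requirement is a minimum, so newer devices also qualify.
constexpr bool at_least(ComputeCapability cc, int req_major, int req_minor) {
    return cc.major_ver > req_major ||
           (cc.major_ver == req_major && cc.minor_ver >= req_minor);
}

// Hardware-gated features named in the answer above.
constexpr bool supports_dynamic_parallelism(ComputeCapability cc) {
    return at_least(cc, 3, 5);
}
constexpr bool supports_hyper_q(ComputeCapability cc) {
    return at_least(cc, 3, 5);
}
constexpr bool supports_shfl(ComputeCapability cc) {
    return at_least(cc, 3, 0);
}

// Lists the hardware-gated features a device can use. Toolkit-level
// improvements (device code linking, Nsight EE, nvprof, bug fixes) are
// not listed because they do not depend on compute capability.
std::vector<std::string> hardware_gated_features(ComputeCapability cc) {
    std::vector<std::string> features;
    if (supports_shfl(cc)) features.push_back("SHFL intrinsics");
    if (supports_hyper_q(cc)) features.push_back("Hyper-Q");
    if (supports_dynamic_parallelism(cc)) features.push_back("dynamic parallelism");
    return features;
}
```

For a Fermi card (CC 2.x) the returned list is empty, which matches the point above: on Fermi the gains from CUDA 5 come from the toolkit rather than from new hardware features.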


Source: https://habr.com/ru/post/928092/

