The CUDA Programming Guide states that
"Bandwidth is one of the most important factors affecting performance. Almost all code changes should be made in the context of how they affect bandwidth."
The guide then calculates the theoretical bandwidth, which comes out to hundreds of gigabytes per second. What I don't understand is why the number of bytes read from / written to global memory should reflect how well a kernel is optimized.
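For context, the effective bandwidth the guide compares against the theoretical number is just total bytes moved through global memory divided by kernel time. A minimal sketch of that calculation (the array size and timing here are made-up illustration values, not from the guide):

```python
# Effective bandwidth as defined in the CUDA Best Practices Guide:
#   ((bytes_read + bytes_written) / 1e9) / seconds  ->  GB/s
def effective_bandwidth_gbps(bytes_read, bytes_written, seconds):
    return (bytes_read + bytes_written) / 1e9 / seconds

# Hypothetical example: a kernel reads and writes a 32M-element
# float array (4 bytes per element) and takes 2 ms.
n = 32 * 1024 * 1024
bytes_per_direction = n * 4
print(effective_bandwidth_gbps(bytes_per_direction, bytes_per_direction, 2e-3))
# -> 134.217728 (GB/s)
```

So the metric only counts global-memory traffic, which is exactly why a compute-heavy kernel working out of shared memory and registers scores low on it.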
If I have a kernel that does intensive computation on data held in shared memory and/or registers, with only a single read from global memory at the start and a single write back at the end, its effective bandwidth will of course be small, while the kernel itself can be very efficient.
Can anyone explain what bandwidth means in this context?

Thanks!