CUDA: Host-to-device bandwidth higher than peak PCIe bandwidth?

I used the attached plot in another question as well. You can see that the maximum bandwidth is more than 5.5 GB/s. I used the NVIDIA bandwidthTest program from the CUDA sample code to measure the bandwidth from host to device and from device to host. The system has 12 Intel Westmere cores across two sockets and four Tesla C2050 GPUs in four PCIe Gen2 slots. My question: as I understand it, the maximum bandwidth of PCIe x16 Gen2 is 4 GB/s in one direction, so how can I be getting noticeably more than that on host-to-device transfers?

I mean, each PCIe slot is connected to the CPU through an I/O hub, which in turn connects to the CPU via QPI (which has much higher bandwidth).

1 answer

The peak bandwidth of PCIe x16 Gen2 is 8 GB/s in each direction, not 4 GB/s. At 5.5 GB/s you are not exceeding the peak.
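To see where 8 GB/s comes from, here is a small sketch of the arithmetic in Python. It assumes the standard PCIe figures of 5 GT/s per lane for Gen2 and 2.5 GT/s per lane for Gen1, both with 8b/10b line encoding; the function name and its parameters are only illustrative and do not come from any library.

```python
def pcie_peak_gb_per_s(gt_per_s_per_lane, lanes, payload_bits=8, line_bits=10):
    """Theoretical one-direction PCIe bandwidth in GB/s (1 GB = 1e9 bytes)."""
    # Each transfer carries one bit on the wire; with 8b/10b encoding
    # only 8 of every 10 line bits are payload.
    payload_gbit_per_s = gt_per_s_per_lane * payload_bits / line_bits
    # Convert bits to bytes, then scale by the number of lanes.
    return payload_gbit_per_s / 8 * lanes

gen2_x16 = pcie_peak_gb_per_s(5.0, 16)   # Gen2: 5 GT/s per lane (assumed spec value)
gen1_x16 = pcie_peak_gb_per_s(2.5, 16)   # Gen1: 2.5 GT/s per lane (assumed spec value)

print(f"Gen2 x16: {gen2_x16:.1f} GB/s per direction")
print(f"Gen1 x16: {gen1_x16:.1f} GB/s per direction")
```

This gives 8 GB/s per direction for Gen2 x16 and 4 GB/s for Gen1 x16, so the 4 GB/s in the question may simply be the Gen1 number. A measured 5.5 GB/s sits below the Gen2 theoretical limit, which is expected because packet and protocol overhead keep real transfers under the raw figure.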

