An IOMMU can be very useful in that it provides a set of mapping registers. It can arrange for any physical memory to appear within the address range accessible to a device, and it can make physically scattered buffers look contiguous to devices. This is not good for third-party PCI/PCI-Express cards or remote machines attempting to access the raw physical offset of an nVidia GPU, as it may result in not actually reaching the intended regions of memory, or in the IOMMU unit inhibiting or restricting such accesses on a per-card basis. This must be disabled, because
"RDMA for GPUDirect currently relies on all physical addresses that are the same in terms of PCI devices .
-nVidia, Design Considerations for RDMA and GPUDirect
When drivers attempt to use the CPU's MMU and map memory-mapped I/O (MMIO) regions for use in kernel space, they typically keep the returned address of the mapping to themselves. Because each driver operates within its own context or namespace, exchanging these mappings between the nVidia drivers and other third-party vendor drivers that wish to support RDMA+GPUDirect would be very difficult and would result in a vendor-specific solution (possibly even product-specific, if drivers vary greatly between the third party's products). Also, today's operating systems do not have a good solution for exchanging MMIO mappings between drivers, so nVidia exports several functions that allow third-party drivers to easily access this information from within kernel space itself.
nVidia relies on "physical addressing" to access each card via RDMA for GPUDirect. This greatly simplifies moving data from one machine to a remote machine's PCI-Express bus using that machine's physical addressing scheme, without worrying about the problems associated with virtual addressing (for example, resolving virtual addresses to physical addresses). Each card has a physical address at which it resides and at which it can be accessed; only a small amount of logic needs to be added to a third-party driver attempting to perform RDMA operations. Also, these 32- or 64-bit Base Address Registers are part of the standard PCI configuration space, so the physical address of the card can easily be obtained simply by reading from its BAR, rather than obtaining the mapped address the nVidia driver acquired when it attached to the card. nVidia's Unified Virtual Addressing (UVA) takes care of mapping those physical addresses into a seemingly contiguous region of memory for user-space applications, for example:
[figure: the unified virtual address space, divided into CPU, GPU, and FREE regions]
These memory areas are further divided into three types: CPU, GPU, and FREE, all of which are documented here.
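If you want to see which of these regions a given address belongs to, the driver API can tell you. Below is a minimal user-space sketch (error handling abbreviated, device 0 assumed) that classifies a device buffer and a pinned host buffer with cuPointerGetAttribute(...) and CU_POINTER_ATTRIBUTE_MEMORY_TYPE; addresses CUDA knows nothing about (the FREE areas) simply make the call fail.

```c
/* Sketch: ask the CUDA driver which part of the unified VA space an
 * address belongs to. Error handling is abbreviated. */
#include <cuda.h>
#include <stdint.h>
#include <stdio.h>

static void classify(const char *name, CUdeviceptr ptr)
{
    CUmemorytype memtype;

    /* CU_POINTER_ATTRIBUTE_MEMORY_TYPE reports whether a UVA address
     * refers to host (CPU) or device (GPU) memory. */
    if (cuPointerGetAttribute(&memtype, CU_POINTER_ATTRIBUTE_MEMORY_TYPE,
                              ptr) != CUDA_SUCCESS) {
        printf("%s: not tracked by CUDA (free/unmapped)\n", name);
        return;
    }
    printf("%s: %s memory\n", name,
           memtype == CU_MEMORYTYPE_DEVICE ? "GPU" : "CPU");
}

int main(void)
{
    CUdevice dev;
    CUcontext ctx;
    CUdeviceptr dptr;
    void *hptr;

    cuInit(0);
    cuDeviceGet(&dev, 0);
    cuCtxCreate(&ctx, 0, dev);

    cuMemAlloc(&dptr, 1 << 20);       /* device (GPU) buffer        */
    cuMemAllocHost(&hptr, 1 << 20);   /* pinned host (CPU) buffer   */

    classify("device buffer", dptr);
    classify("host buffer", (CUdeviceptr)(uintptr_t)hptr);

    cuMemFreeHost(hptr);
    cuMemFree(dptr);
    cuCtxDestroy(ctx);
    return 0;
}
```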
Back to your use case: since you are in user space, you do not have direct access to the system's physical address space, and the addresses you are using are most likely virtual addresses handed to you by nVidia's UVA. Assuming no previous allocations were made, your first allocation should land at offset +0x00000000, which is why you see the same offset reported for the GPU. If you allocated a second buffer, I suspect you would see it start right after the end of the first one (at offset +0x00100000 from the GPU's base virtual address, in your case of 1 MB allocations).
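Here is a short runtime-API sketch that makes this concrete: it allocates two 1 MB buffers and prints their addresses. Whether the second buffer really begins exactly 0x100000 bytes after the first depends on the driver's allocator, so treat the printed offset as illustrative rather than guaranteed.

```c
/* Sketch: allocate two 1 MB buffers and print their UVA addresses.
 * The offset between them depends on the driver's allocator. */
#include <cuda_runtime.h>
#include <stdio.h>

int main(void)
{
    void *buf1 = NULL, *buf2 = NULL;
    const size_t size = 1 << 20;   /* 1 MB */

    cudaMalloc(&buf1, size);
    cudaMalloc(&buf2, size);

    printf("buf1 = %p\n", buf1);
    printf("buf2 = %p (offset from buf1: %td bytes)\n",
           buf2, (char *)buf2 - (char *)buf1);

    cudaFree(buf2);
    cudaFree(buf1);
    return 0;
}
```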
However, if you were in kernel space, writing a driver for your company's card to utilize RDMA for GPUDirect, you would use the 32- or 64-bit physical addresses assigned to the GPU by the system's BIOS and/or OS to RDMA data directly to and from the GPU itself.
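As a rough illustration of where those physical addresses come from, here is a minimal Linux kernel-module sketch that locates a card on the PCI bus and reads the physical base of its BAR0 via pci_resource_start(). Taking the first nVidia device that matches is purely for illustration; a real third-party driver would identify the exact devices it cares about.

```c
/* Kernel-space sketch: find an nVidia device and read the physical base
 * address of its BAR0 straight from PCI configuration space. */
#include <linux/module.h>
#include <linux/pci.h>

static int __init bar_demo_init(void)
{
    struct pci_dev *pdev;
    resource_size_t bar0_phys, bar0_len;

    /* Look up the first nVidia device without binding a driver to it. */
    pdev = pci_get_device(PCI_VENDOR_ID_NVIDIA, PCI_ANY_ID, NULL);
    if (!pdev)
        return -ENODEV;

    bar0_phys = pci_resource_start(pdev, 0);   /* physical BAR0 base  */
    bar0_len  = pci_resource_len(pdev, 0);     /* BAR0 aperture size  */

    dev_info(&pdev->dev, "BAR0 phys=0x%llx len=0x%llx\n",
             (unsigned long long)bar0_phys,
             (unsigned long long)bar0_len);

    pci_dev_put(pdev);   /* drop the reference taken by pci_get_device */
    return 0;
}

static void __exit bar_demo_exit(void) { }

module_init(bar_demo_init);
module_exit(bar_demo_exit);
MODULE_LICENSE("GPL");
```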
Additionally, it may be worth noting that not all DMA engines actually support virtual addresses for transfers; in fact, most require physical addresses, since handling virtual addressing from within a DMA engine can get complicated (p. 7), and many DMA engines therefore lack support for it.
To answer the question in your post's title: nVidia currently only supports physical addressing for RDMA + GPUDirect in kernel space. For user-space applications, you will always be using the GPU's virtual address given to you by nVidia's UVA, which lives in the CPU's virtual address space.
In relation to your application, here is a simplified breakdown of the process you can follow for RDMA operations:
- Your user-space application creates buffers, which fall within the Unified Virtual Addressing space that nVidia provides (virtual addresses).
- Call cuPointerGetAttribute(...) to get P2P tokens; these tokens pertain to memory inside the CUDA context.
- Send all of this information to kernel space somehow (e.g. IOCTLs, reads/writes to your driver, etc.). At a minimum, you will want these three things to end up in your kernel-space driver:
  - The P2P token(s) returned by cuPointerGetAttribute(...)
  - The UVA virtual address(es) of the buffer(s)
  - The size(s) of the buffer(s)
- Now translate those virtual addresses into their corresponding physical addresses by calling the nVidia kernel-space functions, since these addresses are held in nVidia's page tables and can be accessed with the functions nVidia has exported, such as nvidia_p2p_get_pages(...), nvidia_p2p_put_pages(...), and nvidia_p2p_free_page_table(...).
- Use the physical addresses acquired in the previous step to initialize your DMA engine, which will be manipulating those buffers (see the sketches after this list).
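To make the walkthrough concrete, here is a hedged sketch of the user-space half (steps 1-3). It allocates a 1 MB buffer and pulls out the P2P tokens with cuPointerGetAttribute(...) and CU_POINTER_ATTRIBUTE_P2P_TOKENS; how the tokens, UVA address, and size actually reach your driver (the ioctl path) is not shown and is up to you.

```c
/* User-space sketch of steps 1-3: allocate a buffer in the UVA space
 * and retrieve its P2P tokens. Passing them to the driver is omitted. */
#include <cuda.h>
#include <stdio.h>

int main(void)
{
    CUdevice dev;
    CUcontext ctx;
    CUdeviceptr buf;
    CUDA_POINTER_ATTRIBUTE_P2P_TOKENS tokens;
    const size_t size = 1 << 20;   /* 1 MB */

    cuInit(0);
    cuDeviceGet(&dev, 0);
    cuCtxCreate(&ctx, 0, dev);
    cuMemAlloc(&buf, size);

    /* The tokens identify this memory within the CUDA context. */
    if (cuPointerGetAttribute(&tokens, CU_POINTER_ATTRIBUTE_P2P_TOKENS,
                              buf) != CUDA_SUCCESS) {
        fprintf(stderr, "failed to get P2P tokens\n");
        return 1;
    }

    /* Hand tokens.p2pToken, tokens.vaSpaceToken, the UVA address 'buf'
     * and 'size' to your kernel driver here (e.g. via an ioctl). */
    printf("p2pToken=0x%llx vaSpaceToken=0x%x uva=0x%llx size=%zu\n",
           tokens.p2pToken, tokens.vaSpaceToken,
           (unsigned long long)buf, size);

    cuMemFree(buf);
    cuCtxDestroy(ctx);
    return 0;
}
```

And a condensed sketch of the kernel-space half (steps 4-5), assuming those three values arrived through your driver (for example, an ioctl handler). The structure and function declarations come from nv-p2p.h, which ships with the nVidia driver sources; error handling and the actual DMA programming are abbreviated.

```c
/* Kernel-space sketch of steps 4-5: translate a GPU UVA range into
 * physical pages and hand them to a DMA engine. */
#include <linux/kernel.h>
#include <linux/types.h>
#include "nv-p2p.h"   /* shipped with the nVidia driver sources */

static void free_callback(void *data)
{
    /* Invoked by the nVidia driver if the mapping is revoked; a real
     * driver would quiesce DMA and call nvidia_p2p_free_page_table(). */
}

static int map_gpu_buffer(u64 p2p_token, u32 va_space, u64 uva, u64 len)
{
    struct nvidia_p2p_page_table *page_table = NULL;
    u32 i;
    int ret;

    /* Translate the GPU UVA range into physical pages. */
    ret = nvidia_p2p_get_pages(p2p_token, va_space, uva, len,
                               &page_table, free_callback, NULL);
    if (ret)
        return ret;

    /* Each entry holds the physical address of one GPU page; the page
     * granularity is indicated by page_table->page_size. These are the
     * addresses you would program your DMA engine with. */
    for (i = 0; i < page_table->entries; i++)
        pr_info("GPU page %u at phys 0x%llx\n", i,
                (unsigned long long)page_table->pages[i]->physical_address);

    /* Release the mapping once DMA has completed. */
    nvidia_p2p_put_pages(p2p_token, va_space, uva, page_table);
    return 0;
}
```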
A more detailed explanation of this process can be found here.