How is system RAM mapped for GPU access?

How is system memory (RAM) mapped so that the GPU can access it? I understand how virtual memory works for a CPU, but I'm not sure how it works when the GPU accesses system (host) memory. Basically, it has something to do with how data is copied from system memory to GPU memory and vice versa. Can you please provide explanations supported by reference articles?

io architecture gpu hardware computer-architecture
1 answer

I found the following slideshow quite useful: http://developer.amd.com/afds/assets/presentations/1004_final.pdf

Memory System on Fusion APUs: The Benefits of Zero Copy. Pierre Boudier, AMD Fellow, OpenGL/OpenCL; Graham Sellers, AMD Manager, OpenGL.

AMD Fusion Developer Summit June 2011

Remember, however, that this is a fast-moving area. Not so much the development of new concepts as the (finally!) application of concepts such as virtual memory to GPUs. Let me summarize.

In the old days, say before 2010, GPUs were usually separate PCI or PCIe cards or boards. They had some DRAM on the GPU board. That on-board DRAM is pretty fast. They could also access DRAM on the CPU side, typically via DMA copy engines across PCIe. GPU access to CPU memory done this way is usually quite slow.
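To make the copy path above concrete, here is a toy model (not real driver code; every name in it is invented for illustration): data in pageable host RAM is first copied into a small pinned staging buffer, then moved across the bus into on-board GPU DRAM, so every byte crosses two hops.

```python
def copy_host_to_gpu(host_buf, pinned_staging, gpu_dram, chunk=4):
    """Copy host_buf into gpu_dram via a small pinned staging buffer.
    Returns the total number of bytes actually moved."""
    moved = 0
    for off in range(0, len(host_buf), chunk):
        piece = host_buf[off:off + chunk]
        pinned_staging[:len(piece)] = piece                 # CPU fills the pinned buffer
        gpu_dram[off:off + len(piece)] = pinned_staging[:len(piece)]  # "DMA" over the bus
        moved += 2 * len(piece)                             # each byte crosses two hops
    return moved

host = bytearray(b"hello gpu!")
staging = bytearray(4)              # pinned staging buffers are typically small
gpu = bytearray(len(host))
total = copy_host_to_gpu(host, staging, gpu)
assert gpu == host
assert total == 2 * len(host)       # every byte was moved twice
```

The doubled byte count is the point: the classic discrete-GPU path pays for the data twice before the GPU ever sees it.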

GPU memory was not paged. For that matter, GPU memory was usually not cached, except for software-managed caches inside the GPU, such as texture caches. "Software-managed" means that these caches are not cache-coherent and must be flushed manually.
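A minimal sketch of what "software-managed" means in practice (purely illustrative, not any real GPU's cache design): reads can return stale data until software explicitly invalidates the cache, because no hardware keeps it coherent.

```python
class SoftwareManagedCache:
    """A non-coherent cache: hardware never snoops or updates it."""
    def __init__(self, backing):
        self.backing = backing      # e.g. GPU DRAM
        self.lines = {}             # cached copies, NOT kept coherent

    def read(self, addr):
        if addr not in self.lines:          # miss: fill from backing store
            self.lines[addr] = self.backing[addr]
        return self.lines[addr]             # hit: may be stale!

    def invalidate(self):
        self.lines.clear()                  # the manual "flush"

dram = {0x10: "old texel"}
cache = SoftwareManagedCache(dram)
assert cache.read(0x10) == "old texel"

dram[0x10] = "new texel"                    # someone writes behind the cache's back
assert cache.read(0x10) == "old texel"      # stale: no hardware coherence
cache.invalidate()
assert cache.read(0x10) == "new texel"      # correct only after the manual flush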

As a rule, CPU access to the GPU went through an aperture. That aperture was typically pinned, i.e. not pageable. Usually it was not even translated through virtual addresses: typically virtual address = physical address + possibly some offset.
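The fixed aperture mapping can be sketched in a few lines. All the constants below are invented for illustration; the point is that there is no page walk at all, just a bounds check and a constant offset.

```python
APERTURE_BASE_VA = 0xD000_0000          # where the aperture appears to software
APERTURE_BASE_PA = 0x8000_0000          # physical start of the GPU memory region
APERTURE_SIZE    = 256 * 1024 * 1024    # fixed, pinned region

def aperture_va_to_pa(va):
    """Translate an aperture address: no page table, just an offset."""
    if not (APERTURE_BASE_VA <= va < APERTURE_BASE_VA + APERTURE_SIZE):
        raise ValueError("address outside the fixed, pinned aperture")
    return va - APERTURE_BASE_VA + APERTURE_BASE_PA

assert aperture_va_to_pa(0xD000_0000) == 0x8000_0000
assert aperture_va_to_pa(0xD000_1234) == 0x8000_1234
```

Contrast this with a real page table: there is no per-page indirection here, so the region can never be paged out or remapped, which is exactly why it had to be pinned.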

(Of course, the rest of the CPU memory is proper virtual memory: pageable, certainly translated, and cached. The GPU just cannot access it safely, because the GPU does not (did not) have access to the virtual memory subsystem and the cache-coherence system.)

Now, the above works, but it's a pain. Operating on something first on the CPU and then on the GPU is slow. Error-prone. And also a big security risk: user-provided GPU code could often access (slowly and insecurely) all of CPU DRAM, so it could be exploited by malware.

AMD announced its intention to integrate GPUs and CPUs ever more closely. One of the first steps was the creation of the Fusion APUs, chips containing both a CPU and a GPU. (Intel did something similar with Sandy Bridge, and I expect ARM vendors to do the same.)

AMD also announced that it intends to have the GPU use the virtual memory subsystem, and use caches.

A step in the direction of having the GPU use virtual memory is the AMD IOMMU. Intel has something similar. However, IOMMUs are more oriented toward virtual machines than toward virtual memory for non-virtual-machine OSes.

Systems in which the CPU and GPU are on the same chip typically have the CPU and the GPU access the same DRAM chips. So there is no longer a distinction between on-GPU-board DRAM and off-GPU-board (CPU) DRAM.

But usually there is still a partition of the DRAM on the system motherboard into memory mainly used by the CPU and memory mainly used by the GPU. Even though the memory may live in the same DRAM chips, typically a big chunk is "graphics". In the paper above it is called "Local" memory, for historical reasons. CPU and graphics memory may be configured differently: typically the GPU memory is lower priority, except for video refresh, and has longer bursts.

The paper I refer to describes different internal buses: "Onion" for "system" memory, and "Garlic" for faster access to the graphics memory partition. Garlic memory is typically not cached.
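The routing decision can be sketched as a toy dispatcher (the bus names come from the slides; the bandwidth figures and the policy details are invented placeholders, not AMD's actual design): accesses that must stay coherent with CPU caches take the snooped Onion path, while uncached graphics-partition accesses take the faster Garlic path.

```python
BUSES = {
    "onion":  {"coherent": True,  "rel_bandwidth": 1.0},   # snooped, slower
    "garlic": {"coherent": False, "rel_bandwidth": 3.0},   # uncached, faster
}

def pick_bus(buffer_is_graphics_local, needs_coherence):
    """Choose a bus: coherence forces Onion; local graphics memory uses Garlic."""
    if needs_coherence:
        return "onion"          # must be snooped by CPU caches
    if buffer_is_graphics_local:
        return "garlic"         # uncached, high-bandwidth path
    return "onion"

assert pick_bus(buffer_is_graphics_local=True,  needs_coherence=False) == "garlic"
assert pick_bus(buffer_is_graphics_local=True,  needs_coherence=True)  == "onion"
assert pick_bus(buffer_is_graphics_local=False, needs_coherence=False) == "onion"
```

The trade-off the sketch encodes is the real one: you buy bandwidth by giving up coherence, which is why the graphics partition is typically not cached.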

The paper also talks about how the CPU and GPU have different page tables. Their subtitle, "the benefits of zero copy", refers to mapping a CPU data structure into the GPU page tables, so that you don't need to copy it.
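A toy model of that zero-copy idea (page tables are plain dicts here, and all the numbers are made up): instead of copying a CPU buffer into GPU memory, add an entry to the GPU's page table pointing at the same physical frame.

```python
physical_memory = {}     # frame number -> data
cpu_page_table = {}      # CPU virtual page -> frame
gpu_page_table = {}      # GPU virtual page -> frame

def cpu_alloc(vpage, frame, data):
    """Allocate a physical frame and map it into the CPU page table."""
    physical_memory[frame] = data
    cpu_page_table[vpage] = frame

def map_into_gpu(cpu_vpage, gpu_vpage):
    """Zero copy: the GPU mapping reuses the CPU page's physical frame."""
    gpu_page_table[gpu_vpage] = cpu_page_table[cpu_vpage]

cpu_alloc(vpage=7, frame=42, data=[1.0, 2.0, 3.0])
map_into_gpu(cpu_vpage=7, gpu_vpage=3)

# Both processors see the very same storage; no bytes were copied.
assert gpu_page_table[3] == cpu_page_table[7] == 42
physical_memory[42][0] = 9.0                          # CPU writes...
assert physical_memory[gpu_page_table[3]][0] == 9.0   # ...GPU sees it immediately
```

Note that the two sides may use different virtual page numbers (7 vs 3); what they share is the physical frame, which is exactly what makes the copy unnecessary.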

Etc., etc.

This area of system architecture is evolving rapidly, so the 2011 paper is already almost out of date. But you should note the trends:

(a) software WANTS uniform access to CPU and GPU memory: virtual memory and cacheable,

but

(b) although the hardware tries to provide (a), special graphics memory features almost always make dedicated graphics memory, even if it is just a partition of the same DRAM, significantly faster or more power-efficient.

The gap may be narrowing, but just when you think it is about to go away, another hardware trick gets played.

