Memory alignment

I understand why memory should be aligned to 4 or 8 bytes, based on the data width of the bus. But the following statement puzzles me:

"IoDrive requires that all I/O operations performed on the device using O_DIRECT must be 512-byte aligned and a multiple of 512 bytes."

Why does the address need to be aligned to 512 bytes?

+4
3 answers

Blanket claims that blame DMA for large buffer alignment restrictions are incorrect.

DMA hardware transfers are usually aligned on 4- or 8-byte boundaries, since the PCI bus can physically transfer 32 or 64 bits at a time. Beyond this basic alignment, DMA hardware is designed to work with any address it is given.

However, the hardware deals with physical addresses, while the OS deals with virtual memory addresses (a protected-mode construct on x86 CPUs). This means that a buffer that is contiguous in the process address space may not be contiguous in physical RAM. Unless care is taken to create physically contiguous buffers, a DMA transfer must be split at VM page boundaries (usually 4K, possibly 2M).
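The splitting described above can be sketched as follows. This is only an illustration of the arithmetic, not how any particular kernel or driver does it; the function name `split_at_page_boundaries` and the 4K page size are assumptions made for the example.

```python
PAGE_SIZE = 4096  # assumed 4K pages; huge pages would be 2 MiB


def split_at_page_boundaries(addr, length, page_size=PAGE_SIZE):
    """Split [addr, addr + length) into chunks that never cross a page boundary."""
    chunks = []
    end = addr + length
    while addr < end:
        # First address of the page following the one that contains addr.
        next_boundary = (addr // page_size + 1) * page_size
        chunk_end = min(end, next_boundary)
        chunks.append((addr, chunk_end - addr))
        addr = chunk_end
    return chunks


# A 512-byte buffer that starts 256 bytes before a page boundary
# has to be handled as two separate segments.
print(split_at_page_boundaries(0x1F00, 512))

# The same buffer starting exactly on a page boundary stays in one piece.
print(split_at_page_boundaries(0x2000, 512))
```

Each returned chunk lies within a single virtual page, so each could map to a different physical page without the transfer itself crossing a page boundary.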

As for the claim that buffers need to be aligned to the disk sector size, this is completely wrong; the DMA hardware is entirely oblivious to the physical sector size of the hard disk.

Under Linux 2.4, O_DIRECT required 4K alignment; under 2.6 this was relaxed to 512B. In either case, it was probably a design decision to prevent single-sector updates from crossing VM page boundaries and therefore requiring split DMA transfers. (An arbitrary 512B buffer has roughly a 1/8 chance of crossing a 4K page boundary: 511 of the 4096 possible start offsets within a page.)
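The chance of a sector-sized buffer straddling a page boundary can be checked by simply counting start offsets within one page. This is a small sketch assuming 4K pages and 512-byte sectors; the helper names are made up for the illustration.

```python
from fractions import Fraction

PAGE_SIZE = 4096    # assumed VM page size
SECTOR_SIZE = 512   # assumed sector size


def crosses_page(offset, length=SECTOR_SIZE, page_size=PAGE_SIZE):
    """True if a buffer of `length` bytes starting at `offset` spills into the next page."""
    return offset % page_size + length > page_size


def crossing_fraction(alignment):
    """Fraction of start offsets (multiples of `alignment`) where a sector-sized buffer crosses."""
    offsets = range(0, PAGE_SIZE, alignment)
    return Fraction(sum(crosses_page(o) for o in offsets), len(offsets))


print(crossing_fraction(1))    # byte-aligned start addresses
print(crossing_fraction(4))    # 4-byte-aligned start addresses
print(crossing_fraction(512))  # 512-byte-aligned start addresses
```

Under these assumptions the count comes out close to 1/8 for byte- or word-aligned buffers and exactly zero for 512-byte-aligned ones, which is the case the alignment rule protects.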

So, although the OS is to blame rather than the hardware, we can see why page-aligned buffers are more efficient.

Edit: Of course, if we are writing large buffers anyway (100KB), then the number of VM page boundaries crossed will be almost the same whether or not we are aligned to 512B. So the main case being optimized by 512B alignment is single-sector transfers.
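To put numbers on this, the following sketch counts the page boundaries that fall inside a buffer for a few start addresses. The 4K page size and the helper name `boundaries_crossed` are assumptions for the example.

```python
def boundaries_crossed(addr, length, page_size=4096):
    """Number of page boundaries that fall strictly inside [addr, addr + length)."""
    return (addr + length - 1) // page_size - addr // page_size


LARGE = 100 * 1024  # the 100KB buffer from the text

# Large buffer: page-aligned, 512-byte-aligned, and arbitrarily aligned starts.
for start in (0, 512, 100):
    print(start, boundaries_crossed(start, LARGE))

# Single sector: an unlucky arbitrary start versus a 512-byte-aligned one.
print(boundaries_crossed(4000, 512), boundaries_crossed(3584, 512))
```

For the large buffer the counts differ by at most one (24 versus 25 here), while for a single sector the alignment decides whether the transfer is split at all.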

+5

Typically, large alignment requirements like this are due to the underlying DMA hardware. Large block transfers can sometimes be performed much faster by imposing much stronger alignment restrictions than the one you have here.

On several ARM processors, the first-level translation table has to be aligned on a 16 KB boundary!

+4

If you do not know what you are doing, do not use O_DIRECT.

O_DIRECT means "direct access to the device." It bypasses all OS caches and goes straight to the disk (or possibly a RAID controller, etc.). Disk accesses are performed on a per-sector basis.

EDIT: The alignment requirement applies to the offset and size of the I/O; it is usually not a memory alignment requirement.

EDIT: If you are looking at this page (it is the only hit), it also says that the memory must be page-aligned.
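Putting the two requirements together (sector-aligned offset and size, plus an aligned memory buffer), a rough Python sketch might look like this. It assumes Linux, a 512-byte sector size, and a filesystem that actually supports O_DIRECT; `read_direct`, `is_sector_aligned` and `buffer_address` are hypothetical helper names, and an anonymous `mmap` is used only as a convenient way to get a page-aligned buffer.

```python
import ctypes
import mmap
import os

SECTOR_SIZE = 512  # assumed sector size; the real requirement depends on the device


def is_sector_aligned(value, sector_size=SECTOR_SIZE):
    """True if an offset, size, or address is a multiple of the sector size."""
    return value % sector_size == 0


def buffer_address(buf):
    """Virtual address of a writable buffer, used here only to inspect its alignment."""
    return ctypes.addressof(ctypes.c_char.from_buffer(buf))


def read_direct(path, offset, size):
    """Read `size` bytes at `offset` using O_DIRECT (Linux only)."""
    if size <= 0 or not (is_sector_aligned(offset) and is_sector_aligned(size)):
        raise ValueError("offset and size must be positive multiples of the sector size")
    buf = mmap.mmap(-1, size)  # anonymous mappings start on a page boundary
    fd = os.open(path, os.O_RDONLY | os.O_DIRECT)
    try:
        n = os.preadv(fd, [buf], offset)  # read straight into the aligned buffer
        return buf[:n]
    finally:
        os.close(fd)
        buf.close()


# An anonymous mapping is page-aligned, and therefore also 512-byte aligned.
scratch = mmap.mmap(-1, 4096)
print(buffer_address(scratch) % mmap.PAGESIZE == 0)
print(is_sector_aligned(buffer_address(scratch)))
```

A call such as `read_direct("/dev/sdX", 4096, 512)` (the device path is a placeholder) would then satisfy all three constraints; on filesystems without O_DIRECT support the `os.open` call is expected to fail instead.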

0
