CPU and data alignment

Question

Sorry if you feel this has been answered many times, but I need answers to the following questions!

1. Why should data be aligned (on 2-byte / 4-byte / 8-byte boundaries)? My doubt is this: if the processor has address lines Ax Ax-1 Ax-2 ... A2 A1 A0, it can address every memory cell in sequence, so why is it necessary to align data on specific boundaries?
2. How do I find the alignment requirements when I compile my code and generate an executable?
3. If, for example, data is aligned on a 4-byte boundary, does this mean that each consecutive byte is located at an offset that is a multiple of 4? In other words, if the data is 4-byte aligned and a byte is at 1004, is the next byte at 1008 (or at 1005)?

+13

c alignment cpu-architecture processor

MS. Jun 11 '10 at

8 answers

Very little data "has" to be aligned. Rather, certain types of data may perform better when aligned, and certain processor operations require a particular data alignment.

First of all, let's say you read 4 bytes of data at a time. Let's also say that your processor has a 32-bit data bus, and that your data is stored at byte 2 in system memory.

Now, since you can load 4 bytes of data at once, it doesn't make much sense for your address register to point to one byte. By making the address register point to every 4th byte, you can address 4 times as much data. In other words, your processor may only be able to read data starting at bytes 0, 4, 8, 12, 16, etc.

So here is the problem. If you want data that starts at byte 2 and you read 4 bytes, then half of your data will be in word position 0 (bytes 0-3) and the other half in word position 1 (bytes 4-7).

So basically you would end up going to memory twice to read your one 4-byte data element. Some processors do not support this kind of operation (or force you to load and combine the two results manually).
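
To make the two-read problem concrete, here is a minimal C sketch of a memory that only hands back whole, aligned 32-bit words. The names `read_aligned_word` and `read_u32_at` are made up for illustration, and the model assumes a little-endian layout. Real hardware does this combining in circuitry (or not at all), so treat it as a picture of the idea rather than of any particular CPU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: memory that can only be read one aligned
   32-bit word at a time, like the 32-bit bus described above. */
static uint32_t read_aligned_word(const uint32_t *mem, size_t word_index)
{
    return mem[word_index];
}

/* Read 4 bytes starting at an arbitrary byte address, assuming a
   little-endian layout (byte 0 is the low byte of word 0).
   *bus_reads reports how many aligned word reads were needed. */
static uint32_t read_u32_at(const uint32_t *mem, size_t byte_addr, int *bus_reads)
{
    assert(mem != NULL && bus_reads != NULL);

    size_t word = byte_addr / 4;                    /* which aligned word */
    unsigned shift = (unsigned)(byte_addr % 4) * 8; /* bit offset inside it */

    uint32_t lo = read_aligned_word(mem, word);
    if (shift == 0) {
        *bus_reads = 1;          /* aligned: one trip to memory */
        return lo;
    }

    /* Misaligned: fetch the neighbouring word and stitch the halves. */
    uint32_t hi = read_aligned_word(mem, word + 1);
    *bus_reads = 2;
    return (lo >> shift) | (hi << (32 - shift));
}
```

Calling `read_u32_at` with byte address 4 needs one word read, while byte address 2 needs two reads plus the shift-and-OR step, which is the extra work described above.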

Go here for more details: http://en.wikipedia.org/wiki/Data_structure_alignment

+6

Timothy Baldridge Jun 11 '10 at 18:22

1.) Some architectures do not have this requirement at all, some encourage alignment (there is a speed penalty when accessing data items that are not aligned), and some enforce it strictly (misalignment causes a processor exception).
Many of today's popular architectures fall into the speed-penalty category. CPU designers had to make a trade-off between flexibility / performance and cost (silicon area / number of control signals needed for bus cycles).

2.) What language, what architecture? Consult your compiler's manual and / or the processor architecture documentation.

3.) Again, this is entirely architecture dependent (some architectures may not allow access to byte-sized items at all, or have bus widths that are not even a multiple of 8 bits). So unless you ask about a specific architecture, you will not get any useful answers.
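
Since the question is tagged C: assuming a C11 compiler, one way to see what the compiler and target actually require is to ask with `alignof` from `<stdalign.h>`. This is only a sketch; `print_alignments` is a made-up helper, and the printed numbers will differ between compilers and targets.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: print what this compiler and target require. */
static void print_alignments(void)
{
    printf("char:        %zu\n", alignof(char));
    printf("short:       %zu\n", alignof(short));
    printf("int:         %zu\n", alignof(int));
    printf("long long:   %zu\n", alignof(long long));
    printf("double:      %zu\n", alignof(double));
    printf("void *:      %zu\n", alignof(void *));
    printf("max_align_t: %zu\n", alignof(max_align_t));
}

/* These two facts hold on every conforming implementation;
   everything else is up to the target and its ABI. */
static_assert(alignof(char) == 1, "char can live at any address");
static_assert(alignof(max_align_t) >= alignof(long long),
              "max_align_t is at least as strict as any basic type");
```

Calling `print_alignments()` from `main` and building for different targets shows how the answers change with the architecture.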

+4

Durandal Jun 11 '10 at 18:28

In general, one answer to all three questions: "it depends on your system." A few more details:

1. Your memory system may not be byte-addressable. Beyond that, you may incur a performance penalty for having your processor access unaligned data. Some processors (older ARM chips, for example) simply cannot do it at all.
2. Read the manual for your processor and whatever ABI specification your code is being generated for.
3. Usually, when people say data is at a certain alignment, that refers only to the first byte. So if the ABI specification says “data structure X must be 4-byte aligned”, it means that X must be placed in memory at an address that is divisible by 4. Nothing is implied by that statement about the size or internal layout of structure X.
As for your specific example, if the data is 4-byte aligned and starts at address 1004, the next byte will be at 1005.
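
The point that alignment applies only to the first byte can be checked with a small C11 sketch. `is_aligned` and `buffer` are hypothetical names, and the code assumes a flat address space where converting a pointer to `uintptr_t` yields its byte address (true on mainstream platforms, but implementation-defined in the standard).

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

/* Returns nonzero if p sits on an `alignment`-byte boundary.
   Assumes pointer-to-integer conversion gives the byte address. */
static int is_aligned(const void *p, uintptr_t alignment)
{
    assert(alignment != 0);
    return (uintptr_t)p % alignment == 0;
}

/* Hypothetical object whose FIRST byte must be on a 4-byte boundary.
   The bytes after it simply follow at consecutive addresses. */
static alignas(4) unsigned char buffer[8];
```

Here `is_aligned(buffer, 4)` holds, while `&buffer[1]` is exactly one byte further on (the "1005" case), not four.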

+2

Carl Norum Jun 11 '10 at 18:18

It completely depends on the processor you use!

Some architectures deal only in 32 (or 36!) bit words, and you need special instructions to load single characters or half words.

Some processors (notably PowerPC and other IBM RISC chips) do not care about alignment and will load integers from odd addresses.

For most modern architectures you need to align integers to word boundaries and long integers to double-word boundaries. This simplifies the circuitry for register loads and speeds things up ever so slightly.

+2

James Anderson Oct 11 '10 at 1:54

Data alignment is needed to improve processor performance. Intel's website gives detailed information on how to align data in memory:

Aligning data when migrating to 64-bit Intel® architecture

One of these is the alignment of data items: their location in memory relative to addresses that are multiples of four, eight, or 16 bytes. Under Intel's 16-bit architecture, data alignment had little effect on performance, and its use was entirely optional. Under IA-32, aligning data correctly can be an important optimization, although its use is still optional, with a very few exceptions where correct alignment is mandatory. The 64-bit environment, however, imposes more stringent requirements on data items. Misaligned objects cause program exceptions. For an item to be aligned correctly, it must fulfill the requirements imposed by the 64-bit Intel architecture (discussed shortly), plus those of the linker used to build the application.
The fundamental rule of data alignment is that the safest (and most widely supported) approach relies on what Intel terms "natural boundaries." These are the ones that occur when you round up the size of a data item to the next largest size of two, four, eight, or 16 bytes. For example, a 10-byte float should be aligned on a 16-byte address, whereas 64-bit integers should be aligned to an eight-byte address. Because this is a 64-bit architecture, pointers are all eight bytes wide, so they too should align on eight-byte boundaries.
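
The rounding rule in the paragraph above can be sketched in C as follows. `natural_boundary` is a made-up name, and restricting it to items of 1 to 16 bytes is my own assumption, chosen to match the sizes the quoted text talks about.

```c
#include <assert.h>
#include <stddef.h>

/* Round an item size up to the next power of two (1, 2, 4, 8 or 16),
   which is how the quoted text describes a "natural boundary".
   Assumption: only item sizes from 1 to 16 bytes are considered. */
static size_t natural_boundary(size_t size)
{
    assert(size >= 1 && size <= 16);

    size_t boundary = 1;
    while (boundary < size)
        boundary *= 2;
    return boundary;
}
```

With this sketch a 10-byte item maps to a 16-byte boundary and an 8-byte item to an 8-byte boundary, matching the two examples in the quoted text.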
It is recommended that all structures larger than 16 bytes be aligned at 16-byte boundaries. In general, for best performance, align the data as follows:
Align 8-bit data to any address
Align 16-bit data to be contained within an aligned four-byte word
Align 32-bit data so its base address is a multiple of four
Align 64-bit data so its base address is a multiple of eight
Align 80-bit data so that its base address is a multiple of sixteen.
Align 128-bit data so that its base address is a multiple of sixteen.
A structure or data array of 64 bytes or more in size should be aligned so that its base address is a multiple of 64. Sorting data in decreasing size order is one heuristic for assisting with natural alignment. As long as 16-byte boundaries (and cache lines) are never crossed, natural alignment is not strictly necessary, although it is an easy way to enforce adherence to the general alignment recommendations.
Aligning data correctly within structures can cause data bloat (due to the padding necessary to place fields correctly), so where necessary and possible, it is useful to reorganize structures so that the fields requiring the widest alignment come first in the structure. See the article "Preparing Code for the IA-64 Architecture (Code Cleanup)" for more on solving this problem.
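
The padding effect and the "widest alignment first" reordering can be seen with a small C sketch. Both struct definitions and the helper names are invented for illustration; the exact sizes depend on the compiler and ABI, so the code computes the padding instead of hard-coding it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record with the widest member sandwiched between bytes. */
struct scattered {
    uint8_t  flag;
    uint64_t id;     /* usually forces padding after `flag` */
    uint8_t  kind;
    uint32_t count;  /* usually forces padding after `kind` */
};

/* Same members, ordered from widest to narrowest alignment. */
struct sorted {
    uint64_t id;
    uint32_t count;
    uint8_t  flag;
    uint8_t  kind;
};

/* Bytes of real data in either struct: 1 + 8 + 1 + 4. */
#define PAYLOAD_BYTES \
    (2 * sizeof(uint8_t) + sizeof(uint64_t) + sizeof(uint32_t))

/* The first member never has padding in front of it. */
static_assert(offsetof(struct sorted, id) == 0, "first member starts the struct");

/* Padding bytes the compiler inserted into each layout. */
static size_t padding_in_scattered(void)
{
    return sizeof(struct scattered) - PAYLOAD_BYTES;
}

static size_t padding_in_sorted(void)
{
    return sizeof(struct sorted) - PAYLOAD_BYTES;
}
```

With a typical x86-64 compiler I would expect `sizeof(struct scattered)` to be 24 and `sizeof(struct sorted)` to be 16, but that is an ABI detail and worth printing on your own target.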

+1

Shen liang Aug 18 '13 at 0:57

For Intel architectures, Chapter 4, "Data Types", of the Intel 64 and IA-32 Architectures Software Developer's Manual answers your question 1.

+1

Jingguo Yao Dec 21 '13 at 2:56

"Now, since you can load 4 bytes of data at once, it doesn't make much sense for your address register to point to one byte."

Why? Why can't I read positions 1, 2, 3, 4 in one go? Surely this would neither degrade performance nor complicate the circuitry?

-1

spockwang Oct 11 2018-10-10T00:

Yann Ramin · Accepted Answer · 2010-06-11 18:21

CPUs are word oriented, not byte oriented. In a simple CPU, memory is generally configured to return one word (32 bits, 64 bits, etc.) per address strobe, where the bottom two (or more) address lines are generally don't-care bits.

Intel CPUs can perform accesses on non-word boundaries for many instructions, but there is a performance penalty, because internally the CPU performs two memory accesses and a math operation to load one word. If you are doing byte reads, no alignment applies.

Some CPUs (ARM, or Intel SSE instructions) require aligned memory and have undefined behavior when performing unaligned accesses (or throw an exception). They save significant silicon space by not implementing the much more complicated load/store subsystem.
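
In C, a common way to stay portable across CPUs that fault on unaligned accesses is to copy the bytes out with `memcpy` instead of dereferencing a misaligned pointer. The sketch below shows that idiom; `load_u32` is a made-up name, and the value is read in the host's native byte order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Load a 32-bit value from any byte address without ever forming a
   misaligned uint32_t pointer (dereferencing one is undefined in C).
   The result uses the host's native byte order. */
static uint32_t load_u32(const void *p)
{
    uint32_t value;

    assert(p != NULL);
    memcpy(&value, p, sizeof value); /* byte-wise copy, no alignment needed */
    return value;
}
```

On CPUs that tolerate unaligned loads, compilers commonly turn the `memcpy` into a single load instruction; on strict ones they emit whatever byte or word sequence is safe.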

The alignment depends on the CPU word size (16, 32, or 64 bits) or, in the case of SSE, the SSE register size (128 bits).

For your last question, if you are loading a single data byte at a time, there is no alignment restriction on most CPUs (some DSPs do not have byte-level instructions, but you are unlikely to run into one).
