Why is padding added for structures with several members, and not for a single member?

Question


Why is padding added only when a structure has several members, and not when it has a single member of a basic data type?

Consider a 32-bit machine:

struct { char a; } Y;

There is no padding, and sizeof Y is 1 byte.

Now consider this structure:

 struct { char a; int b; } X;

sizeof X will be 8 bytes.

My question is: why is padding added in the second case? If it is for efficient access on a machine that usually reads data in blocks that are a multiple of 4 bytes, why was there no padding in the first case?

+7

c++ c linux operating-system

Laavaa Oct 12 '12 at 14:47


4 answers

This is not just efficiency.

The problem is not the size of the access as such, but its alignment. On many machines, accessing misaligned data makes the program crash, and on typical machines today an int needs an address aligned on a four-byte boundary: accessing an int whose address is not aligned on a four-byte boundary will either slow the program down considerably or cause it to crash. Your first structure does not contain any data with alignment requirements, so no padding is needed. Your second one has an int, and the compiler must ensure that, given an array of them, every int is correctly aligned. This means that 1) the total size of the structure must be a multiple of four, and 2) the offset of the int in the structure must be a multiple of four. (Given the first requirement:

 struct S { char a; int b; char c; };

typically has a size of 12, with padding after both char members.)
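The two requirements above can be checked directly with offsetof and sizeof. The following is a minimal sketch, assuming a C11 compiler (for _Alignof); the function name print_layout_S is made up for illustration. The exact numbers depend on the platform, so the code only asserts the two alignment requirements, not the value 12.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof, size_t */
#include <stdio.h>

struct S { char a; int b; char c; };

/* Hypothetical helper: print where each member of struct S sits
   and how large the whole structure is on this platform. */
void print_layout_S(void)
{
    printf("offset of a: %zu\n", offsetof(struct S, a));
    printf("offset of b: %zu\n", offsetof(struct S, b));
    printf("offset of c: %zu\n", offsetof(struct S, c));
    printf("sizeof(struct S): %zu, alignment: %zu\n",
           sizeof(struct S), _Alignof(struct S));

    /* Requirement 2: the offset of the int is a multiple of its alignment. */
    assert(offsetof(struct S, b) % _Alignof(int) == 0);
    /* Requirement 1: the total size is a multiple of that alignment,
       so every element of an array of struct S keeps b aligned. */
    assert(sizeof(struct S) % _Alignof(int) == 0);
}
```

Calling print_layout_S() from main on a platform with a 4-byte, 4-byte-aligned int would be expected to report 3 bytes of padding after a and 3 more after c.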

In other languages, compilers often reordered structures so that the elements with the strictest alignment requirements came first. For struct S above, this would lead to:

 struct S { int b; char a; char c; };

and a size of 8 rather than 12. However, C and C++ prohibit this.

+3

James kanze Oct 12 '12 at 15:02


Padding is added to align certain types of data, that is, to ensure that data of a certain type has an address that is a multiple of some specified number. This varies between CPU models, but often 2-byte integers are aligned at addresses that are multiples of 2, and 4-byte integers at addresses that are multiples of 4. Characters usually do not need to be aligned.
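To see which alignments a particular compiler and CPU actually use, one can simply print them. This is a small sketch assuming a C11 compiler (for _Alignof); print_alignments is an illustrative name, and the values printed will differ between platforms.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: report size and required alignment of a few basic types. */
void print_alignments(void)
{
    printf("char:   size %zu, alignment %zu\n", sizeof(char),   _Alignof(char));
    printf("short:  size %zu, alignment %zu\n", sizeof(short),  _Alignof(short));
    printf("int:    size %zu, alignment %zu\n", sizeof(int),    _Alignof(int));
    printf("long:   size %zu, alignment %zu\n", sizeof(long),   _Alignof(long));
    printf("double: size %zu, alignment %zu\n", sizeof(double), _Alignof(double));

    /* char has the weakest alignment requirement: any address will do. */
    assert(_Alignof(char) == 1);
}
```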

So, if there is only one field in the structure, then as long as the structure itself is located at an address on a suitable boundary, no padding is needed. And it always is: the system always aligns blocks of memory to the largest boundary that could ever be needed, usually 4 or 8 bytes. A single member in a structure will therefore be on a suitable boundary. The problem arises only when there are several fields, because the length of one field may leave the next field off its required boundary. In your example, you have a char, which of course takes 1 byte, and an int, which takes 4. Suppose the structure is located at 0x1000. Without padding, the char would be placed at 0x1000 and the int at 0x1001. But ints are more efficient on 4-byte boundaries, so the compiler adds some pad bytes to push the int to the next such boundary, 0x1004. So now you have char (1 byte), padding (3 bytes), int (4 bytes), a total of 8 bytes.
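The walk-through above can be turned into a small layout simulator. This is a simplified model of what typical compilers do, not something the C standard prescribes; the names struct field, round_up and layout are invented for this sketch, and sizes and alignments are supplied by hand rather than taken from a real compiler.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one field: its size and required alignment. */
struct field { size_t size; size_t align; };

/* Round offset up to the next multiple of align (align must be > 0). */
size_t round_up(size_t offset, size_t align)
{
    return (offset + align - 1) / align * align;
}

/* Lay the fields out in declaration order.
   offsets[i] receives the offset of field i; the return value is the
   total size, rounded up to the strictest alignment among the fields. */
size_t layout(const struct field *fields, size_t n, size_t *offsets)
{
    size_t offset = 0;
    size_t max_align = 1;

    for (size_t i = 0; i < n; i++) {
        offset = round_up(offset, fields[i].align);  /* insert padding */
        offsets[i] = offset;
        offset += fields[i].size;
        if (fields[i].align > max_align)
            max_align = fields[i].align;
    }
    return round_up(offset, max_align);              /* trailing padding */
}
```

With a 1-byte char followed by a 4-byte int aligned to 4, this model places the int at offset 4 and returns a total of 8; with the char alone it returns 1, which matches the two structures in the question.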

In this case there is nothing you can do to improve the situation. Each structure will be aligned to a 4- or 8-byte boundary, so although the minimum is 5 bytes, in practice it will always be rounded up to 8. (sizeof reports the padding inside the structure, including trailing padding; any extra rounding that the allocator applies between separately allocated structures is not shown, but that memory is still lost.)

In other cases, you can minimize the number of padding bytes by reordering the fields. For example, suppose you had three chars and three ints. If you declare the structure as

 struct {char a; int b; char c; int d; char e; int f;}

then the compiler will add 3 bytes after each char to align the int that follows it. That gives char (1) + pad (3) + int (4) + char (1) + pad (3) + int (4) + char (1) + pad (3) + int (4) = 24.

But if you declared it like this:

 struct {char a; char c; char e; int b; int d; int f;}

then you get char (1) + char (1) + char (1) + pad (1) + int (4) + int (4) + int (4) = 16.
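The effect of reordering is easy to confirm with sizeof. A minimal sketch follows; the tags interleaved and grouped and the function compare_sizes are names chosen here for illustration, and the exact sizes depend on the platform's int size and alignment.

```c
#include <assert.h>
#include <stdio.h>

/* Same six members, declared in two different orders. */
struct interleaved { char a; int b; char c; int d; char e; int f; };
struct grouped     { char a; char c; char e; int b; int d; int f; };

/* Hypothetical helper: print both sizes so they can be compared. */
void compare_sizes(void)
{
    printf("interleaved: %zu bytes\n", sizeof(struct interleaved));
    printf("grouped:     %zu bytes\n", sizeof(struct grouped));

    /* Grouping the chars together should never need more padding. */
    assert(sizeof(struct grouped) <= sizeof(struct interleaved));
}
```

On a platform with a 4-byte, 4-byte-aligned int, the two sizes would be expected to match the 24 and 16 computed by hand.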

A few years ago I read a tip to always put the largest elements first to minimize padding, i.e. first the longs, then the ints, then the shorts, then the chars.

If you allocate thousands or millions of these structures, you can save a lot of memory with this technique. If you are going to allocate only one or two, it will not make much difference.

0

Jay Oct 12 '12 at 15:08


Padding is a consequence of alignment, which exists for efficiency: aligned data can be fetched by the processor in a single access from the address where it is stored. This does not necessarily mean that the processor cannot work without alignment; on such machines it is mainly about the speed of memory access. For an integer data type, the compiler applies this 4-byte alignment so that the processor can access the data more efficiently (on a 32-bit system).

In the case of char, only one byte is needed for the data, and every byte in memory is individually addressable, so no alignment is required. An int, however, needs 4 bytes, and the processor reads those 4 bytes most efficiently in one access when they start on a 4-byte boundary. The compiler therefore applies an alignment rule that places such data at suitable addresses, which makes it faster to access.

0

pradipta Oct 12 '12 at 15:27


Luchian grigore · Accepted Answer · 2012-10-12T14:48:51+0000

Padding is added in the second case because, on your computer, int is aligned to 4 bytes. Therefore, it must be located at an address that is divisible by 4.

 0x04 0x05 0x06 0x07 0x08 0x09 0x0A 0x0B
  a    b    b    b    b

If no padding were added, the int member would start at address 0x05, which is misaligned. With 3 padding bytes added:

 0x04 0x05 0x06 0x07 0x08 0x09 0x0A 0x0B
  a  |   padding    | b    b    b    b

Now the int is at 0x08, which is fine.
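The same picture can be observed on a real object by looking at the members' addresses. This is a sketch under a few assumptions: a C11 compiler, and a platform where converting a pointer to uintptr_t yields the numeric address (true on common systems, but implementation-defined). The struct tag X and the helper names are chosen here for illustration.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof */
#include <stdint.h>   /* uintptr_t */
#include <stdio.h>

struct X { char a; int b; };

/* Number of padding bytes the compiler placed between a and b. */
size_t padding_before_b(void)
{
    return offsetof(struct X, b) - sizeof(char);
}

/* Nonzero if the address of x->b is a multiple of int's alignment. */
int b_is_aligned(const struct X *x)
{
    return (uintptr_t)(const void *)&x->b % _Alignof(int) == 0;
}

/* Hypothetical helper: print the addresses of both members. */
void show_X(void)
{
    struct X x = { 'a', 42 };
    printf("a at %p, b at %p, padding between them: %zu bytes\n",
           (void *)&x.a, (void *)&x.b, padding_before_b());
    assert(b_is_aligned(&x));
}
```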
