Struct memory layout in C

Question

I have a C# background and am very new to low-level languages such as C.

In C#, the compiler decides a struct's memory layout by default. It may reorder the fields or insert padding between them, so I had to specify a special attribute to override this behavior and get an exact layout.

AFAIK, C does not reorder or align struct members by default. However, I have heard there is a slight exception, which is very hard to find information about.

What is C's memory layout behavior? What gets reordered or aligned, and what does not?

+73

c struct

Eonil May 01 '10 at 5:18 a.m.

4 answers

It is implementation-specific, but in practice the rule (in the absence of #pragma pack or the like) is:

Structure members are stored in the order in which they are declared. (This is required by the C99 standard, as mentioned here earlier.)
Where necessary, padding is added before a structure member to ensure correct alignment.
Each primitive type T typically requires an alignment of sizeof(T) bytes.
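
The three rules can be sketched as a small calculation. This is only an illustration: align_up and simulate_layout are hypothetical helper names, and the sketch assumes every member's alignment equals its size and that all sizes are powers of two, which is common in practice but not guaranteed by the C standard.

```c
#include <assert.h>
#include <stddef.h>

/* Round offset up to the next multiple of align (align must be a power of two). */
size_t align_up(size_t offset, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (offset + align - 1) & ~(align - 1);
}

/*
 * Hypothetical helper: applies the rules above to a list of member sizes,
 * ASSUMING alignment == size for every member. Stores each member's offset
 * in offsets[] and returns the total size including trailing padding.
 */
size_t simulate_layout(const size_t *sizes, size_t count, size_t *offsets)
{
    size_t offset = 0;
    size_t max_align = 1;

    for (size_t i = 0; i < count; i++) {
        size_t align = sizes[i];           /* assumption: alignment == size */
        offset = align_up(offset, align);  /* padding before this member    */
        offsets[i] = offset;
        offset += sizes[i];
        if (align > max_align)
            max_align = align;
    }
    /* Trailing padding keeps every element of an array of structs aligned. */
    return align_up(offset, max_align);
}
```

For member sizes 1, 2, 1, 8, 4 (in that order) this yields offsets 0, 2, 4, 8, 16 and a total size of 24. A real compiler may differ, so treat it as a model of the rule rather than a substitute for sizeof and offsetof.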

So, given the following structure:

    struct ST
    {
        char ch1;
        short s;
        char ch2;
        long long ll;
        int i;
    };

ch1 is at offset 0
1 padding byte is inserted to align...
s at offset 2
ch2 is at offset 4, immediately after s
3 padding bytes are inserted to align...
ll at offset 8
i is at offset 16, immediately after ll
4 padding bytes are added at the end so that the overall structure size is a multiple of 8. I checked this on a 64-bit system; 32-bit systems may allow structures to have 4-byte alignment.

So sizeof(ST) is 24.
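
To check these numbers on your own compiler rather than trusting the rule of thumb, the standard offsetof macro from <stddef.h> reports where each member actually landed. A minimal sketch (print_st_layout is just an illustrative name, and the values printed depend on the platform and ABI):

```c
#include <stddef.h>
#include <stdio.h>

struct ST {
    char ch1;
    short s;
    char ch2;
    long long ll;
    int i;
};

/* Prints the offsets and total size that this compiler actually chose. */
void print_st_layout(void)
{
    printf("ch1 @ %zu\n", offsetof(struct ST, ch1));
    printf("s   @ %zu\n", offsetof(struct ST, s));
    printf("ch2 @ %zu\n", offsetof(struct ST, ch2));
    printf("ll  @ %zu\n", offsetof(struct ST, ll));
    printf("i   @ %zu\n", offsetof(struct ST, i));
    printf("sizeof(struct ST) = %zu\n", sizeof(struct ST));
}
```

Calling print_st_layout() from main is enough to see the layout. Because offsetof is a constant expression, the same values can also be checked at compile time.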

This can be reduced to 16 bytes by rearranging elements to avoid padding:

    struct ST
    {
        long long ll; // @ 0
        int i;        // @ 8
        short s;      // @ 12
        char ch1;     // @ 14
        char ch2;     // @ 15
    } ST;
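
A quick way to see how much the reordering saves is to compare each struct's size with the sum of its member sizes. The sketch below uses the hypothetical names ST_declared and ST_reordered so both layouts can live in one file. The compile-time check needs C11, and the actual numbers depend on the platform.

```c
#include <stddef.h>

/* Members in the original declaration order. */
struct ST_declared {
    char ch1;
    short s;
    char ch2;
    long long ll;
    int i;
};

/* Same members, largest first. */
struct ST_reordered {
    long long ll;
    int i;
    short s;
    char ch1;
    char ch2;
};

/* Bytes occupied by the members themselves, without any padding. */
enum {
    ST_PAYLOAD = 2 * sizeof(char) + sizeof(short)
               + sizeof(long long) + sizeof(int)
};

/* Padding = total size minus payload. */
size_t padding_declared(void)  { return sizeof(struct ST_declared)  - ST_PAYLOAD; }
size_t padding_reordered(void) { return sizeof(struct ST_reordered) - ST_PAYLOAD; }

/* C11: fail the build if the reordered layout is ever larger. */
_Static_assert(sizeof(struct ST_reordered) <= sizeof(struct ST_declared),
               "reordering should not grow the struct");
```

Sorting members from largest to smallest alignment is a common heuristic for minimizing padding, though it is not the only ordering that can work.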

+101

dan04 May 01 '10 at 6:20

You can start by reading the Wikipedia article on data structure alignment to get a better understanding of data alignment.

From the Wikipedia article:

Data alignment means putting the data at a memory offset equal to some multiple of the word size, which increases the system's performance due to the way the CPU handles memory. To align the data, it may be necessary to insert some meaningless bytes between the end of the last data structure and the start of the next, which is data structure padding.

From 6.54.8 Structure-Packing Pragmas of the GCC documentation:

For compatibility with Microsoft Windows compilers, GCC supports a set of #pragma directives which change the maximum alignment of members of structures (other than zero-width bit-fields), unions, and classes subsequently defined. The n value below always is required to be a small power of two and specifies the new alignment in bytes.
#pragma pack(n) simply sets the new alignment.
#pragma pack() sets the alignment to the one that was in effect when compilation started (see also command-line option -fpack-struct[=n], see Code Gen Options).
#pragma pack(push[,n]) pushes the current alignment setting on an internal stack and then optionally sets the new alignment.
#pragma pack(pop) restores the alignment setting to the one saved at the top of the internal stack (and removes that stack entry). Note that #pragma pack([n]) does not influence this internal stack; thus it is possible to have #pragma pack(push) followed by multiple #pragma pack(n) instances and finalized by a single #pragma pack(pop).
Some targets, e.g. i386 and powerpc, support the ms_struct #pragma which lays out a structure as the documented __attribute__ ((ms_struct)).
#pragma ms_struct on turns on the layout for structures declared.
#pragma ms_struct off turns off the layout for structures declared.
#pragma ms_struct reset goes back to the default layout.
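
As a rough illustration of the push/pop behavior quoted above, the sketch below declares the same members under three settings. The struct names are made up for the example, and the sizes in the comments assume a typical 64-bit ABI with a 2-byte short, 4-byte int and 8-byte long long.

```c
#include <stddef.h>

/* Default (natural) alignment. */
struct natural { char ch1; short s; char ch2; long long ll; int i; };

#pragma pack(push)   /* save the current alignment setting */

#pragma pack(1)      /* maximum member alignment: 1 byte, so no padding at all */
struct packed_1 { char ch1; short s; char ch2; long long ll; int i; };

#pragma pack(2)      /* maximum member alignment: 2 bytes */
struct packed_2 { char ch1; short s; char ch2; long long ll; int i; };

#pragma pack(pop)    /* restore the setting saved by push */

/* Declared after the pop, so it gets the default layout again. */
struct natural_again { char ch1; short s; char ch2; long long ll; int i; };

/* Sizes under the stated assumptions: 16 for packed_1, 18 for packed_2. */
size_t size_packed_1(void) { return sizeof(struct packed_1); }
size_t size_packed_2(void) { return sizeof(struct packed_2); }
```

Note that pack(n) only caps alignment: with pack(2), s still sits at offset 2, but ll moves from offset 8 to offset 6. Wrapping the region in push and pop keeps the setting from leaking into headers included later.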

+9

jschmier May 01 '10 at 5:26 a.m.

In C, structures are laid out almost exactly as you specify in the code, similar to C#'s StructLayout.Sequential.

The only difference is member alignment. Alignment never reorders the members of a structure, but it can change the structure's size by inserting “pad” bytes in the middle of the structure. This is done so that each member starts on a boundary (usually 4 or 8 bytes).

For example:

 struct mystruct { int a; short int b; char c; };

The size of this structure is usually 8 bytes, not the 7 that the member sizes add up to. With a 4-byte int and a 2-byte short, a sits at offset 0, b at offset 4, and c at offset 6, and 1 byte of trailing padding rounds the size up to a multiple of the 4-byte alignment of int. Members are not widened to the size of the largest member: sizeof applied to the member c is still 1, while sizeof(struct mystruct) is 8.

It is hard to predict exactly how the compiler will pad and align a structure. Most pad and align by default, as explained above; some do not pad or align by default (this is sometimes called “packed”).

How to change this behavior is very compiler-dependent; nothing in the language specifies how it should be handled. In MSVC, you would use #pragma pack(1) to turn off alignment (the 1 means align everything on 1-byte boundaries). In GCC, you would use __attribute__((packed)) in the structure definition. See your compiler's documentation for what it does by default and how to change that behavior.
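
A minimal sketch of how the two spellings might be selected in one source file. The name mystruct_packed is made up for the example, the #else branch assumes a GCC-compatible compiler such as GCC or Clang, and the compile-time check needs C11.

```c
#include <stddef.h>

#if defined(_MSC_VER)
/* MSVC: packing is controlled with #pragma pack. */
#pragma pack(push, 1)
struct mystruct_packed {
    int a;
    short int b;
    char c;
};
#pragma pack(pop)
#else
/* Assumed GCC or Clang: packing is requested with an attribute. */
struct __attribute__((packed)) mystruct_packed {
    int a;
    short int b;
    char c;
};
#endif

/* With packing in effect there should be no padding at all. */
_Static_assert(sizeof(struct mystruct_packed)
                   == sizeof(int) + sizeof(short int) + sizeof(char),
               "packed struct should contain no padding");
```

Packing is mostly useful for matching an external binary format. Members of a packed structure may be misaligned, so taking a pointer to one and dereferencing it can be slow or can fault on some architectures.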

+3

SoapBox May 01 '10 at 5:26 a.m.

Potatoswatter · Accepted Answer · 2010-05-01 05:26

In C, the compiler is allowed to dictate some alignment for every primitive type. Typically the alignment is the size of the type, but it is entirely implementation-specific.

Padding bytes are introduced so that every object is properly aligned. Reordering is not allowed.

Probably every remotely modern compiler implements #pragma pack, which allows control over padding and leaves it to the programmer to comply with the ABI. (It is strictly nonstandard, though.)
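
Since the alignment of each type is up to the implementation, the simplest thing is to ask the compiler. The sketch below assumes a C11 compiler, where <stdalign.h> provides alignof. The struct sample and the function print_alignments are made-up names, and the printed values vary by platform.

```c
#include <stdalign.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical struct, used only to show that a struct has an alignment too. */
struct sample {
    char c;
    long long ll;
};

/* Prints size and alignment of a few types as chosen by this implementation. */
void print_alignments(void)
{
    printf("char          size %zu  align %zu\n", sizeof(char), alignof(char));
    printf("short         size %zu  align %zu\n", sizeof(short), alignof(short));
    printf("int           size %zu  align %zu\n", sizeof(int), alignof(int));
    printf("long long     size %zu  align %zu\n", sizeof(long long), alignof(long long));
    printf("double        size %zu  align %zu\n", sizeof(double), alignof(double));
    printf("struct sample size %zu  align %zu\n",
           sizeof(struct sample), alignof(struct sample));
}
```

Whatever the numbers turn out to be, a struct's size is a multiple of its alignment, which is why padding can appear at the end as well as between members.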

From C99 §6.7.2.1:

12 Each non-bit-field member of a structure or union object is aligned in an implementation-defined manner appropriate to its type.
13 Within a structure object, the non-bit-field members and the units in which bit-fields reside have addresses that increase in the order in which they are declared. A pointer to a structure object, suitably converted, points to its initial member (or if that member is a bit-field, then to the unit in which it resides), and vice versa. There may be unnamed padding within a structure object, but not at its beginning.
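
The guarantees in paragraph 13 can be checked directly. A small sketch, in which struct message and check_declaration_order are made-up names and the file-scope check needs C11:

```c
#include <assert.h>
#include <stddef.h>

struct message {
    int tag;         /* initial member */
    char flag;
    double payload;
};

/* No padding is allowed at the beginning, so the first member is at offset 0. */
_Static_assert(offsetof(struct message, tag) == 0,
               "initial member must be at offset 0");

void check_declaration_order(void)
{
    struct message m = {0};

    /* A pointer to the struct, suitably converted, points to its initial member. */
    assert((int *)&m == &m.tag);

    /* Member addresses increase in declaration order. */
    assert((char *)&m.tag < (char *)&m.flag);
    assert((char *)&m.flag < (char *)&m.payload);
}
```

These checks should hold on any conforming implementation, unlike the specific offsets and sizes, which are implementation-defined.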
