Why does this EXC_BAD_ACCESS happen with long long and not with int?

I came across EXC_BAD_ACCESS with a piece of code that deals with data serialization. The code only fails on the device (iPhone) and not on the simulator. It also fails only for certain data types.

Here is the test code that reproduces the problem:

    template <typename T>
    void test_alignment() {
        // allocate memory and record the original address
        unsigned char *origin;
        unsigned char *tmp = (unsigned char*)malloc(sizeof(unsigned short) + sizeof(T));
        origin = tmp;

        // push data with size of 2 bytes
        *((unsigned short*)tmp) = 1;
        tmp += sizeof(unsigned short);

        // attempt to push data of type T
        *((T*)tmp) = (T)1;

        // free the memory
        free(origin);
    }

    static void test_alignments() {
        test_alignment<bool>();
        test_alignment<wchar_t>();
        test_alignment<short>();
        test_alignment<int>();
        test_alignment<long>();
        test_alignment<long long>();   // fails on iPhone device
        test_alignment<float>();
        test_alignment<double>();      // fails on iPhone device
        test_alignment<long double>(); // fails on iPhone device
        test_alignment<void*>();
    }

Guessing that this was a memory alignment issue, I decided I wanted to understand the problem thoroughly. From my (limited) understanding of memory alignment, once tmp is advanced by 2 bytes it becomes misaligned for data types whose alignment is greater than 2 bytes:

  tmp += sizeof(unsigned short); 

But the test code runs fine for int and the others! It fails only for long long, double and long double.
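To make the misalignment visible rather than inferred, one could check the address directly before each store. The sketch below is only an illustration: `is_aligned_for` is a hypothetical helper that is not part of the original test code, and it assumes a C++11 compiler (for `alignof` and `<cstdint>`).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not in the original code): reports whether the
// address p is a multiple of T's alignment requirement.
template <typename T>
bool is_aligned_for(const void* p) {
    assert(p != nullptr);  // a null pointer has no meaningful alignment
    return reinterpret_cast<std::uintptr_t>(p) % alignof(T) == 0;
}
```

Calling `is_aligned_for<T>(tmp)` right after `tmp += sizeof(unsigned short);` should return false for every type whose alignment is greater than 2, which includes int as well as long long. So misalignment alone does not explain why only some types crash.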

Examining the size and alignment of each data type showed that the failing data types are the ones whose sizeof and __alignof differ:

    iPhone 4:
    bool           sizeof = 1   alignof = 1
    wchar_t        sizeof = 4   alignof = 4
    short int      sizeof = 2   alignof = 2
    int            sizeof = 4   alignof = 4
    long int       sizeof = 4   alignof = 4
    long long int  sizeof = 8   alignof = 4   // 8 <> 4
    float          sizeof = 4   alignof = 4
    double         sizeof = 8   alignof = 4   // 8 <> 4
    long double    sizeof = 8   alignof = 4   // 8 <> 4
    void*          sizeof = 4   alignof = 4

    iPhone Simulator on Mac OS X 10.6:
    bool           sizeof = 1   alignof = 1
    wchar_t        sizeof = 4   alignof = 4
    short int      sizeof = 2   alignof = 2
    int            sizeof = 4   alignof = 4
    long int       sizeof = 4   alignof = 4
    long long int  sizeof = 8   alignof = 8
    float          sizeof = 4   alignof = 4
    double         sizeof = 8   alignof = 8
    long double    sizeof = 16  alignof = 16
    void*          sizeof = 4   alignof = 4

(This is the output of running the print function from "C++ data alignment and portability".)
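The referenced print function is not reproduced here. A minimal sketch of something that produces rows in the same shape as the listing above might look like the following. The name `describe` is hypothetical, and the sketch uses the standard C++11 `alignof` keyword, which is an assumption: the question used the compiler-specific `__alignof`, which may report different values on some targets.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: formats one row such as "int sizeof = 4 alignof = 4".
// Uses standard alignof (C++11), not the compiler-specific __alignof.
template <typename T>
std::string describe(const char* name) {
    assert(name != nullptr);
    std::ostringstream out;
    out << name << " sizeof = " << sizeof(T) << " alignof = " << alignof(T);
    return out.str();
}
```

Printing `describe<long long>("long long int")` on the device and on the simulator would be one way to reproduce the comparison.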

Can someone enlighten me as to what causes the error? Is the difference really the cause of EXC_BAD_ACCESS? If so, by what mechanism?

+4
3 answers

That is actually very annoying, but not so unexpected for those of us who were brought up in a pre-x86 world :-)

The only reason that comes to mind (and this is pure speculation) is that the compiler is "fixing" your code to make sure the data types are correctly aligned, with the sizeof/alignof mismatches causing problems. I seem to recall that the ARM6 architecture relaxed some of the rules for some data types, but I never got a good look at it because the decision was made to go with a different processor.

(Update: this is actually controlled by a register setting (so probably by the software), so I guess even modern CPUs can still complain about misalignment.)

The first thing I would do is look at the generated assembly to see whether the compiler is padding your short to align the next (actual) data type (which would be impressive) or (more likely) pre-padding the actual data type itself before writing it.

Secondly, find out what the actual alignment requirements are for the Cortex A8, which I believe is the core used in the iPhone 4.

Two possible solutions:

1 / You may have to cast each type to a char array and transfer the characters one at a time. This should avoid the alignment problems, but it may have a performance impact. Using memcpy would probably be best, since it will no doubt already be coded to take advantage of the underlying CPU (for example, transferring four-byte chunks where possible, with single-byte chunks at the start and end).

2 / For those data types that cannot be placed immediately after a short, add enough padding after that short to make sure they are correctly aligned. For example, something like:

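    tmp += sizeof(unsigned short);
    // round tmp up to the next multiple of alignof(T); uintptr_t comes from <stdint.h>
    tmp = (unsigned char*)(((uintptr_t)tmp + alignof(T) - 1) / alignof(T) * alignof(T));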

which should advance tmp to the next correctly aligned location before the value is stored. The buffer must be allocated with room for that padding.

You will need to do the same thing when reading the data back later (I assume the short describes the stored data, so that you can tell what data type follows it).
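The padding approach can be sketched as a small reusable helper that is called identically on the write path and on the read path. `align_up` is a hypothetical name introduced here for illustration, and the sketch assumes a C++11 compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: rounds p up to the next multiple of `alignment`
// and returns p unchanged when it is already aligned.
// Byte is expected to be unsigned char or const unsigned char.
template <typename Byte>
Byte* align_up(Byte* p, std::size_t alignment) {
    static_assert(sizeof(Byte) == 1, "align_up works on byte pointers only");
    assert(alignment != 0);
    const std::size_t rem = reinterpret_cast<std::uintptr_t>(p) % alignment;
    return rem == 0 ? p : p + (alignment - rem);
}
```

In the writer, `tmp = align_up(tmp, alignof(T));` would go between advancing past the short and storing the value, and the reader would apply the same call before loading. The allocation then needs up to `alignof(T) - 1` extra bytes for the padding.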


Putting the final solution from the OP into the answer for completeness (so people don't have to check the comments):

First, the assembly (in Xcode, Run menu > Debugger Display > Source and Disassembly) shows that an STMIA instruction is used when handling 8 bytes of data (i.e. long long), instead of the STR instruction.

Next, section "A3.2.1 Unaligned data access" of the "ARM Architecture Reference Manual, ARMv7-A" (the architecture corresponding to the Cortex A8) states that STMIA does not support unaligned data access, while STR does (depending on certain register settings).

So the problem was the combination of the size of long long and the misalignment.

As for the solution, the one-char-at-a-time approach works as a starter.
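A minimal sketch of the byte-wise approach, using memcpy as the answer suggests, could look like this. The names `store_bytes` and `load_bytes` are hypothetical and are not the OP's actual code. The sketch assumes C++11 and only applies to trivially copyable types such as the arithmetic and pointer types in the test.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Hypothetical helper: copies value into the buffer byte-wise, so no
// misaligned T* is ever dereferenced. Returns the next write position.
template <typename T>
unsigned char* store_bytes(unsigned char* dst, const T& value) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy is only valid for trivially copyable types");
    assert(dst != nullptr);
    std::memcpy(dst, &value, sizeof(T));
    return dst + sizeof(T);
}

// Hypothetical helper: the matching read. Returns the next read position.
template <typename T>
const unsigned char* load_bytes(const unsigned char* src, T& out) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy is only valid for trivially copyable types");
    assert(src != nullptr);
    std::memcpy(&out, src, sizeof(T));
    return src + sizeof(T);
}
```

With these, the test would write the short with `tmp = store_bytes(tmp, (unsigned short)1);` and then the value with `store_bytes(tmp, (T)1);`, and no padding is needed because the destination address is never treated as a `T*`.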

+4

This is probably a memory alignment issue with the ARM chip. ARM chips cannot handle unaligned data and behave unexpectedly when accessing data that is not aligned on certain boundaries. I don't know off the top of my head what the alignment rules are for the iPhone's ARM chip, but the best way to solve this is to avoid poking data into memory with pointer tricks.

+1

Every ARM processor includes instructions that load or store a single word at a given address, as well as instructions that load or store multiple words at a time. Some processors can automatically convert a single unaligned load/store into a sequence of two or three operations, but such features do not extend to the instructions that load/store multiple words at a time. I would expect most operations on int to use only the single-word load/store instructions [in some rare cases a compiler might notice that two int variables stored consecutively could be loaded into registers with a single instruction, but I would not particularly expect such an optimization]. Operations on long long, however, will usually load a pair of registers from consecutive memory locations and would therefore benefit from using a single instruction. I have not profiled the latest ARM chips, but on something like the ARM7-TDMI, two consecutive LDR instructions take three cycles each, while an LDM that loads two registers takes four cycles. Even if the LDM has to be preceded by an ADD to compute the address (LDR has more addressing modes than LDM), two instructions that take five cycles are better than two that take six.
0

Source: https://habr.com/ru/post/1315603/

