What, if any, are the alignment requirements for the atomic intrinsic functions?

Atomic operations on the Delphi mobile targets are built on top of the AtomicXXX family of intrinsic functions. The documentation says:

Because the Delphi mobile compilers do not support a built-in assembler, the System unit provides four atomic intrinsic functions that provide a way to atomically exchange, compare and exchange, increment, and decrement memory values.

These four functions are:

  • AtomicCmpExchange
  • AtomicDecrement
  • AtomicExchange
  • AtomicIncrement

Other RTL functions that provide atomic operations, for example the static class methods of the TInterlocked class, are built on top of these four intrinsics.
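For readers who have not used these intrinsics, here is a minimal console sketch of what each one returns. It assumes a Delphi version that provides the Atomic intrinsics and System.SyncObjs (the question concerns XE8); the variable names and values are made up for illustration.

```delphi
program AtomicDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.SyncObjs;

var
  Counter: Integer; // hypothetical shared variable
  Old: Integer;
begin
  Counter := 10;

  // AtomicIncrement / AtomicDecrement return the new value.
  Assert(AtomicIncrement(Counter) = 11);
  Assert(AtomicDecrement(Counter) = 10);

  // AtomicExchange stores the new value and returns the previous one.
  Old := AtomicExchange(Counter, 17);
  Assert((Old = 10) and (Counter = 17));

  // AtomicCmpExchange(Target, NewValue, Comparand) stores NewValue only if
  // Target = Comparand, and returns the value Target held before the call.
  Old := AtomicCmpExchange(Counter, 42, 17);
  Assert((Old = 17) and (Counter = 42));

  // TInterlocked exposes the same operations as static class methods.
  Assert(TInterlocked.Increment(Counter) = 43);

  Writeln('Counter = ', Counter);
end.
```

Nothing here says anything about alignment; it only shows the call shapes that the question is asking about.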

For the mobile compilers that target ARMv7, are there any alignment requirements for these four atomic intrinsics? If so, what are they?

The documentation does not state any such requirements. However, the documentation is known to be inaccurate, so I am not willing to take the absence of stated requirements as conclusive evidence that no such requirements exist.

As an aside, the XE8 documentation for the intrinsic functions states that these atomic intrinsics are not supported by the desktop compilers. That is incorrect: the desktop compilers do support them.

2 answers

XE8 compiles

    var a: integer;
    AtomicIncrement(a);

to

    3e: 2201       movs   r2, #1
    40: 900c       str    r0, [sp, #48]   ; 0x30
    42: 910b       str    r1, [sp, #44]   ; 0x2c
    44: 920a       str    r2, [sp, #40]   ; 0x28
    46: 980b       ldr    r0, [sp, #44]   ; 0x2c
    48: e850 1f00  ldrex  r1, [r0]
    4c: 9a0a       ldr    r2, [sp, #40]   ; 0x28
    4e: 4411       add    r1, r2
    50: e840 1300  strex  r3, r1, [r0]
    54: 2b00       cmp    r3, #0
    56: d1f6       bne.n  46 <_NativeMain+0x46>

So atomicity is implemented with an ldrex/strex retry loop.

If I understand the information at community.arm.com correctly, the required alignment is DWORD (4-byte) alignment for 4-byte operations (ldrex/strex) and QWORD (8-byte) alignment for 8-byte operations (ldrexd/strexd).

Other atomic functions are implemented in a similar way, so the same requirements should apply.

    AtomicDecrement(a);

    68: 980f       ldr    r0, [sp, #60]   ; 0x3c
    6a: e850 1f00  ldrex  r1, [r0]
    6e: 9a0e       ldr    r2, [sp, #56]   ; 0x38
    70: 1a89       subs   r1, r1, r2
    72: e840 1300  strex  r3, r1, [r0]
    76: 2b00       cmp    r3, #0
    78: d1f6       bne.n  68 <_NativeMain+0x68>

    AtomicExchange(a,b);

    82: 990f       ldr    r1, [sp, #60]   ; 0x3c
    84: 6008       str    r0, [r1, #0]
    86: 4873       ldr    r0, [pc, #460]  ; (254 <_NativeMain+0x254>)
    88: 9a10       ldr    r2, [sp, #64]   ; 0x40
    8a: 5880       ldr    r0, [r0, r2]
    8c: 6800       ldr    r0, [r0, #0]
    8e: f3bf 8f5b  dmb    ish
    92: 900d       str    r0, [sp, #52]   ; 0x34
    94: 980f       ldr    r0, [sp, #60]   ; 0x3c
    96: e850 1f00  ldrex  r1, [r0]
    9a: 9b0d       ldr    r3, [sp, #52]   ; 0x34
    9c: e840 3200  strex  r2, r3, [r0]
    a0: 2a00       cmp    r2, #0
    a2: 910c       str    r1, [sp, #48]   ; 0x30
    a4: d1f6       bne.n  94 <_NativeMain+0x94>

    AtomicCmpExchange(a, 42, 17);

    ae: 990f       ldr    r1, [sp, #60]   ; 0x3c
    b0: 6008       str    r0, [r1, #0]
    b2: f3bf 8f5b  dmb    ish
    b6: 202a       movs   r0, #42         ; 0x2a
    b8: 2211       movs   r2, #17
    ba: 900b       str    r0, [sp, #44]   ; 0x2c
    bc: 920a       str    r2, [sp, #40]   ; 0x28
    be: 980f       ldr    r0, [sp, #60]   ; 0x3c
    c0: e850 1f00  ldrex  r1, [r0]
    c4: 9a0a       ldr    r2, [sp, #40]   ; 0x28
    c6: 4291       cmp    r1, r2
    c8: d105       bne.n  d6 <_NativeMain+0xd6>
    ca: 990b       ldr    r1, [sp, #44]   ; 0x2c
    cc: 9a0f       ldr    r2, [sp, #60]   ; 0x3c
    ce: e842 1000  strex  r0, r1, [r2]
    d2: 2800       cmp    r0, #0
    d4: d1f3       bne.n  be <_NativeMain+0xbe>
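In practice, the place where an Integer is most likely to end up off a 4-byte boundary is a field of a packed record. The sketch below compares field offsets in a packed and a naturally aligned record and guards the atomic call with an address check. IsAligned and both record types are hypothetical names introduced for the example, not RTL declarations.

```delphi
program AlignCheck;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

type
  // Packed layout: Counter sits at offset 1, so it is misaligned whenever
  // the record itself starts on a 4-byte boundary.
  TPackedRec = packed record
    Flag: Byte;
    Counter: Integer;
  end;

  {$ALIGN 8} // the compiler default, stated explicitly for the example
  TNaturalRec = record
    Flag: Byte;
    Counter: Integer; // the compiler pads this field to offset 4
  end;

// Hypothetical helper: True when P is a multiple of Alignment.
// Alignment must be a power of two.
function IsAligned(P: Pointer; Alignment: NativeUInt): Boolean;
begin
  Result := (NativeUInt(P) and (Alignment - 1)) = 0;
end;

var
  PackedRec: TPackedRec;
  NaturalRec: TNaturalRec;
begin
  Writeln('packed offset:  ', NativeUInt(@PackedRec.Counter) - NativeUInt(@PackedRec));
  Writeln('natural offset: ', NativeUInt(@NaturalRec.Counter) - NativeUInt(@NaturalRec));

  NaturalRec.Counter := 0;
  // Only touch the field atomically if its address is a multiple of its size.
  if IsAligned(@NaturalRec.Counter, SizeOf(Integer)) then
    AtomicIncrement(NaturalRec.Counter)
  else
    Writeln('Counter is not 4-byte aligned; not safe for ldrex/strex');
end.
```

A check like this is probably most useful as an Assert in debug builds, since ordinary variables and non-packed record fields are naturally aligned by the compiler anyway.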

Atomicity is typically implemented using LDREX and STREX (Load Exclusive / Store Exclusive). These instructions rely on a concept called exclusive monitors. See http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dht0008a/ch01s02s01.html and look for "Exclusives Reservation Granule".

Thus, your alignment requirements are implementation-specific and are determined by the exclusive monitor mechanism implemented in your hardware. I would advise you to look at the exclusive monitor section of your CPU / SoC documentation.

For example, when internal monitors are used, they are usually implemented at the cache level (usually L2), and each cache line has its own monitor.

  • Thus, your atomic data must be contained within a single cache line; the alignment requirement follows from this.
  • If several atomics occupy the same cache line, then while one of them is in the exclusive state, all the other atomics in that cache line are in a false exclusive state. This leads to locking inefficiencies. If your atomics are cache-line aligned, you avoid this problem. Note: multiple atomics in the same cache line will still work, just inefficiently.
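One way to give each atomic its own cache line is to pad it to the line size and place the first slot on a line boundary by hand. This is only a sketch under stated assumptions: the 64-byte CacheLineSize is a guess that must be checked against your CPU / SoC documentation, and TPaddedCounter and AlignUp are hypothetical names introduced for the example.

```delphi
program CacheLinePadding;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

const
  CacheLineSize = 64; // assumption: check your CPU / SoC documentation
  Count = 4;

type
  // One counter per cache line: the padding keeps neighbours out of the line.
  TPaddedCounter = record
    Value: Integer;
    Pad: array[0..CacheLineSize - SizeOf(Integer) - 1] of Byte;
  end;
  PPaddedCounter = ^TPaddedCounter;

// Rounds Addr up to the next multiple of Alignment (a power of two).
function AlignUp(Addr, Alignment: NativeUInt): NativeUInt;
begin
  Result := (Addr + Alignment - 1) and not (Alignment - 1);
end;

var
  Raw: Pointer;
  Base: NativeUInt;
  Slot: PPaddedCounter;
  I: Integer;
begin
  // Over-allocate so the first slot can be moved up to a line boundary.
  GetMem(Raw, Count * SizeOf(TPaddedCounter) + CacheLineSize - 1);
  try
    Base := AlignUp(NativeUInt(Raw), CacheLineSize);
    for I := 0 to Count - 1 do
    begin
      Slot := PPaddedCounter(Base + NativeUInt(I) * SizeOf(TPaddedCounter));
      Slot^.Value := 0;
      AtomicIncrement(Slot^.Value);
      // Every slot starts on its own cache line.
      Assert(NativeUInt(Slot) mod CacheLineSize = 0);
    end;
  finally
    FreeMem(Raw);
  end;
end.
```

The cost is memory: each 4-byte counter occupies a full line, so this is only worth doing for atomics that are heavily contended by different cores.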