XE8 compiles
    var a: integer;
    AtomicIncrement(a);
to
    3e: 2201        movs   r2, #1
    40: 900c        str    r0, [sp, #48]   ; 0x30
    42: 910b        str    r1, [sp, #44]   ; 0x2c
    44: 920a        str    r2, [sp, #40]   ; 0x28
    46: 980b        ldr    r0, [sp, #44]   ; 0x2c
    48: e850 1f00   ldrex  r1, [r0]
    4c: 9a0a        ldr    r2, [sp, #40]   ; 0x28
    4e: 4411        add    r1, r2
    50: e840 1300   strex  r3, r1, [r0]
    54: 2b00        cmp    r3, #0
    56: d1f6        bne.n  46 <_NativeMain+0x46>
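So atomicity is implemented with an ldrex / strex retry loop: strex writes 0 to its status register (r3 here) only when the exclusive store succeeded; otherwise bne.n jumps back to 46 and the load, add and store are repeated.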
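The same load, modify, conditionally store, retry pattern can be sketched in Delphi itself. This is only an analogy to the listing above, not what the compiler emits for AtomicIncrement; IncrementViaCas is a hypothetical helper name, built on the RTL intrinsic AtomicCmpExchange(Target, NewValue, Comparand), which returns the value it found in Target.

```pascal
program CasIncrementSketch;

{$APPTYPE CONSOLE}

// Hypothetical helper, not part of the RTL: increments Target by retrying
// a compare-and-swap until no other thread changed the value in between.
function IncrementViaCas(var Target: Integer): Integer;
var
  Observed: Integer;
begin
  repeat
    // Plays the role of ldrex: read the current value.
    Observed := Target;
    // Plays the role of strex plus the bne.n back-branch: the store only
    // happens if Target still equals Observed; otherwise loop and retry.
  until AtomicCmpExchange(Target, Observed + 1, Observed) = Observed;
  Result := Observed + 1;
end;

var
  a: Integer;
begin
  a := 41;
  Writeln(IncrementViaCas(a)); // new value returned by the helper
  Writeln(a);                  // the variable itself was updated
end.
```

In real code AtomicIncrement(a) is the right call; the sketch only makes the retry structure visible at source level.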
If I interpret the information on community.arm.com correctly, the required alignment is 4 bytes (DWORD-aligned) for 4-byte operations (ldrex / strex) and 8 bytes (QWORD-aligned) for 8-byte operations (ldrexd / strexd).
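If you want to verify this for a particular variable, a minimal check could look like the sketch below. IsAligned and TPackedPair are hypothetical names introduced for illustration; the packed record is there only to show a field that can end up off a 4-byte boundary.

```pascal
program AlignmentCheckSketch;

{$APPTYPE CONSOLE}

// Hypothetical helper: True when the address P is a multiple of Size bytes.
function IsAligned(P: Pointer; Size: NativeUInt): Boolean;
begin
  Result := (NativeUInt(P) mod Size) = 0;
end;

type
  // In a packed record B sits at offset 1 from the record start, so B is
  // misaligned whenever the record itself starts on a 4-byte boundary.
  TPackedPair = packed record
    A: Byte;
    B: Integer;
  end;

var
  Plain: Integer;
  Pair: TPackedPair;
begin
  Writeln('Plain aligned:  ', IsAligned(@Plain, SizeOf(Plain)));
  Writeln('Pair.B aligned: ', IsAligned(@Pair.B, SizeOf(Pair.B)));
end.
```

A check like this could be wrapped in an Assert before calling the Atomic* functions on fields of packed records or on addresses computed by hand.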
Other atomic functions are implemented in a similar way, so the same requirements should apply.
AtomicDecrement(a);

    68: 980f        ldr    r0, [sp, #60]   ; 0x3c
    6a: e850 1f00   ldrex  r1, [r0]
    6e: 9a0e        ldr    r2, [sp, #56]   ; 0x38
    70: 1a89        subs   r1, r1, r2
    72: e840 1300   strex  r3, r1, [r0]
    76: 2b00        cmp    r3, #0
    78: d1f6        bne.n  68 <_NativeMain+0x68>

AtomicExchange(a,b);

    82: 990f        ldr    r1, [sp, #60]   ; 0x3c
    84: 6008        str    r0, [r1, #0]
    86: 4873        ldr    r0, [pc, #460]  ; (254 <_NativeMain+0x254>)
    88: 9a10        ldr    r2, [sp, #64]   ; 0x40
    8a: 5880        ldr    r0, [r0, r2]
    8c: 6800        ldr    r0, [r0, #0]
    8e: f3bf 8f5b   dmb    ish
    92: 900d        str    r0, [sp, #52]   ; 0x34
    94: 980f        ldr    r0, [sp, #60]   ; 0x3c
    96: e850 1f00   ldrex  r1, [r0]
    9a: 9b0d        ldr    r3, [sp, #52]   ; 0x34
    9c: e840 3200   strex  r2, r3, [r0]
    a0: 2a00        cmp    r2, #0
    a2: 910c        str    r1, [sp, #48]   ; 0x30
    a4: d1f6        bne.n  94 <_NativeMain+0x94>

AtomicCmpExchange(a, 42, 17);

    ae: 990f        ldr    r1, [sp, #60]   ; 0x3c
    b0: 6008        str    r0, [r1, #0]
    b2: f3bf 8f5b   dmb    ish
    b6: 202a        movs   r0, #42         ; 0x2a
    b8: 2211        movs   r2, #17
    ba: 900b        str    r0, [sp, #44]   ; 0x2c
    bc: 920a        str    r2, [sp, #40]   ; 0x28
    be: 980f        ldr    r0, [sp, #60]   ; 0x3c
    c0: e850 1f00   ldrex  r1, [r0]
    c4: 9a0a        ldr    r2, [sp, #40]   ; 0x28
    c6: 4291        cmp    r1, r2
    c8: d105        bne.n  d6 <_NativeMain+0xd6>
    ca: 990b        ldr    r1, [sp, #44]   ; 0x2c
    cc: 9a0f        ldr    r2, [sp, #60]   ; 0x3c
    ce: e842 1000   strex  r0, r1, [r2]
    d2: 2800        cmp    r0, #0
    d4: d1f3        bne.n  be <_NativeMain+0xbe>