Another alternative:
    #include <atomic>

    static std::atomic<unsigned long long> thread_counter;

    unsigned long long thread_id() {
        thread_local unsigned long long tid = thread_counter++;
        return tid;
    }
The code g++ generates for this function on x86-64 is simple:
    _Z9thread_idv:
            cmp     BYTE PTR fs:_ZGVZ9thread_idvE3tid@tpoff, 0
            je      .L2
            mov     rax, QWORD PTR fs:_ZZ9thread_idvE3tid@tpoff
            ret
    .L2:
            mov     eax, 1
            lock xadd       QWORD PTR _ZL14thread_counter[rip], rax
            mov     BYTE PTR fs:_ZGVZ9thread_idvE3tid@tpoff, 1
            mov     QWORD PTR fs:_ZZ9thread_idvE3tid@tpoff, rax
            ret
    _ZGVZ9thread_idvE3tid:
            .zero   8
    _ZZ9thread_idvE3tid:
            .zero   8
That is, a single branch with no synchronization, which is predicted correctly except on the first call. After that first call, each call is just one memory access, again with no synchronization.
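As a rough illustration of how this behaves with several threads, here is a minimal sketch. The helper `collect_thread_ids` is a hypothetical name introduced only for this example. It starts a few threads, calls `thread_id()` twice in each, and records the value if both calls agree.

```cpp
#include <atomic>
#include <cassert>   // used only by the usage example below
#include <cstddef>
#include <mutex>
#include <set>
#include <thread>
#include <vector>

static std::atomic<unsigned long long> thread_counter{0};

unsigned long long thread_id() {
    // Initialized once per thread, on that thread's first call.
    thread_local unsigned long long tid = thread_counter++;
    return tid;
}

// Hypothetical helper (not part of the answer above): start n threads
// and return the set of ids they observed.
std::set<unsigned long long> collect_thread_ids(std::size_t n) {
    std::set<unsigned long long> ids;
    std::mutex ids_mutex;  // protects ids, not thread_id() itself
    std::vector<std::thread> workers;
    for (std::size_t i = 0; i < n; ++i) {
        workers.emplace_back([&ids, &ids_mutex] {
            const unsigned long long first = thread_id();  // slow path: atomic increment
            const unsigned long long again = thread_id();  // fast path: reads the cached value
            if (first == again) {
                std::lock_guard<std::mutex> lock(ids_mutex);
                ids.insert(first);
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return ids;
}
```

Calling `collect_thread_ids(8)` should give a set of 8 distinct values, since each thread takes its id from the shared counter exactly once, and `assert(thread_id() == thread_id())` should hold in any thread.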
6502 Sep 18 '19 at 8:18