In fact, a 128-bit equality comparison of two values a and b is possible with SSE 4.1 using two instructions and a spare register set to zero.
In x86 assembly, using legacy 128-bit SSE:
pxor %xmm2, %xmm2
Using intrinsics in C is preferable, since they automatically benefit from the three-operand AVX syntax when the code is compiled for AVX, which saves a significant number of SSE register moves.
    static const __m128i zero = {0};

    inline bool compare128(__m128i a, __m128i b) {
        __m128i c = _mm_xor_si128(a, b);
        return _mm_testc_si128(zero, c);
    }
This compiles to something similar to the assembly above; in particular, the bool temporary is folded away and the carry flag is used directly.
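For anyone who wants to try the comparison, here is a minimal self-contained sketch of the same XOR plus PTEST idea. The names `equal128` and `equal16` and the `SSE41` macro are hypothetical and not part of the answer above. The `target("sse4.1")` attribute assumes GCC or Clang; alternatively, build with `-msse4.1` and drop it. It works because `_mm_testc_si128(x, y)` returns 1 exactly when `(~x & y)` is all zero, so with `x` set to zero the result is 1 only when `a ^ b` has no bits set.

```c
#include <smmintrin.h>  /* SSE4.1 intrinsics, including _mm_testc_si128 */
#include <stdbool.h>

/* Assumption: GCC or Clang. Enables SSE4.1 for the marked functions only. */
#define SSE41 __attribute__((target("sse4.1")))

/* Returns true when all 128 bits of a and b are equal. */
SSE41 static bool equal128(__m128i a, __m128i b)
{
    __m128i diff = _mm_xor_si128(a, b);      /* zero iff a == b */
    /* testc(zero, diff) sets CF when (~zero & diff) == 0, i.e. diff == 0 */
    return _mm_testc_si128(_mm_setzero_si128(), diff) != 0;
}

/* Hypothetical helper: compares two 16-byte buffers (no alignment required). */
SSE41 bool equal16(const void *p, const void *q)
{
    __m128i a = _mm_loadu_si128((const __m128i *)p);
    __m128i b = _mm_loadu_si128((const __m128i *)q);
    return equal128(a, b);
}
```

Calling `equal16` on two identical 16-byte buffers returns true, and changing any single bit in either buffer makes it return false.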
hirschhornsalz