The Intel Intrinsics Guide lists, for most instructions, a value for both latency and throughput. Example:
__m128i _mm_min_epi32
Performance
Architecture    Latency    Throughput
Haswell         1          0.5
Ivy Bridge      1          0.5
Sandy Bridge    1          0.5
Westmere        1          1
Nehalem         1          1
What exactly do these numbers mean? I assume higher latency means that the instruction takes longer to execute, but does a throughput of 1 for Nehalem and 0.5 for Ivy Bridge mean that the instruction runs faster on Nehalem?
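To make the question concrete, here is a minimal sketch of how the two columns might combine. It assumes the Throughput column is a reciprocal throughput, i.e. the average number of cycles per instruction when many independent instructions are issued back to back, and that latency is the number of cycles before a result can feed a dependent instruction. The function names are hypothetical, and the model ignores everything else going on in the pipeline.

```cpp
// Simplified cost model (an assumption, not a measurement).
// latency:          cycles until a result is usable by a dependent instruction
// recip_throughput: average cycles between starts of independent instructions

// Each instruction waits for the previous result, so the latencies add up.
constexpr double dependent_chain_cycles(int n, double latency) {
    return n * latency;
}

// Independent instructions overlap: the last one starts after
// (n - 1) * recip_throughput cycles and finishes `latency` cycles later.
constexpr double independent_cycles(int n, double latency, double recip_throughput) {
    return (n - 1) * recip_throughput + latency;
}
```

Under this model, 1000 independent instructions take about 1000 cycles with the Nehalem figures (latency 1, throughput 1) and about 500.5 cycles with the Haswell figures (latency 1, throughput 0.5), while a dependent chain of 1000 takes 1000 cycles with either. If that reading of the column is right, the smaller throughput number would be the faster one.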
c++ sse simd
Alexandros