SIMD latency and throughput

In the Intel Intrinsics Guide, most instructions list a value for both latency and throughput. Example:

__m128i _mm_min_epi32

Performance

Architecture    Latency    Throughput
Haswell         1          0.5
Ivy Bridge      1          0.5
Sandy Bridge    1          0.5
Westmere        1          1
Nehalem         1          1

What exactly do these numbers mean? I assume a higher latency means the instruction takes longer to execute, but does a throughput of 1 for Nehalem and 0.5 for Ivy Bridge mean that the instruction runs faster on Nehalem?

c++ sse simd
2 answers

The “latency” of an instruction is how many clock cycles it takes to execute that one instruction (how long it takes until its result is ready for a dependent instruction to use).

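Throughput normally means the number of instructions per cycle, but here it is the number of clock cycles per independent instruction start (a reciprocal throughput). A value of 0.5 clock cycles therefore means that 2 instructions can be issued per cycle, and with a latency of 1 each result is ready on the next cycle. Lower is better: at 1, Nehalem can start only one such instruction per cycle, so the instruction is not faster there.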

Intel docs here: https://software.intel.com/en-us/articles/measuring-instruction-latency-and-throughput
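To make the reading of the two columns concrete, here is a minimal sketch of a back-of-the-envelope cycle estimate. It is a deliberately simplified model, not a measurement: it assumes nothing else competes for the execution units, and the function names `dependent_chain_cycles` and `independent_cycles` are made up for this illustration.

```cpp
#include <cassert>

// Simplified model (an assumption, not a benchmark):
// each instruction in a dependent chain must wait for the previous
// result, so the chain is limited by latency.
double dependent_chain_cycles(int n, double latency) {
    assert(n >= 0);
    return n * latency;
}

// Independent instructions are limited by the reciprocal throughput
// listed in the guide: one can start every `recip_throughput` cycles,
// and the last one still needs `latency` cycles to deliver its result.
double independent_cycles(int n, double latency, double recip_throughput) {
    assert(n >= 0);
    if (n == 0) return 0.0;
    return (n - 1) * recip_throughput + latency;
}
```

With the numbers from the question, `independent_cycles(1000, 1.0, 0.5)` gives 500.5 cycles for Haswell and `independent_cycles(1000, 1.0, 1.0)` gives 1000 cycles for Nehalem, while `dependent_chain_cycles(1000, 1.0)` gives 1000 cycles on both. Under this model the lower throughput number only helps when the instructions do not depend on each other.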


The following is a quote from Intel's page on measuring instruction latency and throughput.

Latency and Throughput

Latency is the number of processor clocks it takes for an instruction to have its data available for use by another instruction. Therefore, an instruction which has a latency of 6 clocks will have its data available for another instruction that many clocks after it starts its execution.

Throughput is the number of processor clocks it takes for an instruction to execute or perform its calculations. An instruction with a throughput of 2 clocks would tie up its execution unit for that many cycles, which prevents another instruction needing that execution unit from being executed. Only after the instruction is done with the execution unit can the next instruction enter.
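The quoted definitions can be turned into a small timeline. The sketch below assumes a single execution unit and combines the two example figures from the quote (latency 6, throughput 2) into one hypothetical instruction; `Timing` and `schedule_independent` are names invented for this illustration.

```cpp
#include <cassert>
#include <vector>

// Clock at which an instruction enters the execution unit and the
// clock at which its result becomes available to other instructions.
struct Timing {
    double start;
    double ready;
};

// Hypothetical helper: schedules n independent instructions on ONE
// execution unit. The unit is tied up for `throughput` clocks per
// instruction; each result is available `latency` clocks after start.
std::vector<Timing> schedule_independent(int n, double latency, double throughput) {
    assert(n >= 0);
    std::vector<Timing> timeline;
    double unit_free = 0.0;  // first clock at which the unit accepts work
    for (int i = 0; i < n; ++i) {
        double start = unit_free;
        timeline.push_back(Timing{start, start + latency});
        unit_free = start + throughput;  // next instruction may enter here
    }
    return timeline;
}
```

For `schedule_independent(3, 6.0, 2.0)` the instructions start at clocks 0, 2 and 4 and their results are ready at clocks 6, 8 and 10: the throughput figure spaces out the starts, and the latency figure sets how long each result takes.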
