Summing up the comments into an answer:
You probably got code that goes through memory because you compiled without optimization enabled.
The only difference in the resulting machine state is that the second version leaves a copy of the value on the stack:
movd %edx, %xmm0
movl %edx, (%esp)
The movd version is fewer uops on Intel. On AMD Bulldozer/Steamroller, movd (x)mm, r32/r64 has a latency of 10/5 cycles, versus 1 cycle on Intel.
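If you want to check the optimization point yourself, one way is to compile a tiny function with and without optimization (for example `gcc -O0 -S` versus `gcc -O2 -S`) and compare the generated asm. The sketch below is not the code from the question: `bits_to_float` is a made-up helper, and it assumes a 32-bit IEEE `float`. On an x86-64 target I would expect the unoptimized build to bounce the value through the stack and the optimized build to typically use a single `movd`, but the exact output depends on the compiler, version, and target.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the question): reinterpret the bits of a
   32-bit integer as a float. memcpy is the well-defined way to type-pun
   in C; it copies the object representation without any conversion. */
float bits_to_float(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);  /* assumes sizeof(float) == 4 */
    return f;
}
```

Calling `bits_to_float(0x3f800000u)` returns 1.0f on IEEE-754 machines, which makes it easy to confirm that both builds behave the same and differ only in how the value gets into the xmm register.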