In SSE, if I have a 128-bit register containing 4 floats, i.e.
A = abcd ('a','b','c','d' are floats and 'A' is a 128-bit SSE register)
and
B = efgh
then if I want
C = aebf
I can just do:
C = _mm_unpacklo_ps(A,B);
Similarly, if I want
D = cgdh
I can do:
D = _mm_unpackhi_ps(A,B);
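For reference, the SSE behaviour described above can be checked with a minimal sketch like the following. The helper names interleave_ps and check_interleave_ps are hypothetical (they are not part of any library), and the sketch assumes an x86 target where SSE is available; element 0 of each array corresponds to 'a' (or 'e') in the notation above.

```c
#include <assert.h>
#include <xmmintrin.h>  /* SSE intrinsics: _mm_unpacklo_ps, _mm_unpackhi_ps */

/* Hypothetical helper: a = {a,b,c,d}, b = {e,f,g,h}.
 * Writes lo = {a,e,b,f} and hi = {c,g,d,h}. */
void interleave_ps(const float a[4], const float b[4],
                   float lo[4], float hi[4])
{
    __m128 A = _mm_loadu_ps(a);  /* unaligned load, element 0 first */
    __m128 B = _mm_loadu_ps(b);
    _mm_storeu_ps(lo, _mm_unpacklo_ps(A, B));  /* interleave low halves  */
    _mm_storeu_ps(hi, _mm_unpackhi_ps(A, B));  /* interleave high halves */
}

/* Hypothetical self-check using small integer-valued floats. */
void check_interleave_ps(void)
{
    const float a[4] = {1.0f, 2.0f, 3.0f, 4.0f};  /* a b c d */
    const float b[4] = {5.0f, 6.0f, 7.0f, 8.0f};  /* e f g h */
    float lo[4], hi[4];

    interleave_ps(a, b, lo, hi);
    assert(lo[0] == 1.0f && lo[1] == 5.0f && lo[2] == 2.0f && lo[3] == 6.0f);
    assert(hi[0] == 3.0f && hi[1] == 7.0f && hi[2] == 4.0f && hi[3] == 8.0f);
}
```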
If I have a 256-bit AVX register containing 4 doubles, is it possible to do the same with a single instruction?
From my understanding of how these intrinsics work, I know that I cannot directly use _mm256_unpacklo_pd(), _mm256_shuffle_pd(), _mm256_permute2f128_pd() or _mm256_blend_pd() on their own. Is there some other instruction I can use, or do I need a combination of the above instructions?
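To make the difficulty concrete: the 256-bit unpack intrinsics operate separately within each 128-bit lane, so _mm256_unpacklo_pd(A,B) gives aecg rather than aebf. The sketch below shows that in-lane result and one possible combination (two unpacks followed by _mm256_permute2f128_pd) that appears to give the desired C and D. It is only an illustration, not necessarily the cheapest sequence. The names interleave_pd and AVX_TARGET are hypothetical, and the target attribute is a GCC/Clang-specific assumption so that the file builds without -mavx; running it still requires a CPU with AVX.

```c
#include <assert.h>
#include <immintrin.h>  /* AVX intrinsics */

/* Assumption: GCC/Clang. Enables AVX for the one function below when the
 * file is not compiled with -mavx; otherwise expands to nothing. */
#if defined(__GNUC__) && !defined(__AVX__)
#define AVX_TARGET __attribute__((target("avx")))
#else
#define AVX_TARGET
#endif

/* Hypothetical helper: a = {a,b,c,d}, b = {e,f,g,h}.
 * in_lane receives the raw _mm256_unpacklo_pd result {a,e,c,g};
 * c receives {a,e,b,f} and d receives {c,g,d,h}. */
AVX_TARGET
void interleave_pd(const double a[4], const double b[4],
                   double in_lane[4], double c[4], double d[4])
{
    __m256d A = _mm256_loadu_pd(a);
    __m256d B = _mm256_loadu_pd(b);

    __m256d lo = _mm256_unpacklo_pd(A, B);  /* per lane: {a,e | c,g} */
    __m256d hi = _mm256_unpackhi_pd(A, B);  /* per lane: {b,f | d,h} */

    _mm256_storeu_pd(in_lane, lo);
    /* 0x20: low lane of lo, then low lane of hi   -> {a,e,b,f} */
    _mm256_storeu_pd(c, _mm256_permute2f128_pd(lo, hi, 0x20));
    /* 0x31: high lane of lo, then high lane of hi -> {c,g,d,h} */
    _mm256_storeu_pd(d, _mm256_permute2f128_pd(lo, hi, 0x31));
}
```

With this approach, producing both C and D takes four instructions in total (two unpacks and two lane permutes), since the two unpack results are shared between them.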