Equivalent of SSE unpacklo_ps / unpackhi_ps in AVX (for doubles)

Question

In SSE, if I have a 128-bit register containing 4 floats, i.e.

A = abcd ('a','b','c','d' are floats and 'A' is a 128-bit SSE register)

and

 B = efgh

then if I want

 C = aebf

I can just do:

 C = _mm_unpacklo_ps(A,B);

Similarly, if I want

 D = cgdh

I can do:

 D = _mm_unpackhi_ps(A,B);
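For reference, the SSE behaviour described above can be checked with a small sketch like the one below. The helper name interleave_ps and the array-based interface are only assumptions made for illustration; the intrinsics are the ones named in the question.

```c
#include <xmmintrin.h>

/* Hypothetical helper: interleaves two groups of 4 floats.
   Element 0 is the lowest element of the register, so
   A = abcd means a[0]=a, a[1]=b, a[2]=c, a[3]=d. */
void interleave_ps(const float a[4], const float b[4],
                   float c[4], float d[4])
{
    __m128 A = _mm_loadu_ps(a);               /* a b c d */
    __m128 B = _mm_loadu_ps(b);               /* e f g h */
    _mm_storeu_ps(c, _mm_unpacklo_ps(A, B));  /* a e b f */
    _mm_storeu_ps(d, _mm_unpackhi_ps(A, B));  /* c g d h */
}
```

Calling it with a = {1, 2, 3, 4} and b = {5, 6, 7, 8} should leave c = {1, 5, 2, 6} and d = {3, 7, 4, 8}.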

If I have an AVX register containing doubles, is it possible to do the same with a single instruction?

From the way these intrinsics work, I know that I cannot get this result directly from _mm256_unpacklo_pd(), _mm256_shuffle_pd(), _mm256_permute2f128_pd() or _mm256_blend_pd() on their own. Is there another instruction I can use instead, or do I need a combination of the instructions above?
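To illustrate why the 256-bit unpack is not enough on its own: it works within each 128-bit lane separately, so the low and high halves of the result come from different halves of the inputs. The sketch below shows this; the helper name unpack_in_lane is hypothetical, and the target attribute is a GCC/Clang extension used here as an alternative to compiling with -mavx.

```c
#include <immintrin.h>

/* Hypothetical helper showing the per-lane behaviour of the
   256-bit unpack intrinsics. Needs an AVX-capable CPU. */
__attribute__((target("avx")))
void unpack_in_lane(const double a[4], const double b[4],
                    double lo[4], double hi[4])
{
    __m256d A = _mm256_loadu_pd(a);               /* a b | c d */
    __m256d B = _mm256_loadu_pd(b);               /* e f | g h */
    _mm256_storeu_pd(lo, _mm256_unpacklo_pd(A, B)); /* a e | c g */
    _mm256_storeu_pd(hi, _mm256_unpackhi_pd(A, B)); /* b f | d h */
}
```

With A = abcd and B = efgh, the results are aecg and bfdh rather than the desired aebf and cgdh, so the middle 128-bit halves still have to be exchanged.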

c sse avx

user1715122 Nov 29 '12 at 5:25

1 answer

user1715122 · Accepted Answer · 2012-11-29T06:49:16+0000

One way that I can think of is as follows:

 A1 = _mm256_unpacklo_pd(A,B);
 A2 = _mm256_unpackhi_pd(A,B);
 C = _mm256_permute2f128_pd(A1,A2,0x20);
 D = _mm256_permute2f128_pd(A1,A2,0x31);

If anyone has a better solution, please post it below.
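As a rough, self-contained sketch, the four intrinsics above can be wrapped in a helper and checked on plain arrays. The name interleave_pd and the array interface are assumptions made for illustration only; the target attribute is a GCC/Clang extension that stands in for compiling with -mavx.

```c
#include <immintrin.h>

/* Hypothetical helper: AVX counterpart of unpacklo/unpackhi across
   the full 256-bit register, for 4 doubles per input.
   Needs an AVX-capable CPU. */
__attribute__((target("avx")))
void interleave_pd(const double a[4], const double b[4],
                   double c[4], double d[4])
{
    __m256d A  = _mm256_loadu_pd(a);         /* a b | c d */
    __m256d B  = _mm256_loadu_pd(b);         /* e f | g h */
    __m256d A1 = _mm256_unpacklo_pd(A, B);   /* a e | c g */
    __m256d A2 = _mm256_unpackhi_pd(A, B);   /* b f | d h */
    /* 0x20: low lane of A1, then low lane of A2  -> a e b f */
    _mm256_storeu_pd(c, _mm256_permute2f128_pd(A1, A2, 0x20));
    /* 0x31: high lane of A1, then high lane of A2 -> c g d h */
    _mm256_storeu_pd(d, _mm256_permute2f128_pd(A1, A2, 0x31));
}
```

In the immediate, bits 0-1 select the source of the low 128-bit lane and bits 4-5 the source of the high lane (0 and 1 are the low and high lanes of the first operand, 2 and 3 those of the second), which is why 0x20 and 0x31 give C and D respectively.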
