How to store values in non-contiguous memory locations using SSE intrinsics?

I am very new to SSE and have optimized a section of code using intrinsics. I am pleased with the operation itself, but I am looking for a better way to write out the result. The results end up in three __m128i variables.

What I'm trying to do is store specific bytes from the result values to non-contiguous memory locations. I'm currently doing this:

__m128i values0,values1,values2;

/*Do stuff and store the results in values0, values1, and values2*/

y[0]        = (BYTE)_mm_extract_epi16(values0,0);
cb[2]=cb[3] = (BYTE)_mm_extract_epi16(values0,2);
y[3]        = (BYTE)_mm_extract_epi16(values0,4);
cr[4]=cr[5] = (BYTE)_mm_extract_epi16(values0,6);

cb[0]=cb[1] = (BYTE)_mm_extract_epi16(values1,0);
y[1]        = (BYTE)_mm_extract_epi16(values1,2);
cr[2]=cr[3] = (BYTE)_mm_extract_epi16(values1,4);
y[4]        = (BYTE)_mm_extract_epi16(values1,6);

cr[0]=cr[1] = (BYTE)_mm_extract_epi16(values2,0);
y[2]        = (BYTE)_mm_extract_epi16(values2,2);
cb[4]=cb[5] = (BYTE)_mm_extract_epi16(values2,4);
y[5]        = (BYTE)_mm_extract_epi16(values2,6);

where y, cb, and cr are byte arrays (unsigned char). This seems wrong to me for reasons I can't pin down. Does anyone have suggestions for a better way?

Thanks!

+5
4 answers

- SSE , . , , - SIMD, - , . , 16 . , SIMD-, .

PEXTRW op (_mm_extract_epi16 intrinsic) - SSE . - opack shuffle ops (_mm_shuffle_ps ..), , MOVSS/_mm_store_ss(), .

, , SSE - , load - hit - store . , ; SSE , GPR. , , , ​​ - .

+9

SSE, , , , , . .

+2

SSE /, , , SIMD.

, , :

typedef union
{
    __m128i v;
    uint8_t a8[16];
    uint16_t a16[8];
    uint32_t a32[4];
} U128;

, SIMD .
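As a minimal sketch of how the U128 union above might be used for the stores in the question: the vector is copied into the union once, and the individual 16-bit lanes are then read through the a16 member. The helper names low_byte_of_lane and scatter_values0 are hypothetical and not part of the original answer. Only the values0 stores from the question are mirrored here. Reading a union member other than the one last written is permitted in C, but it is formally undefined in C++, although common compilers support it. The copy also goes through memory, so this is not necessarily faster than _mm_extract_epi16.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2: __m128i */
#include <stdint.h>

typedef union
{
    __m128i v;
    uint8_t a8[16];
    uint16_t a16[8];
    uint32_t a32[4];
} U128;

/* Hypothetical helper: low byte of 16-bit lane `lane` (0..7),
   the equivalent of (BYTE)_mm_extract_epi16(values, lane). */
static uint8_t low_byte_of_lane(__m128i values, int lane)
{
    U128 u;
    assert(lane >= 0 && lane < 8);
    u.v = values;               /* copy the whole vector into the union */
    return (uint8_t)u.a16[lane];
}

/* Hypothetical helper mirroring the four values0 stores from the question.
   y, cb and cr must each point to at least 6 writable bytes. */
static void scatter_values0(__m128i values0,
                            uint8_t *y, uint8_t *cb, uint8_t *cr)
{
    U128 u;
    u.v = values0;
    y[0]          = (uint8_t)u.a16[0];
    cb[2] = cb[3] = (uint8_t)u.a16[2];
    y[3]          = (uint8_t)u.a16[4];
    cr[4] = cr[5] = (uint8_t)u.a16[6];
}
```

values1 and values2 would be handled the same way, using the lane indices from the question.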

+2

union .

union
{
    float value;
    unsigned char ch[8];
};

and then assign the bytes as needed.
Play around with the union idea; maybe replace unsigned char ch[8] with an anonymous struct?
Perhaps you can get some more ideas from here.

0