Byte-pixel processing using SSE / SSE2 intrinsics in C

Question


I am writing a cross-platform C library that does various things with webcam images. All operations are per-pixel and highly parallelizable - for example, applying bit masks, multiplying color values by constants, etc. Therefore, I think I can gain performance by using SSE / SSE2 intrinsics.

However, I have a problem with the data format. My webcam library gives me webcam frames as a pointer (void *) to a buffer containing 24- or 32-bit byte pixels in ABGR or BGR format. I cast it to char * so that ptr++ etc. behaves correctly. However, all SSE / SSE2 operations seem to expect either four integers or four floats in the __m128 or __m64 data types. If I do this (assuming I have read the color values from the buffer into the chars r, g and b):

float pixel[] = {(float) r, (float) g, (float) b, 0.0f};

then load another float array, full of constants

float constants[] = {0.299f, 0.587f, 0.114f, 0.0f};

cast both float pointers to __m128 and use _mm_mul_ps to compute r * 0.299, g * 0.587, etc., there is no overall performance gain, because shuffling everything around takes so long!

Does anyone have any suggestions on how to quickly and efficiently load these byte pixel values into SSE registers so that I can get a performance boost from working with them as such?

+5

optimization c image-processing webcam

Ben Englert Dec 22 '09 at 0:34


3 answers

, , .

, 50 ... , FP, , 4 , , 1 15 , .

( ), MMX, , .

+1

fortran Dec 22 '09 at 15:52

-, , ( , void*) - .

-, SSE2, , - - ( , ).

, - unsigned char SSE2 ( , R, G B 0 255), , , .

But if you want to make it cross-platform, I suppose using intrinsics will be cleaner.
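To make the intrinsics route concrete, here is a minimal sketch of widening the bytes of one pixel into four floats inside an SSE register, rather than building a float array by hand. It assumes a little-endian x86 target with SSE2 and assumes the four bytes sit in memory as B, G, R, X (the question does not pin down the exact byte order); `luma_bgrx` is a hypothetical name, not something from the webcam library.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: weighted sum of one 4-byte pixel.
 * Assumption: bytes are stored in memory as B, G, R, X. */
static float luma_bgrx(const uint8_t *px)
{
    int32_t packed;
    memcpy(&packed, px, sizeof packed);               /* unaligned-safe 4-byte load */

    const __m128i zero = _mm_setzero_si128();
    __m128i bytes  = _mm_cvtsi32_si128(packed);       /* 4 bytes in the low 32 bits */
    __m128i words  = _mm_unpacklo_epi8(bytes, zero);  /* zero-extend 8 -> 16 bit */
    __m128i dwords = _mm_unpacklo_epi16(words, zero); /* zero-extend 16 -> 32 bit */
    __m128  f      = _mm_cvtepi32_ps(dwords);         /* 32-bit int -> float */

    /* _mm_set_ps lists lanes from highest to lowest: X, R, G, B. */
    const __m128 weights = _mm_set_ps(0.0f, 0.299f, 0.587f, 0.114f);
    __m128 prod = _mm_mul_ps(f, weights);

    float lanes[4];
    _mm_storeu_ps(lanes, prod);
    return lanes[0] + lanes[1] + lanes[2];            /* X lane is multiplied by 0 */
}
```

Calling `luma_bgrx` on a pointer to a pure-white pixel should give roughly 255. Doing this one pixel at a time is unlikely to beat scalar code by much; the widening steps are the part worth reusing when you process several pixels per iteration.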

Good luck

0

Jacob Dec 22 '09 at 15:19


Drew Dormann · Accepted Answer · 2009-12-22T05:53:49+0000

If you want to use MMX ...

MMX gives you a set of 64-bit registers, each of which can be treated as eight 8-bit values.

Like the 8-bit values you work with.
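The same packed-byte idea also exists in SSE2, where a 128-bit register holds sixteen 8-bit values. As a rough sketch (using SSE2 integer intrinsics rather than MMX, and assuming an x86 target), the hypothetical helper `brighten_bytes` below adds a constant to every byte of a buffer with saturation, so values clamp at 255 instead of wrapping. Note that it treats every byte alike, including any alpha bytes.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: add `amount` to every byte, clamping at 255. */
static void brighten_bytes(uint8_t *buf, size_t n, uint8_t amount)
{
    const __m128i add = _mm_set1_epi8((char)amount);  /* 16 copies of amount */
    size_t i = 0;

    /* Main loop: 16 bytes per iteration, unaligned loads and stores. */
    for (; i + 16 <= n; i += 16) {
        __m128i v = _mm_loadu_si128((const __m128i *)(buf + i));
        v = _mm_adds_epu8(v, add);                    /* unsigned saturating add */
        _mm_storeu_si128((__m128i *)(buf + i), v);
    }

    /* Scalar tail for the remaining (n % 16) bytes. */
    for (; i < n; ++i) {
        unsigned sum = (unsigned)buf[i] + amount;
        buf[i] = (uint8_t)(sum > 255 ? 255 : sum);
    }
}
```

No conversion to float happens here at all, which is the main attraction for operations like masking or adding constants: the bytes stay bytes from load to store.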
