How to handle 24-bit three-channel color image with SSE2 / SSE3 / SSE4?

Question


I just started using SSE2 to optimize image processing, but I have no idea how to handle 3-channel 24-bit color images. My pixel data is organized as BGR BGR BGR ..., unsigned char, 8 bits per channel. If I want to implement Color2Gray with the C/C++ intrinsics for SSE2 / SSE3 / SSE4, how would I do it? Do I need to align (4/8/16) my pixel data? I read this article: http://supercomputingblog.com/windows/image-processing-with-sse/ but it covers ARGB 4-channel 32-bit color, where exactly 4 pixels are processed at a time. Thanks!

// Assume the original pixels:
unsigned char* pDataColor = (unsigned char*)malloc(src.width * src.height * 3); // 3
// init pDataColor every pix val
// The dst pixels:
unsigned char* pDataGray = (unsigned char*)malloc(src.width * src.height * 1); // 1

// RGB-> Gray: Y = 0.212671 * R + 0.715160 * G + 0.072169 * B
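For reference, the formula above can be evaluated in integer arithmetic. The following is only a minimal scalar sketch to check a SIMD version against; the function name `bgr_to_gray_scalar` and the choice of weights scaled by 2^15 are assumptions for illustration, not part of the original question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fixed-point weights: round(coefficient * 32768). They sum to exactly
   32768, so white (255, 255, 255) maps to 255. The 15-bit scale is an
   assumption chosen for this sketch. */
enum { WEIGHT_R = 6969, WEIGHT_G = 23434, WEIGHT_B = 2365, WEIGHT_SHIFT = 15 };

/* Convert packed BGR (3 bytes per pixel) to 8-bit gray, one pixel at a time. */
static void bgr_to_gray_scalar(const uint8_t *bgr, uint8_t *gray, size_t pixel_count)
{
    assert(bgr != NULL && gray != NULL);
    for (size_t i = 0; i < pixel_count; ++i) {
        uint32_t b = bgr[3 * i + 0];
        uint32_t g = bgr[3 * i + 1];
        uint32_t r = bgr[3 * i + 2];
        uint32_t y = WEIGHT_R * r + WEIGHT_G * g + WEIGHT_B * b;
        /* Add half of the scale before shifting to round to nearest. */
        gray[i] = (uint8_t)((y + (1u << (WEIGHT_SHIFT - 1))) >> WEIGHT_SHIFT);
    }
}
```

Calling `bgr_to_gray_scalar(pDataColor, pDataGray, src.width * src.height)` would convert the whole buffer, assuming the rows are stored without padding.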


optimization image-processing opencv instructions sse2

user2163635 Mar 13 '13 at 3:55


2 answers

Marat Dukhan · Answer 1 · 2013-03-13T04:58:42+0000

I have slides on deinterleaving 24-bit RGB pixels that explain how to do this with SSE2 and SSSE3.
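To give a rough idea of what byte shuffling buys you here, this is a minimal sketch and not necessarily the method from the slides. It assumes SSSE3 is available, uses `_mm_shuffle_epi8` to spread four packed BGR pixels into four 32-bit lanes, and then weights the channels with `_mm_maddubs_epi16`. The function name `bgr_to_gray_ssse3`, the `TARGET_SSSE3` macro, and the 7-bit weights (9, 92, 27, summing to 128) are assumptions for illustration; 7 bits is coarser than the coefficients in the question. Unaligned loads are used because a 3-byte pixel rarely starts on a 16-byte boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <tmmintrin.h> /* SSSE3: _mm_shuffle_epi8, _mm_maddubs_epi16 */

/* Lets GCC/Clang compile this one function for SSSE3 without -mssse3. */
#if defined(__GNUC__)
#define TARGET_SSSE3 __attribute__((target("ssse3")))
#else
#define TARGET_SSSE3
#endif

/* Hypothetical helper: packed BGR -> gray, 4 pixels per iteration. */
TARGET_SSSE3
static void bgr_to_gray_ssse3(const uint8_t *bgr, uint8_t *gray, size_t pixel_count)
{
    assert(bgr != NULL && gray != NULL);

    /* Spread 12 packed bytes into four B,G,R,0 lanes; index -1 yields zero. */
    const __m128i spread = _mm_setr_epi8(0, 1, 2, -1, 3, 4, 5, -1,
                                         6, 7, 8, -1, 9, 10, 11, -1);
    /* Signed 8-bit weights in B,G,R,0 order; they must stay below 128. */
    const __m128i weights = _mm_setr_epi8(9, 92, 27, 0, 9, 92, 27, 0,
                                          9, 92, 27, 0, 9, 92, 27, 0);
    const __m128i ones = _mm_set1_epi16(1);
    const __m128i half = _mm_set1_epi32(64);
    const size_t total_bytes = 3 * pixel_count;

    size_t i = 0;
    /* Each load reads 16 bytes but uses only 12, so stop while the
       full 16 bytes are still inside the buffer. */
    for (; 3 * i + 16 <= total_bytes; i += 4) {
        __m128i packed = _mm_loadu_si128((const __m128i *)(bgr + 3 * i));
        __m128i bgr0 = _mm_shuffle_epi8(packed, spread);
        /* 16-bit pairs per pixel: [9*B + 92*G, 27*R + 0]. */
        __m128i pairs = _mm_maddubs_epi16(bgr0, weights);
        /* 32-bit sum per pixel: 9*B + 92*G + 27*R. */
        __m128i sums = _mm_madd_epi16(pairs, ones);
        __m128i y32 = _mm_srli_epi32(_mm_add_epi32(sums, half), 7);
        __m128i y16 = _mm_packs_epi32(y32, y32);
        __m128i y8 = _mm_packus_epi16(y16, y16);
        int32_t four_pixels = _mm_cvtsi128_si32(y8);
        memcpy(gray + i, &four_pixels, 4);
    }

    /* Scalar tail with the same weights for the remaining pixels. */
    for (; i < pixel_count; ++i) {
        uint32_t y = 9u * bgr[3 * i] + 92u * bgr[3 * i + 1] + 27u * bgr[3 * i + 2];
        gray[i] = (uint8_t)((y + 64u) >> 7);
    }
}
```

A wider variant would load 48 bytes (16 pixels) per iteration and fully deinterleave them into separate B, G and R registers; the 4-pixel form is shown here only because it is shorter.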

Kun Ling · Answer 2 · 2013-03-13T04:29:32+0000

Here are some answers to your question:

How to use SSE2 intrinsics from C/C++: these links may be helpful.
- Optimization of image processing algorithms: a case study
- Speeding up some SSE2 intrinsics for color conversion
- SSE intrinsics reference
For alignment: yes, 16-byte alignment is required if you use the aligned load/store intrinsics such as _mm_load_si128 and _mm_store_si128. With those you must ensure the memory address is a multiple of 16 bytes. The unaligned variants (_mm_loadu_si128, _mm_storeu_si128) accept any address. To align a variable, use __declspec(align(16)) with MSVC or __attribute__((aligned(16))) with GCC.
- The reason you need to align can be found here: Why is there alignment of instructions / data?
I'm not a specialist in image processing for 3-channel RGB conversion, so I can’t give any advice. There are also some open source image processing libraries that may already contain the code you need.
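To make the alignment point concrete, here is a minimal SSE2 sketch that picks the aligned or unaligned intrinsic depending on the pointer. The helper names `is_aligned16` and `copy16` are made up for this example. With packed 24-bit pixels and a 16-byte-aligned base pointer, pixel i starts at byte 3*i, which is 16-byte aligned only when i is a multiple of 16, so code that walks such a buffer in smaller steps generally needs the unaligned forms.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <emmintrin.h> /* SSE2 */

/* Returns nonzero when p is a multiple of 16 bytes. */
static int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15u) == 0;
}

/* Copy 16 bytes, choosing the load and store intrinsic by alignment.
   _mm_load_si128 / _mm_store_si128 require a 16-byte-aligned address;
   _mm_loadu_si128 / _mm_storeu_si128 work at any address. */
static void copy16(uint8_t *dst, const uint8_t *src)
{
    assert(dst != NULL && src != NULL);

    __m128i v = is_aligned16(src) ? _mm_load_si128((const __m128i *)src)
                                  : _mm_loadu_si128((const __m128i *)src);
    if (is_aligned16(dst)) {
        _mm_store_si128((__m128i *)dst, v);
    } else {
        _mm_storeu_si128((__m128i *)dst, v);
    }
}
```

In practice many people simply use the unaligned intrinsics everywhere, since the branch above costs something too; the sketch only illustrates which calls carry the alignment requirement.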

