    static inline int interleave(int n)
    {
        // Spread the 8 bits into three groups of up to 3 bits, 9 positions apart
        n = ((n << 12) | (n << 6) | n) & 0007007007;         // 000000111 000000111 000000111
        // Inside each group, move the 3 bits to every third position
        n = ((n << 6) | (n << 4) | (n << 2)) & 0444444444;   // 100100100 100100100 100100100
        return n;
    }

    r = interleave(r);
    g = interleave(g);
    b = interleave(b);
    rgb = r | (g >> 1) | (b >> 2);

    TempLinebuff[((i*3)+0) +2] = (rgb >> 16) & 0xFF;
    TempLinebuff[((i*3)+1) +2] = (rgb >> 8) & 0xFF;
    TempLinebuff[((i*3)+2) +2] = rgb & 0xFF;
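For checking any bit-twiddling version, a plain loop is handy as a reference. The sketch below assumes the layout implied by rgb = r | (g >> 1) | (b >> 2): bit i of r, g and b ends up at bits 3i+2, 3i+1 and 3i of a 24-bit value. The name interleave_rgb_reference is made up for illustration.

```c
#include <stdint.h>

/* Reference bit-by-bit interleave (hypothetical helper, not from the answer).
 * Bit i of r, g and b goes to bits 3i+2, 3i+1 and 3i of the 24-bit result,
 * so the output reads r7 g7 b7 r6 g6 b6 ... r0 g0 b0 from the top. */
static uint32_t interleave_rgb_reference(uint8_t r, uint8_t g, uint8_t b)
{
    uint32_t rgb = 0;
    for (int i = 0; i < 8; i++) {
        rgb |= (uint32_t)((r >> i) & 1u) << (3 * i + 2);
        rgb |= (uint32_t)((g >> i) & 1u) << (3 * i + 1);
        rgb |= (uint32_t)((b >> i) & 1u) << (3 * i);
    }
    return rgb;
}
```

Comparing a faster version against this loop for all 256 values of each channel is cheap and catches wrong shift constants quickly.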
Another way -
Suppose rByte has 8 bits, numbered 12345678 from the most significant bit down. After striping, these R bits should sit in the final result like this (hyphens are positions that do not hold R bits):

    1--2--3--4--5--6--7--8
First, we spread the 8 bits evenly over a 64-bit word, 9 positions apart:

    unsigned long long r = (rByte * 0x0101010101010101ULL) & 0x8040201008040201ULL;

Now r contains the bits of rByte as follows (hyphens are all zeros):

    1--------2--------3--------4--------5--------6--------7--------8

Explanation

The multiplication copies rByte into every byte of the 64-bit word, and the mask 0x8040201008040201 then keeps a different bit from each copy (β marks the masked columns outside each copy):

      ........................................................12345678   (rByte)
    x .......1.......1.......1.......1.......1.......1.......1.......1   (magic number 0x0101010101010101, dots are 0s)
    __________________________________________________________________
      ........................................................12345678
      ................................................12345678.......β
      ........................................12345678......β........β
      ................................12345678.....β........β........β
    + ........................12345678....β........β........β........β
      ................12345678...β........β........β........β........β
      ........12345678..β........β........β........β........β........β
      12345678.β........β........β........β........β........β........β
    __________________________________________________________________
    = 1........2........3........4........5........6........7........8
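The spreading step can be tried on its own. This is a minimal sketch, assuming the second operation is a bitwise AND with the mask 0x8040201008040201 (that is what leaves all other positions zero); spread_bits is a hypothetical name used only here.

```c
#include <stdint.h>

/* Spread the 8 bits of v over a 64-bit word, 9 positions apart:
 * bit 7 of v lands at bit 63, bit 6 at bit 54, ..., bit 0 at bit 0.
 * spread_bits is a hypothetical name used only for this sketch. */
static uint64_t spread_bits(uint8_t v)
{
    uint64_t copies = v * 0x0101010101010101ULL; /* v repeated in all 8 bytes */
    return copies & 0x8040201008040201ULL;       /* keep one bit per copy */
}
```

The product cannot overflow, because 0xFF times the multiplier is exactly 0xFFFFFFFFFFFFFFFF.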
To move the bits of r to their final positions, we split them into two groups and move each group into place with one multiplication. The first group moves bits 1, 4, 5 and 8 with the magic number 0x40001040001, and the second group moves the remaining bits with the magic number 0x01040001040. These magic numbers can be calculated in the same way as above: each set bit of the multiplier is the left shift that one bit of the group needs to reach its target position. Perhaps 32-bit multiplication is enough for this, but I have not tested it.
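The two magic numbers can be rebuilt from the bit positions. A minimal sketch, assuming bit n (1 = most significant) sits at position (8 - n) * 9 after spreading and has to reach position 63 - (n - 1) * 3; magic_for is a hypothetical helper, not part of the answer's code.

```c
#include <stdint.h>

/* Build the multiplier for a group of bits (numbered 1 = MSB ... 8 = LSB).
 * Bit n sits at (8 - n) * 9 after spreading and must end up at
 * 63 - (n - 1) * 3, so the multiplier needs a 1 at the difference. */
static uint64_t magic_for(const int *bits, int count)
{
    uint64_t magic = 0;
    for (int i = 0; i < count; i++) {
        int n = bits[i];
        int src = (8 - n) * 9;       /* position after the spreading step */
        int dst = 63 - (n - 1) * 3;  /* final striped position */
        magic |= 1ULL << (dst - src);
    }
    return magic;
}
```

For the group {1, 4, 5, 8} the required shifts are 0, 18, 24 and 42, and for {2, 3, 6, 7} they are 6, 12, 30 and 36, which gives the two constants quoted above.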
    #include <stdint.h>

    /* Position of bit n (1 = MSB ... 8 = LSB) in r after the spreading step */
    #define RBIT(n)      (1ULL << ((8 - (n)) * 9))
    #define RMASK_1458   (RBIT(1) | RBIT(4) | RBIT(5) | RBIT(8))
    #define RMASK_2367   (RBIT(2) | RBIT(3) | RBIT(6) | RBIT(7))

    /* Final position of bit n: every third bit, counting down from bit 63 */
    #define BIT(n)       ((1ULL << 63) >> (((n) - 1) * 3))
    #define MASK_BIT1458 (BIT(1) | BIT(4) | BIT(5) | BIT(8))
    #define MASK_BIT2367 (BIT(2) | BIT(3) | BIT(6) | BIT(7))

    #define MAGIC_1458   0x40001040001ULL
    #define MAGIC_2367   0x01040001040ULL

    uint64_t resultR = (((r & RMASK_1458) * MAGIC_1458) & MASK_BIT1458)
                     | (((r & RMASK_2367) * MAGIC_2367) & MASK_BIT2367);
The bits for G and B can be computed in the same way. After that, you can simply combine the results together.
result = (resultR >> 32) | (resultG >> 33) | (resultB >> 34);
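Put together, the whole approach fits in two small functions. This is a sketch under the assumptions stated so far (spreading uses an AND mask, bit 1 is the most significant bit); the names stripe and interleave_rgb are hypothetical.

```c
#include <stdint.h>

/* Move one channel's bits to bits 63, 60, ..., 42 of a 64-bit word. */
static uint64_t stripe(uint8_t v)
{
    /* Step 1: one bit every 9 positions (bit 1 = MSB at 63 ... bit 8 = LSB at 0). */
    uint64_t r = (v * 0x0101010101010101ULL) & 0x8040201008040201ULL;

    /* Source and target masks for the two groups of bits. */
    const uint64_t src_1458 = (1ULL << 63) | (1ULL << 36) | (1ULL << 27) | 1ULL;
    const uint64_t src_2367 = (1ULL << 54) | (1ULL << 45) | (1ULL << 18) | (1ULL << 9);
    const uint64_t dst_1458 = (1ULL << 63) | (1ULL << 54) | (1ULL << 51) | (1ULL << 42);
    const uint64_t dst_2367 = (1ULL << 60) | (1ULL << 57) | (1ULL << 48) | (1ULL << 45);

    /* Step 2: each multiplication shifts its group into place;
     * the target mask drops the stray copies. */
    return (((r & src_1458) * 0x40001040001ULL) & dst_1458)
         | (((r & src_2367) * 0x01040001040ULL) & dst_2367);
}

/* Combine the three channels as R, G, B from the top bit down. */
static uint32_t interleave_rgb(uint8_t r, uint8_t g, uint8_t b)
{
    return (uint32_t)((stripe(r) >> 32) | (stripe(g) >> 33) | (stripe(b) >> 34));
}
```

With these shifts the 24 interleaved bits occupy bits 31..8 of the 32-bit result, so shift right by 8 more if you want them in the low 24 bits.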