Warren has a whole chapter on counting bits, including Conting 1-bits.
The problem can be solved separately and humbly, i.e. summation of 32 bits is decided as summation of 2 16-bit numbers, etc. This means that we simply add the number of units to two n bit fields together in one 2n field.
Example: 10110010 01|10|00|01 0011|0001 00000100
The code for this looks something like this:
x = (x & 0x55555555) + ((x >> 1) & 0x55555555); x = (x & 0x33333333) + ((x >> 2) & 0x33333333); x = (x & 0x0f0f0f0f) + ((x >> 4) & 0x0f0f0f0f); x = (x & 0x00ff00ff) + ((x >> 8) & 0x00ff00ff); x = (x & 0x0000ffff) + ((x >> 16) & 0x0000ffff);
We use ((x â 1) and 0x55555555), and not (x and 0xAAAAAAAAA) â 1 just because we want to avoid generating two large constants in the register. If you look at this, you will see that the latter is completely useless, while others can also be omitted if there is no danger that the amount will be transferred. Therefore, if we simplify the code, we get the following:
int pop(unsigned x) { x = x - ((x >> 1) & 0x55555555); x = (x & 0x33333333) + ((x >> 2) & 0x33333333); x = (x + (x >> 4)) & 0x0f0f0f0f; x = x + (x >> 8); x = x + (x >> 16); return x & 0x0000003f; }
This will be 21 branch-free instructions on a regular RISC machine. Depending on how many bits are set on average, it may be faster or slower than the kerrigan loop, although it probably also depends on the processor used.
Voo
source share