Search for N adjacent zero bits in an integer to the left of the MSB position of another integer

Question

Search for N adjacent zero bits in an integer to the left of the MSB position of another integer

The problem is this: if the integer val1 find the position of the most significant bit set (the most significant bit), then, taking into account the second integer val2 , find the adjacent area of un-created bits to the left of the position obtained from the first integer. width indicates the minimum number of unset bits that should be found in contact (i.e. width zeros without units inside them).

Here is the C code for my solution:

 #include <limits.h> /* for CHAR_BIT - number of bits in a char */ typedef unsigned int t; unsigned const t_bits = sizeof(t) * CHAR_BIT; _Bool test_fit_within_left_of_msb( unsigned width, t val1, /* integer to find MSB of */ t val2, /* integer to find width zero bits in */ unsigned* offset_result) { unsigned offbit = 0; /* 0 starts at high bit */ unsigned msb = 0; t mask; tb; while(val1 >>= 1) /* find MSB! */ ++msb; while(offbit + width < t_bits - msb) { /* mask width bits starting at offbit */ mask = (((t)1 << width) - 1) << (t_bits - width - offbit); b = val2 & mask; if (!b) /* result! no bits set, we can use this */ { *offset_result = offbit; return true; } if (offbit++) /* this conditional bothers me! */ b <<= offbit - 1; while(b <<= 1) offbit++; /* increment offbit past all bits set */ } return false; /* no region of width zero bits found, bummer. */ }

In addition to faster ways to find the first integer MSB, the commented offbit test for zero seems a little extraneous, but you must skip the highest bit of type t if it is set. An unconditional left shift of b by offbit - 1 bit will lead to an infinite loop, and the mask will never go past 1 in the upper bit of val2 (otherwise, if the high bit is zero without problems).

I also implemented similar algorithms, but working to the right of the first number MSB, so they do not require this seemingly additional condition.

How can I get rid of this additional condition, or even if there are more optimal solutions?

Edit: Some background is not strictly required. The result of the offset is the number of bits from the high bit, and not from the low order, as you might expect. This will be part of a broader algorithm that scans a 2D array for a 2D region of zero bits. Here, for testing, the algorithm has been simplified. val1 represents the first integer that does not have all the bits found in the row of the 2D array. From this, the 2D version scans, which means val2 .

Here, some results show success and failure:

 t_bits:32 t_high: 10000000000000000000000000000000 ( 2147483648 ) --------- ----------------------------------- *** fit within left of msb test *** ----------------------------------- val1: 00000000000000000000000010000000 ( 128 ) val2: 01000001000100000000100100001001 ( 1091569929 ) msb: 7 offbit:0 + width: 8 = 8 mask: 11111111000000000000000000000000 ( 4278190080 ) b: 01000001000000000000000000000000 ( 1090519040 ) offbit:8 + width: 8 = 16 mask: 00000000111111110000000000000000 ( 16711680 ) b: 00000000000100000000000000000000 ( 1048576 ) offbit:12 + width: 8 = 20 mask: 00000000000011111111000000000000 ( 1044480 ) b: 00000000000000000000000000000000 ( 0 ) offbit:12 iters:10 ***** found room for width:8 at offset: 12 ***** ----------------------------------- *** fit within left of msb test *** ----------------------------------- val1: 00000000000000000000000001000000 ( 64 ) val2: 00010000000000001000010001000001 ( 268469313 ) msb: 6 offbit:0 + width: 13 = 13 mask: 11111111111110000000000000000000 ( 4294443008 ) b: 00010000000000000000000000000000 ( 268435456 ) offbit:4 + width: 13 = 17 mask: 00001111111111111000000000000000 ( 268402688 ) b: 00000000000000001000000000000000 ( 32768 ) ***** mask: 00001111111111111000000000000000 ( 268402688 ) offbit:17 iters:15 ***** no room found for width:13 *****

(iters is the iteration counter of the inner while loop, b is the result of val2 & mask )

+7

c bit-manipulation

James morris May 07, '10 at 21:52

source share

5 answers

This http://graphics.stanford.edu/~seander/bithacks.html#IntegerLogObvious has several ways to calculate the unsigned integer logarithm of the base 2 unsigned integer (which is also the highest bit position).

I think this is part of what you want. I suspect that if I really knew what you wanted, I could offer a better way to calculate it, or something that served the same purpose.

+1

nategoose May 07, '10 at 23:04

source share

count_leading_zero_bits is often one instruction for which the compiler will provide an inline function. otherwise put it in a loop.

count_trailing_zero_bits can use count_leading_zero_bits (x & -x) or search for debruijn if the first is a loop.

for simplicity, I assume 32-bit values.

 int offset_of_zero_bits_over_msb_of_other_value( unsigned width , unsigned value , unsigned other ) { int count = 0; int offset = -1; int last = 1; int lz = count_leading_zero_bits( other ); other |= ((1<<(32-lz2))-1); // set all bits below msb if ( value & ~other ) { value |= other; // set all bits below msb of other value = ~value; // invert so zeros are ones while ( value && count < width ) { count += 1; // the widest run of zeros last = value; // for counting trailing zeros value &= value >> 1; // clear leading ones from groups } offset = count_trailing_zero_bits( last ); } else { count = lz2; offset = 32 - lz2; } return ( count < width ) ? -1 : offset; }

The idea of the code is as follows:

  val1: 00000000000000000000000010000000 ( 128 ) val2: 01000001000100000000100100001001 ( 1091569929 ) lz1: 24 lz2: 1 val2: 01000001000100000000100011111111 // |= ((1<<(32-lz1))-1); val2: 10111110111011111111011100000000 // = ~val2 val2: 00011110011001111111001100000000 // &= val2>>1 , count = 1 val2: 00001110001000111111000100000000 // &= val2>>1 , count = 2 val2: 00000110000000011111000000000000 // &= val2>>1 , count = 3 val2: 00000010000000001111000000000000 // &= val2>>1 , count = 4 val2: 00000000000000000111000000000000 // &= val2>>1 , count = 5 val2: 00000000000000000011000000000000 // &= val2>>1 , count = 6 val2: 00000000000000000001000000000000 // &= val2>>1 , count = 7 val2: 00000000000000000000000000000000 // &= val2>>1 , count = 8

Thus, at each step, all ranges of zeros, now alone, are hidden on the right. When the value is zero, the number of steps taken is the width of the widest run. At any point, counting the number of trailing zeros will give an offset to the nearest range of at least count zeros.

If at any point the counter exceeds the width, you can stop the iteration. The maximum number of iterations is the width, not the size of the word. You can do this O (log n) width, because you can double the amount of shift in each iteration until you exceed the width.

Here is a DeBruijn search for counting trailing zero bits for 32-bit values.

 static const int MultiplyDeBruijnBitPosition[32] = { 0, 1, 28, 2, 29, 14, 24, 3, 30, 22, 20, 15, 25, 17, 4, 8, 31, 27, 13, 23, 21, 19, 16, 7, 26, 12, 18, 6, 11, 5, 10, 9 }; r = MultiplyDeBruijnBitPosition[((uint32_t)((v & -v) * 0x077CB531U)) >> 27];

I noticed that in both examples val1 has only one bit. If so, you can use the DeBruijn trick to search for MSB.

+1

drawnonward May 08 '10 at 4:23

source share

Here is my new and improved algorithm:

 int test_fit_within_left_of_msb( unsigned width, unsigned val1, unsigned val2 ) { int offset = 32; int msb = 0; unsigned mask; unsigned b; msb = 32 - __builtin_clz(val1); /* GCC builtin to count Leading Zeros */ while(offset - width > msb) { mask = (((unsigned)1 << width) - 1) << (offset - width); b = val2 & mask; if (!b) return 32 - offset; offset = __builtin_ctz(b); /* GCC builtin to Count Trailing Zeros */ } return -1; }

This code has many improvements over my original implementation. First of all, removing the inner while by simply counting the final zero bits. Secondly, I also made the algorithm work with an offset that uses the natural bit position values and thus removed some of the addition and subtraction operations used by my original until the return statement was successful. You could pick up the choice by subtracting the offset from 32.

The important point here in the code is the algorithm - I understand that there are portability considerations and assumptions about types and sizes. Looking back at a page where width 8 can be found at position 12, done in 10 iterations, the new alogirthm does the same thing in 2 iterations of the loop.

I used the GCC built-in functions for convenience here, the MultiplyDeBruijnBitPosition code that is provided by drawonward (from: http://graphics.stanford.edu/~seander/bithacks.html#ZerosOnRightMultLookup ) to replace __builtin_ctz, while __bultin_clz can be replaced by one of integer base 2 logarithm codes from the same page.

~~One of the problems is that the data (with small bits) that I used to check this makes this algorithm work better, it may not look so good at integers with more densely set bits.~~ (Wrong - counting trailing zeros, he avoids this bad case).

0

James morris May 09 '10 at 14:09

source share

1 (fast) method is to use pre-computed LOOKUP (LUT) tables for each 8-bit byte:

PosOfFirst1, PosOfLast1, PosOfFirst0, PosOfLast0 - all arrays with 256 bytes

Pre-calculate the tables using: (soz for bad, pascalish pseudocode)

PosOfLast1:

 FOR EACH ByteVal (0..255): if byteVal>127 return 8 elseif byteVal>63 return 7 ... elseif byteVal>0 return 1 else return 0 PosOfFirst1: c:=0; while c<8 do begin bv = byteVal and 1; if bv=1 then return c else byteval shr 1; inc (c); end;

I use simple assembler routines for these algs. PosOfFirst0 and PosOfLast0 LUT can be pre-programmed using these two tables as well as TRAILING and LEADING 0 (or 1). It is also useful to pre-subtract the minus 1 versions of these tables ...

Then you can use (for 8-bit bytes) var InputByte: Byte; FirstBit: = PosOfFirst1 [InputByte] // v.fast

For large sizes (0, 16, 24, 32 +++++), use procs and loop, which check each component 8-bit byte. You may need memory access for the LUT, but this method is even faster:

a) Can be used easily, without calling a procedure. b) scanning a 32-bit number takes 1 shift and compares from 0 per byte with 1 search (if a non-zero byte is found) instead of n (0..32) shifts, as well as comparisons ... c) if the programmed well is stopped After detecting the 1st / last 1

The LUT principle applies to "population counting" + other bit manipulation. procedures ...

Greetings, PrivateSi

FAST BETTER ?!

0

Privatesi Aug 08 '10 at 13:55

source share

James morris · Accepted Answer · 2010-05-09T16:02:57+0000

After fulfilling my previous answer, but for working with MSB law, I saw that in addition to a slight difference, the left and right versions were exactly the same. This leads to the realization that there is no real requirement that the algorithm work with MSB from some prior value at all.

So, although this answer does not meet the specifications of the question, this is the correct answer because the specification was incorrect.

 #include<stdint.h> /* returns bit position within a 32bit integer, where a region of contiguous zero bits can be found whose count is equal to or greater than width. it returns -1 on failure. */ int binary_width_fit( unsigned width, uint32_t val ) { int offset = 32; uint32_t mask; uint32_t b; while(offset >= width) { mask = (((uint32_t)1 << width) - 1) << (offset - width); b = val & mask; if (!b) return offset; offset = __builtin_ctz(b); /* GCC builtin to Count Trailing Zeros */ } return -1; }

Search for N adjacent zero bits in an integer to the left of the MSB position of another integer

More articles: