The next set of n elements in the set using bitwise operators

Reading the book "C - Reference Manual (Fifth Edition)", I came across this part of the code (each integer in SET is represented by a bit position):

typedef unsigned int SET; #define emptyset ((SET) 0) #define first_set_of_n_elements(n) (SET)((1<<(n))-1) /* next_set_of_n_elements(s): Given a set of n elements, produce a new set of n elements. If you start with the result of first_set_of_n_elements(k)/ and then at each step apply next_set_of_n_elements to the previous result, and keep going until a set is obtained containing m as a member, you will have obtained a set representing all possible ways of choosing k things from m things. */ SET next_set_of_n_elements(SET x) { /* This code exploits many unusual properties of unsigned arithmetic. As an illustration: if x == 001011001111000, then smallest == 000000000001000 ripple == 001011010000000 new_smallest == 000000010000000 ones == 000000000000111 returned value == 001011010000111 The overall idea is that you find the rightmost contiguous group of 1-bits. Of that group, you slide the leftmost 1-bit to the left one place, and slide all the others back to the extreme right. (This code was adapted from HAKMEM.) */ SET smallest, ripple, new_smallest, ones; if (x == emptyset) return x; smallest = (x & -x); ripple = x + smallest; new_smallest = (ripple & -ripple); ones = ((new_smallest / smallest) >> 1) -1; return (ripple | ones); } 

I am lost in the calculation of "some", and this value in the calculation. Although I can understand mathematical calculation, I cannot understand why this works or how.

In a related note, the authors of the book argue that the calculation for first_set_of_n_elements "uses the properties of unsigned subtractions." How (2 ^ n) -1 to "exploit"?

+4
source share
5 answers

the smallest calculation gets the first non-0 bit of your int. How it works?
Let n be the bit length of your int. The opposite of the number x (bit b n-1 ... b 0 ) is calculated so that when you add x to -x you get 2 p . Since your integer is only n-bit, the received bit is discarded and you get 0.
Now let b ' n-1 ... b' 0 be the binary representation of -x.
Since x + (- x) should be equal to 2 n when you encounter the first bit 1 x (for example, at position i), the associated bit -x will also be set to 1 and when you add a number, you will get a carry.
To get 2 n, this carry has to be distributed across all bits to the end of the bit sequence of your int. Thus, the -x bit at each j position with i <j <n follows the following properties:

b j + b ' j + 1 = 10 (binary)

Then from the above we can conclude that: b j = NOT (b ' j ) and, therefore, b j and b' j = 0

On the other hand, bits b ' j of -x located at position j such that 0 <= j <i are controlled by the following:

b j + b ' j = 0 or 10

Since all related b j are set to 0, the only option is b ' j = 0

So the only bit that is 1 in both x and -x is 1 in position i

In your example:

x = 001011001111000
-x = 110100110001000

In this way,

0.0.1.0.1.1.0.0.1.1.1.1.0.0.0
1.1.0.1.0.0.1.1.0.0.0.0.0.0 and
\ ====================== /
0.0.0.0.0.0.0.0.0.0.0.1.0.0.0

Then the ripple rotates each adjacent "1" after position I (bit I turned on) to 0, and the first next 0 bit to 1 (due to the spread of the transfer). That's why you ripple :

r (x) = 0.0.1.0.1.1.0.1.0.0.0.0.0.0.0.0

Ones is calculated as dividing the smallest (r (x)) by the smallest (x). Since the smallest (x) is an integer with only one bit set to 1 at position i, you have:

(smallest (r (x)) / smallest (x)) โ†’ 1 = smallest (r (x)) โ†’ (i + 1)

The resulting integer also has only one bit, equal to 1, for example, at index p, so the expression substract -1 for this value will give you integer units, such as:

For each j such that 0 <= j <p,
those j = 1
For each j for which p <= j <n,
those j = 0

Finally, the return value is an integer such that:

  • The first subsequence of the 1-bit argument is 0.
  • All 0-bits before the subsequence are set to 1.
  • The first 0-bit after the subsequence is set to 1.
  • The remaining bits remain unchanged.

Then I can not explain the rest, since I did not understand the sentence:

a set containing m as a member is obtained

+2
source

First of all, this code is rather obscure and does not look like it is worth spending time thinking, it will bring only useless knowledge.

โ€œOperationโ€ is that the code is based on an implementation defined by the implementation of various arithmetic rules.

001 0110 0111 1000 is a 15-bit number. Why the author uses 15-bit numbers instead of 16, I donโ€™t know. The typo seems to remain even after 5 issues.

If we put a minus sign in front of this binary number in a system with two additions (an explanation of the two additions is here ), this will turn into 1110 1001 1000 1000 . Since the compiler will save the decimal representation of the number (5752) and translate it to its negative equivalent (-5752). (However, the actual data type will remain unsigned int , so if you try to print it, you will get garbage number 59784.)

  0001 0110 0111 1000 AND 1110 1001 1000 1000 = 0000 0000 0000 1000 

The C standard does not use two additions, so the code in this book is not portable.

+2
source

This is a bit misleading because it actually uses 2 add-ons. First, smallest calculation:

In the view with two additions for x in the comments -x there is 110100110001000 . Focus on the low-order bit of x , which is one; since the two additions are essentially 1 complement plus 1, this bit will be set to both x and -x , and no other bit position after it (along the LSB path) will have this property. This is how you get the smallest bit.

ripple quite simple and named as such because it distributes them to MSB, and smallest_ripple follows from the description above.

ones is the number we need to add to the ripple in order to continue selecting n elements, depict it below:

 ones: 11 next set: 100011 ones: 1 next set: 100101 ones: 0 next set: 100110 

Doing this will really show you all the ways to select the n bits from CHAR_BIT * sizeof(int) - 1 elements (the CHAR_BIT * sizeof(int) bit is required, because -x n-bit number in the worst case must be n + 1 bits to be submitted).

+1
source

Firstly, here is an example of the output that we can get with n = 4. The idea is that we start with "n" LSB set to "1", and then we iterate over all combinations of numbers with the same number of bits set in 1" :

  1111 10111 11011 11101 11110 100111 101011 101101 101110 (*) 110011 110101 110110 111001 111010 111100 1000111 1001011 1001101 

It works as follows. I will use the number with the star above as an example:

  101110 
  • We get the LSB set to '1', as clearly explained in other answers.

      101110 & 010011 = 000010 
  • We โ€œmoveโ€ the LSB one position to the left, adding it to the original number. If the bit immediately to the left is โ€œ0โ€, this is easy to understand, as subsequent operations will do nothing. If this left bit is โ€œ1โ€, we get a carry that will propagate to the left. The problem with this last case is that the numbers "1" will change, so we need to cancel some number "1" in order to keep a constant counter.

      101110 + 000010 = 110000 
  • To do this, we extract the LSB of the new result and dividing it by the previous LSB, we get the number of bits over which the transfer is distributed. It converts to equal "1" in the lower positions with "-1",

      010000 / 000010 = 001000 >> 1 - 1 = 000011 
  • We finally OR result of addition and unity.

     110011 
+1
source

I would say that "exploitation" is on an unsigned sign change in operation (x and -x).

0
source

All Articles