How to safely shift bits without undefined behavior?

I am writing a function that converts a bitset to an int/uint value, given that the bitset may have fewer bits than the target type.

Here is the function I wrote:

    template <typename T, size_t count>
    static T convertBitSetToNumber( const std::bitset<count>& bitset )
    {
        T result;
        #define targetSize (sizeof( T ) * CHAR_BIT)
        if ( targetSize > count )
        {
            // If the bitset is 0xF00, converting it as 0x0F00 will lose the sign information
            // (0xF00 is negative, while 0x0F00 is positive), because the sign bit is on the left.
            // So we add zeros (4 bits) on the right, convert 0xF000, and later divide by 16 (2^4)
            // to preserve both sign and value.
            size_t missingbits = targetSize - count;

            std::bitset<targetSize> extended;
            extended.reset(); // set all bits to 0
            for ( size_t i = 0; i != count; ++i )
            {
                if ( i < count )
                    extended[i + missingbits] = bitset[i];
            }

            result = static_cast<T>( extended.to_ullong() );
            result = result >> missingbits;
            return result;
        }
        else
        {
            return static_cast<T>( bitset.to_ullong() );
        }
    }

And the "test program":

    uint16_t val1 = Base::BitsetUtl::convertBitSetToNumber<uint16_t,12>( std::bitset<12>( "100010011010" ) );
    // val1 is 0x089A
    int16_t  val2 = Base::BitsetUtl::convertBitSetToNumber<int16_t,12>( std::bitset<12>( "100010011010" ) );
    // val2 is 0xF89A

Note: see the comment exchange with Ped7g; the code above is correct as written: it preserves the sign bit and makes the 12 to 16 bit conversion correct for both signed and unsigned targets. But if you are looking for how to shift 0xABC0 to 0x0ABC on a signed value, the answers may help you, so I am not deleting the question.

The program works when using uint16_t as the target type, for example:

    uint16_t val = 0x89A0; // 1000100110100000
    val = val >> 4;        // 0000100010011010

However, when using int16_t it fails, because 0x89A0 >> 4 is 0xF89A instead of the expected 0x089A.

    int16_t val = 0x89A0;  // 1000100110100000
    val = val >> 4;        // 1111100010011010

I don't understand why the >> operator sometimes inserts 0 and sometimes 1, and I can't figure out how to safely perform the final operation of my function (result = result >> missingbits; at some point it must be wrong...).

+6
4 answers

This is because shifting is an arithmetic operation, and arithmetic operations promote their operands to int, which performs sign extension.

I.e. promoting the signed 16-bit integer (int16_t) 0x89a0 to a 32-bit signed integer (int) causes the value to become 0xffff89a0, and that is the value that gets shifted.

See a reference on the built-in arithmetic operators for more information.
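
A minimal illustration of that promotion, assuming a two's-complement platform (the conversion of 0x89A0 to int16_t and the right shift of a negative value are implementation-defined before C++20, but common platforms behave as shown):

    #include <cstdint>
    #include <cstdio>

    int main()
    {
        int16_t val = static_cast<int16_t>( 0x89A0 );   // negative value on two's-complement targets
        // In "val >> 4" the operand is first promoted to int; because val is
        // negative, the promotion sign-extends it to 0xFFFF89A0 before shifting.
        std::printf( "%08X\n", static_cast<unsigned>( static_cast<int>( val ) ) ); // FFFF89A0
        std::printf( "%04X\n", 0xFFFFu & static_cast<unsigned>( val >> 4 ) );      // F89A
        return 0;
    }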

You need to cast the variable (or value) to an unsigned integer type (i.e. uint16_t in your case):

 val = static_cast<uint16_t>(val) >> 4; 

If the type is not known, for example because it is a template argument, you can use std::make_unsigned:

 val = static_cast<typename std::make_unsigned<T>::type>(val) >> 4; 
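
For illustration, here is a minimal self-contained sketch of this technique; the helper name logical_shift_right is my own and not part of the original answer, and the example assumes a two's-complement platform:

    #include <cstdint>
    #include <iostream>
    #include <type_traits>

    // Hypothetical helper: shift right while always inserting zero bits,
    // whether T is signed or unsigned.
    template <typename T>
    T logical_shift_right( T value, unsigned bits )
    {
        using U = typename std::make_unsigned<T>::type;
        return static_cast<T>( static_cast<U>( value ) >> bits );
    }

    int main()
    {
        int16_t val = static_cast<int16_t>( 0x89A0 );
        std::hex( std::cout );
        std::cout << ( 0xFFFF & ( val >> 4 ) ) << std::endl;                  // f89a (arithmetic shift)
        std::cout << ( 0xFFFF & logical_shift_right( val, 4 ) ) << std::endl; // 89a  (zero bits shifted in)
        return 0;
    }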
+4

This is called an arithmetic shift. For signed types, the most significant bit is the sign bit. When you shift a negative value to the right, the upper bits are filled with 1, so the result is still negative. (The result is a division by 2^n, where n is the number of bits shifted, rounded toward negative infinity.)
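
A small sketch of that rounding behavior (my own example; right-shifting a negative value is implementation-defined before C++20, but two's-complement platforms typically behave as shown):

    #include <cstdio>

    int main()
    {
        int a = -5;
        std::printf( "%d\n", a >> 1 ); // typically -3: division by 2 rounded toward negative infinity
        std::printf( "%d\n", a / 2 );  // -2: integer division rounds toward zero
        return 0;
    }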

To avoid this, use an unsigned type. Shifting an unsigned value performs a logical shift, which fills the upper bits with 0.

Change this line:

 result = result >> missingbits; 

to

 result = static_cast<T>(static_cast<uintmax_t>(result) >> missingbits); 

(uintmax_t is the largest unsigned integer type supported by the compiler.)

Or use std::make_unsigned, as Joachim Pileborg wrote in his answer.

+4

As already mentioned, when your type is signed the >> operator performs an arithmetic shift. So, in addition to the solutions suggested above, if you need a logical shift you can always just apply a mask, as shown below:

    int mask = 1 << (targetSize - missingbits - 1);
    mask |= mask - 1;
    result = (result >> missingbits) & mask;

Here mask has its missingbits most significant bits set to 0 and the remaining bits set to 1. In your case the 4 MSBs are 0 and the rest are 1. ANDing with it then clears the top missingbits bits of result, which is what you need:

 0xF89A & 0x0FFF = 0x089A 

See the live example for a working demonstration.
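
For reference, a minimal self-contained sketch of the masking approach with the question's numbers (targetSize = 16, missingbits = 4; variable names follow the question). The mask makes the result independent of whether the shift inserts 0s or 1s, which is the point of this approach:

    #include <cstdint>
    #include <cstdio>

    int main()
    {
        const int targetSize  = 16;
        const int missingbits = 4;

        int16_t result = static_cast<int16_t>( 0x89A0 );

        int mask = 1 << ( targetSize - missingbits - 1 );  // 0x0800
        mask |= mask - 1;                                   // 0x0FFF: top 'missingbits' bits 0, the rest 1

        result = ( result >> missingbits ) & mask;          // 0xF89A & 0x0FFF = 0x089A

        std::printf( "%04X\n", 0xFFFFu & static_cast<unsigned>( result ) ); // 089A
        return 0;
    }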

+1

The source code with the loop looks a bit over-complicated to me; I would write it like this (as a second choice, that is, after failing to somehow avoid using std::bitset and templates entirely in the first place, for something as simple as fixing the bit width):

    #include <bitset>
    #include <climits>
    #include <cstdint>
    #include <iostream>

    template <typename T, size_t count>
    static T convertBitSetToNumber( const std::bitset<count>& bitset )
    {
        constexpr size_t targetSize = sizeof( T ) * CHAR_BIT;
        if (targetSize == count)    // same width: plain conversion
            return static_cast<T>(bitset.to_ullong());
        if (targetSize < count)     // narrower target: drop the extra low bits
            return static_cast<T>(bitset.to_ullong() >> (count - targetSize));
        // Wider target: shift the bits to the top, then shift back down, relying on
        // the arithmetic right shift to sign-extend a signed T.
        return static_cast<T>(bitset.to_ullong() << (targetSize - count)) >> (targetSize - count);
    }

    // Example test producing unsigned/signed values from the 0x089A bitset:
    // 16b: 89a f89a | 8b: 89 89 | 32b: 89a fffff89a
    int main()
    {
        const std::bitset<12> testBitset("100010011010");
        std::hex(std::cout);
        std::cout << convertBitSetToNumber<uint16_t,12>( testBitset ) << std::endl;
        std::cout << convertBitSetToNumber<int16_t,12>( testBitset ) << std::endl;
        std::cout << (0xFF & convertBitSetToNumber<uint8_t,12>( testBitset )) << std::endl;
        std::cout << (0xFF & convertBitSetToNumber<int8_t,12>( testBitset )) << std::endl;
        std::cout << convertBitSetToNumber<uint32_t,12>( testBitset ) << std::endl;
        std::cout << convertBitSetToNumber<int32_t,12>( testBitset ) << std::endl;
    }
+1
