When exactly is the postfix increment operator evaluated in a complex expression?

Say I have an expression like this:

    short v = ( p[ i++ ] & 0xFF ) << 4 | ( p[ i ] & 0xF0000000 ) >> 28;

where p is a pointer to a dynamically allocated array of 32-bit integers.

When exactly is i incremented? I noticed that the above code yields a different value for v than the following code:

    short v = ( p[ i++ ] & 0xFF ) << 4;
    v |= ( p[ i ] & 0xF0000000 ) >> 28;

My best guess for this behavior is that i is not incremented until after the right-hand side of the | above has been evaluated.

Any insight would be appreciated!

Thanks in advance,

\ Bjorn

+6
c++ c operators
5 answers

The problem is the order of evaluation:
The C++ standard does not specify the order in which subexpressions are evaluated. This gives the compiler as much freedom as possible when optimizing.

Let's break it down:

          a1                         a2
    v = ( p[ i++ ] & 0xFF ) << 4 | ( p[ i ] & 0xF0000000 ) >> 28;
    -----
    (1) a1 = p[i]
    (2) i  = i + 1 (i++)       after (1)
    (3) a2 = p[i]
    (4) t3 = a1 & 0xFF         after (1)
    (5) t4 = a2 & 0xF0000000   after (3)
    (6) t5 = t3 << 4           after (4)
    (7) t6 = t4 >> 28          after (5)
    (8) t7 = t5 | t6           after (6) and (7)
    (9) v  = t7                after (8)

The compiler is free to reorder these subexpressions as long as the "after" constraints above are not violated. One quick, easy optimization is to move (3) up one slot and then apply common-subexpression elimination: (1) and (3), now next to each other, are identical, so (3) can be eliminated.

The compiler does not have to perform this optimization, though (and it is probably better at this than I am and has other tricks up its sleeve). But you can see that the value of a1 will always be what you expect, whereas the value of a2 depends on the order in which the compiler decides to evaluate the other subexpressions.
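As a rough illustration of the point above, the two orderings can be written out by hand as separate, well-defined functions. This is only a sketch: the function names, the use of std::uint32_t for the 32-bit elements, and the sample data are assumptions made for the example, not something taken from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Ordering A: the increment (2) takes effect before the right-hand read (3).
short pack_increment_first(const std::uint32_t* p, std::size_t i) {
    const std::uint32_t a1 = p[i];  // (1)
    i = i + 1;                      // (2)
    const std::uint32_t a2 = p[i];  // (3) reads the next element
    return static_cast<short>(((a1 & 0xFF) << 4) | ((a2 & 0xF0000000) >> 28));
}

// Ordering B: the right-hand read (3) is moved up, before the increment (2).
short pack_read_first(const std::uint32_t* p, std::size_t i) {
    const std::uint32_t a1 = p[i];  // (1)
    const std::uint32_t a2 = p[i];  // (3) reads the same element as (1)
    i = i + 1;                      // (2) no longer affects either read
    (void)i;
    return static_cast<short>(((a1 & 0xFF) << 4) | ((a2 & 0xF0000000) >> 28));
}

// Sample data chosen (arbitrarily) so that the two orderings disagree.
void check_orderings() {
    const std::uint32_t data[2] = {0x000000ABu, 0xC0000000u};
    assert(pack_increment_first(data, 0) == 0xABC);  // low byte of data[0], top nibble of data[1]
    assert(pack_read_first(data, 0) == 0xAB0);       // top nibble of data[0] is zero
}
```

Both functions respect every "after" constraint in the list, yet they return different values. The original single expression does not promise either result.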

The only guarantee you have is that the compiler cannot move subexpressions across a sequence point. The most common sequence point is ';' (the end of a statement). There are others, but I would avoid relying on that knowledge, since most people do not know the compiler's workings very well. If you write code that relies on sequence-point tricks, someone may later rearrange it to make it more readable, and your trick has just turned into undefined behavior.

    short v = ( p[ i++ ] & 0xFF ) << 4;
    v |= ( p[ i ] & 0xF0000000 ) >> 28;
    -----
    (1) a1 = p[i]
    (2) i  = i + 1 (i++)       after (1)
    (4) t3 = a1 & 0xFF         after (1)
    (6) t5 = t3 << 4           after (4)
    (A) v  = t5                after (6)
    ------ Sequence Point
    (3) a2 = p[i]
    (5) t4 = a2 & 0xF0000000   after (3)
    (7) t6 = t4 >> 28          after (5)
    (8) t7 = v | t6            after (7)
    (9) v  = t7                after (8)

Everything is well defined here, because the write to i is completed within its own statement and i is not re-read in the same expression.

A simple rule: do not use the ++ or -- operators inside a larger expression. Your code is just as easy to read this way:

    ++i; // prefer pre-increment (it makes no difference here, but is a useful habit)
    v = ( p[ i ] & 0xFF ) << 4 | ( p[ i ] & 0xF0000000 ) >> 28; // both reads see the incremented i
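If the intent is the same as in the two-statement version from the question (low byte from the current element, top nibble from the next one), the increment can sit in its own statement between the two reads. The following is a minimal sketch under that assumption; pack12 and check_pack12 are hypothetical names, and std::uint32_t stands in for the 32-bit element type.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: combines the low 8 bits of p[i] with the top 4 bits
// of the following element, and leaves i pointing at that following element.
short pack12(const std::uint32_t* p, std::size_t& i) {
    const std::uint32_t low = p[i] & 0xFF;                  // read the current element
    ++i;                                                    // increment in its own statement
    const std::uint32_t high = (p[i] & 0xF0000000) >> 28;   // read the next element
    return static_cast<short>((low << 4) | high);
}

// Example use with arbitrary sample data.
void check_pack12() {
    const std::uint32_t data[2] = {0x12345678u, 0x9ABCDEF0u};
    std::size_t i = 0;
    assert(pack12(data, i) == 0x789);  // 0x78 from data[0], 0x9 from data[1]
    assert(i == 1);
}
```

Because every read of i is separated from the write by the end of a statement, the result no longer depends on how the compiler orders subexpressions.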

See this question for a detailed explanation of evaluation order:
What are all the common undefined behaviors that a C++ programmer should know about?

+13

i is incremented at some point before the next sequence point. The only sequence point in the expression you give is at the end of the statement, so “sometime before the end of the statement” is the answer in this case.

That is why you should not both modify an lvalue and read its value without an intervening sequence point: the result is undefined.

The &&, ||, comma and ?: operators introduce sequence points, as do the end of a full expression and a function call (the latter means that if you call f(i++, &i), the body of f() will see the updated value if it uses the pointer to examine i).
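Two of these cases can be sketched in a few lines. The function names below are hypothetical and the starting values are arbitrary; the sketch only illustrates the function-call and && cases mentioned above.

```cpp
#include <cassert>

// Hypothetical callee: gets the old value of i by value and a pointer to i.
// All argument side effects are complete before the body runs, so *current
// already holds the incremented value.
int difference_seen_by_callee(int old_value, int* current) {
    return *current - old_value;
}

int call_with_increment() {
    int i = 5;
    // &i takes the address only; it does not read the value of i.
    const int difference = difference_seen_by_callee(i++, &i);
    assert(i == 6);
    return difference;  // 6 - 5
}

bool and_sequences_operands() {
    int i = 0;
    // The left operand of &&, including its increment, is fully evaluated
    // before the right operand is looked at.
    return (i++ == 0) && (i == 1);
}
```

Here call_with_increment() returns 1 and and_sequences_operands() returns true, because in both cases the increment is complete before i is read again.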

+13

The first example is undefined behavior. You cannot modify a variable and also read it elsewhere in the same expression, other than to compute the value being stored, without an intervening sequence point. See this (among other places on the Internet).

+9

Sometime before the end of the expression.

It is undefined to read an object that is also modified, for any purpose other than determining the new value, just as it is undefined to write to an object twice. You can even get an inconsistent value (i.e. read something that is neither the old nor the new value).

+3

Your expression has undefined behavior; see, for example, this for sequence points in C and C++ statements.

+1
