Can gcc/g++ tell me when it ignores my register?

When compiling C/C++ code with gcc/g++, can the compiler tell me if it ignores my register keyword? For example, in this code

    int main()
    {
        register int j;
        int k;
        for(k = 0; k < 1000; k++)
            for(j = 0; j < 32000; j++)
                ;
        return 0;
    }

j will be kept in a register, but in this code

    int main()
    {
        register int j;
        int k;
        for(k = 0; k < 1000; k++)
            for(j = 0; j < 32000; j++)
                ;
        int * a = &j;
        return 0;
    }

j will be an ordinary variable. Can I tell whether a variable that I declared register is really stored in a CPU register?

+4
5 answers

It is safe to assume that GCC ignores the register keyword, except possibly at -O0. However, it should not make any difference either way, and if you are digging this deep, you should already be reading the assembly code.
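One way to read the assembly is to ask GCC for it with -S. The sketch below is a variant of the question's first program; the file name reg.c, the function name spin, and the return expression are assumptions added so that the loops have an observable result.

```c
/* reg.c - hypothetical file name, used only for this sketch.
 *
 * Emit assembly instead of an object file and compare the two outputs:
 *     gcc -O0 -S reg.c -o reg_O0.s
 *     gcc -O2 -S reg.c -o reg_O2.s
 */
int spin(void)
{
    register int j;
    int k;

    for (k = 0; k < 1000; k++)
        for (j = 0; j < 32000; j++)
            ;                       /* empty body, as in the question */

    return j + k;                   /* final values: j == 32000, k == 1000 */
}
```

Comparing the two .s files shows where j lives at each optimization level. With optimization on, you may well find that neither loop survives and the function just returns a constant.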

Here is an informative thread on this topic: http://gcc.gnu.org/ml/gcc/2010-05/msg00098.html . In the old days, register really did help compilers assign a variable to a register, but today register allocation is done automatically, and done well, without hints. The keyword continues to serve two purposes in C:

  • In C, it prevents you from taking the address of a variable. Since registers have no addresses, this restriction can help a simple C compiler. (Simple C++ compilers do not exist.)
  • A register object cannot be declared restrict. Since restrict is all about addresses, combining the two is pointless. (C++ doesn't have restrict yet, and in any case this rule is rather trivial.)
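The first point can be seen in a minimal sketch. The function sum_to is a made-up example; the offending line is left commented out so that the rest still compiles as C.

```c
/* Sum 1..n using register-qualified locals (hypothetical example). */
int sum_to(int n)
{
    register int total = 0;

    for (register int i = 1; i <= n; i++)
        total += i;

    /* Uncommenting the next line is a constraint violation in C:
     * the address of a register object cannot be taken.
     *
     *     int *p = &total;
     */
    return total;
}
```

With the line uncommented, a C compiler has to diagnose it; GCC reports an error along the lines of "address of register variable requested". This is the one place where the keyword has a visible effect regardless of where the variable ends up being stored.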

For C++, the keyword has been deprecated since C++11 and is proposed for removal from the revision of the standard planned for 2017.

Some compilers have used register on parameter declarations to determine the calling convention of a function, with the ABI allowing a mix of stack-based and register-based parameters. This seems to be nonconforming, it tends to occur with extended syntax like register("A1"), and I don't know whether any such compiler is still in use.

+7

With modern compilation and optimization techniques, the register annotation makes no sense. In your second program you take the address of j, and registers have no addresses. But the same local or static variable may well be stored in two different memory locations during its lifetime, or sometimes in memory and sometimes in a register, or may not exist at all. Indeed, an optimizing compiler will compile your nested loops to nothing, because they have no effects, and simply assign k and j their final values. It will then omit those assignments too, because the remaining code does not use the values.

+7

You cannot take the address of a register variable in C, and the compiler is free to ignore you completely; C99 section 6.7.1 (pdf):

The implementation may treat any register declaration simply as an auto declaration. However, whether or not addressable storage is actually used, the address of any part of an object declared with storage-class specifier register cannot be computed, either explicitly (by use of the unary & operator as discussed in 6.5.3.2) or implicitly (by converting an array name to a pointer as discussed in 6.3.2.1). Thus, the only operator that can be applied to an array declared with storage-class specifier register is sizeof.

Unless you are mucking around on 8-bit AVRs or PICs, the compiler will probably laugh at you for thinking you know best and ignore your pleas. Even on those, I have thought I knew better a couple of times and found ways to trick the compiler (with some inline asm), but my code blew up in size because the compiler had to shuffle a bunch of other data around to accommodate my stubbornness.

+3

This question, as well as some of the answers and several other discussions of the "register" keyword that I have seen, seem to assume implicitly that every local is mapped either to a specific register or to a specific memory location on the stack. This was generally true until 15–25 years ago, and it is still true if you turn off optimization, but it is not true at all when standard optimization is performed. Optimizers now see locals as symbolic names that you use to describe the flow of data, not as values that need to be stored in specific locations.

Note: by "locals" here I mean scalar variables of storage class auto (or "register") that are never used as the operand of "&". Compilers can sometimes break up auto structs, unions, or arrays into individual "local" variables, too.

To illustrate this: suppose I write this at the top of a function:

 int factor = 8; 

.. and then the only use of the factor variable is to multiply by various things:

 arr[i + factor*j] = arr[i - factor*k]; 

In this case - try it, if you want - there will be no factor variable. Analysis of the code will show that factor is always 8, and so all the multiplications will turn into <<3. If you did the same thing in 1985 C, factor would get a location on the stack and there would be multiplies, since compilers basically worked one statement at a time and did not remember anything about the values of variables. Back then, programmers were more likely to use #define factor 8 to get better code in this situation while keeping factor adjustable.
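As a small sketch of the point about factor, the two functions below (both names are made up for illustration) should look the same to an optimizer, since constant propagation replaces factor with 8 before any code is generated.

```c
/* Written with a named local, as in the text. */
int index_with_factor(int i, int j)
{
    int factor = 8;
    return i + factor * j;
}

/* What the optimizer effectively sees once it knows factor is 8.
 * It is free to implement the multiply by 8 as a shift by 3. */
int index_with_constant(int i, int j)
{
    return i + j * 8;
}
```

Compiling both with optimization enabled and comparing the assembly is an easy way to check whether your compiler really treats them identically; the source itself uses j * 8 rather than j << 3 so that negative j stays well defined.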

If you use -O0 (optimization off), you do indeed get a variable for factor. This allows you, for example, to step over the factor=8 statement, then change factor to 11 in the debugger and continue. For this to work, the compiler cannot keep anything in registers between statements, except for variables that are assigned to specific registers; and in that case the debugger is told about it. Nor can it try to "know" anything about the values of variables, since the debugger could change them. In other words, you need the 1985 situation if you want to change local variables while debugging.

Modern compilers usually compile a function as follows:

(1) When a local variable is assigned more than once in a function, the compiler creates different "versions" of the variable, so that each one is assigned in only one place. Every "read" of the variable refers to a specific version.

(2) Each of these locals is assigned to a "virtual" register. Intermediate calculation results are also assigned to variables/registers; so

  a = b*c + 2*k; 

becomes something like

  t1 = b*c; t2 = 2; t3 = k*t2; a = t1 + t3; 

(3) The compiler then takes all these operations and looks for common subexpressions, etc. Since each of the new registers is only ever written once, it is much easier to rearrange them while maintaining correctness. I won't even start on loop analysis.

(4) The compiler then tries to map all of these virtual registers onto actual registers in order to generate code. Since each virtual register has a limited lifetime, actual registers can be reused heavily - "t1" in the example above is only needed until the addition that produces "a", so it could be held in the same register as "a". When there are not enough registers, some virtual registers can be allocated to memory, or a value can be held in one register, stored to memory for a while, and loaded back into a (possibly) different register later. On a load-store machine, where only values in registers can be used in computations, this second strategy fits well.

From the above, this should be clear: it is easy to determine that the virtual register mapped to factor is the same as the constant 8, so all multiplications by factor are multiplications by 8. Even if factor is modified later, that is a "new" variable, and it does not affect earlier uses of factor.
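The "versions" idea from step (1) can be imitated by hand. The sketch below uses made-up names and only illustrates the renaming; it is not a picture of any particular compiler's internals.

```c
/* As written: x is assigned three times. */
int as_written(int a, int b)
{
    int x = a + b;
    x = x * 2;
    x = x - a;
    return x;
}

/* Hand-renamed so that every name is assigned exactly once.
 * Each read now refers to one specific version of x. */
int single_assignment(int a, int b)
{
    const int x1 = a + b;
    const int x2 = x1 * 2;
    const int x3 = x2 - a;
    return x3;
}
```

Both functions compute the same value. In the second form it is obvious that x1 is dead after x2 is computed, so whatever register held x1 can be reused, which is the kind of lifetime reasoning step (4) relies on.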

Another implication: if you write

  vara = varb; 

.. there may or may not be a corresponding copy in the generated code. For instance,

    int *resultp= ...
    int acc = arr[0] + arr[1];
    int acc0 = acc;    // save this for later
    int more = func(resultp,3)+ func(resultp,-3);
    acc += more;       // add some more stuff
    if( ...){
        resultp = getptr();
        resultp[0] = acc0;
        resultp[1] = acc;
    }

In the above example, the two "versions" of acc (the initial one and the one after adding "more") can be in two different registers, and "acc0" will be the same thing as the initial "acc". So "acc0 = acc" does not require a register copy. Another point: "resultp" is assigned twice, and since the second assignment ignores the previous value, there are essentially two distinct "resultp" variables in the code, and this is easily determined by analysis.

The implication of all this: do not hesitate to break complex expressions into smaller ones, using additional locals for the intermediates, if that makes the code easier to follow. There is basically zero run-time penalty for doing so, since the optimizer sees the same thing anyway.

If you are interested in learning more, you can start here: http://en.wikipedia.org/wiki/Static_single_assignment_form

The purpose of this answer is (a) to give some idea of how modern compilers work, and (b) to point out that asking the compiler, if it would be so kind, to put a particular local variable in a register does not really make sense. Each "variable" may be seen by the optimizer as several variables, some of which may be heavily used in loops and others not. Some variables will vanish - for example, by being constant, or sometimes a temporary used in a swap, or a calculation whose result is never actually used. The compiler is equipped to use the same register for different things in different parts of the code, according to what is actually best on the machine you are compiling for.

The notion of hinting to the compiler which variables should be in registers assumes that each local variable maps to a register or to a memory location. This was true when Kernighan and Ritchie designed the C language, but it is no longer true.

Regarding the restriction that you cannot take the address of a register variable: obviously, there is no way to implement taking the address of a variable held in a register, but you might ask - since the compiler is free to ignore "register" - why is this rule in place? Why can't the compiler simply ignore "register" if I happen to take the address (as is the case in C++)?

Again, you have to go back to the old compilers. The original K+R compiler would parse a local variable declaration and then immediately decide whether or not to assign it to a register (and if so, which register). It would then go on to compile expressions, emitting assembler for each statement, one at a time. If it later turned out that you were taking the address of a "register" variable that had been assigned to a register, there was no way to handle that, since by then the assignment was, generally speaking, irreversible. It was possible, however, to generate an error message and stop compiling.

The bottom line is that "register" is essentially obsolete:

  • C++ compilers ignore it completely.
  • C compilers ignore it, except to enforce the restriction on & - and possibly do not ignore it at -O0, where it could actually result in allocation as requested. At -O0 you are not concerned about code speed, though.

So it is basically there for backward compatibility, and probably also on the grounds that some implementations might still use it as a hint. I never use it - and I write real-time DSP code, and spend a fair amount of time looking at generated code and finding ways to make it faster. There are many ways to change code to make it run faster, and knowing how compilers work is very helpful. It has been a long time since I last found that adding "register" was among those ways.


Addendum

Above, I excluded from my special definition of "locals" those variables to which & is applied (they are, of course, included in the usual sense of the term).

Consider the code below:

    void somefunc()
    {
        int h,w;
        int i,j;
        extern int pitch;

        get_hw( &h,&w );   // get shape of array

        for( int i = 0; i < h; i++ ){
            for( int j = 0; j < w; j++ ){
                Arr[i*pitch + j] = generate_func(i,j);
            }
        }
    }

This may look perfectly harmless. But if you are concerned about execution speed, consider this: the compiler passes the addresses of h and w to get_hw, and then later calls generate_func. Suppose the compiler knows nothing about what is in those functions (which is the usual case). The compiler must then assume that the call to generate_func could change h or w. That would be a perfectly legal use of the pointers passed to get_hw: the callee could store them somewhere and use them later to read or write those variables, as long as the scope containing h,w is still in play.

Thus, the compiler must keep h and w in memory on the stack and cannot determine in advance how many times the loop will run. Certain optimizations therefore become impossible, and the loop may be less efficient as a result (in this example there is a function call in the inner loop anyway, so it may not matter much, but consider the case where the function is only called occasionally in the inner loop, depending on some condition).

Another problem is that generate_func could change pitch, so i*pitch has to be recomputed every time, and not just when i changes.

It can be recoded as:

    void somefunc()
    {
        int h0,w0;
        int h,w;
        int i,j;
        extern int pitch;
        int apit = pitch;

        get_hw( &h0,&w0 );   // get shape of array
        h= h0;
        w= w0;

        for( int i = 0; i < h; i++ ){
            for( int j = 0; j < w; j++ ){
                Arr[i*apit + j] = generate_func(i,j);
            }
        }
    }

Now the variables apit,h,w are "safe" locals in the sense I defined above, and the compiler can be sure that they will not be changed by any function call. Assuming generate_func does not in fact modify anything, the code will have the same effect as before, but it may be more efficient.
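To see why the compiler cannot make this change on its own, here is a deliberately nasty sketch. The bodies of get_hw and generate_func are hypothetical stand-ins, written only to show that a callee may legally keep the pointer and write through it later.

```c
/* Hypothetical stand-ins for get_hw and generate_func from the text. */
static int *saved_h;                 /* pointer that get_hw keeps around */

static void get_hw(int *h, int *w)
{
    *h = 4;
    *w = 3;
    saved_h = h;                     /* legal: the caller's variable outlives this call */
}

static int generate_func(int i, int j)
{
    if (i == 1 && j == 0)
        *saved_h = 2;                /* writes to the caller's variable mid-loop */
    return i * 10 + j;
}

/* h has escaped, so the loop bound can change underneath the loop. */
int calls_with_escaped_h(void)
{
    int h, w, calls = 0;

    get_hw(&h, &w);
    for (int i = 0; i < h; i++)
        for (int j = 0; j < w; j++) {
            (void)generate_func(i, j);
            calls++;
        }
    return calls;
}

/* h and w are private copies; only h0 and w0 have escaped. */
int calls_with_copied_h(void)
{
    int h0, w0, calls = 0;

    get_hw(&h0, &w0);
    int h = h0, w = w0;
    for (int i = 0; i < h; i++)
        for (int j = 0; j < w; j++) {
            (void)generate_func(i, j);
            calls++;
        }
    return calls;
}
```

With these stand-ins the first function makes 6 calls and the second makes 12, so the two versions are not equivalent in general. That is why the programmer, who knows the callee is well behaved, has to make the copies by hand.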

Jens Gustedt suggested using the "register" keyword as a way of tagging key variables to prohibit the use of & on them, e.g. by others maintaining the code (it will not affect the generated code, since the compiler can determine the absence of & without it). For my part, I always think carefully before applying & to any local scalar in a time-critical area of the code, and in my view using "register" to enforce this is a little cryptic, but I can see the point (unfortunately, it does not work in C++, since the compiler will just ignore the "register").

By the way, in terms of code efficiency, the best way to have a function return two values is with a struct:

    struct hw {   // this is what get_hw returns
        int h,w;
    };

    void somefunc()
    {
        int h,w;
        int i,j;
        struct hw hwval = get_hw();   // get shape of array
        h = hwval.h;
        w = hwval.w;
        ...

This may seem clumsy (and it is clumsy to write), but it will generate cleaner code than the previous examples. The "struct hw" will actually be returned in two registers (on most modern ABIs, anyway). And because of the way the hwval struct is used, the optimizer will effectively break it up into two "locals", hwval.h and hwval.w, and then determine that these are equivalent to h and w - so hwval will essentially disappear in the code. No pointers need to be passed, and no function modifies another function's variables through a pointer; it is just like having two distinct scalar return values. This is much easier to do now in C++11 - with std::tie and std::tuple you can use this method with less verbosity (and without having to write a struct definition).
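A complete version of the struct-return pattern might look like the sketch below. The body of get_hw and the 4 x 3 shape are made up, and count_cells is a hypothetical caller standing in for somefunc.

```c
struct hw {                      /* this is what get_hw returns */
    int h, w;
};

/* Hypothetical implementation; the shape 4 x 3 is invented for the example. */
static struct hw get_hw(void)
{
    struct hw shape = { 4, 3 };
    return shape;
}

/* No address of h or w is ever taken, so both stay "safe" locals. */
int count_cells(void)
{
    struct hw hwval = get_hw();
    int h = hwval.h;
    int w = hwval.w;
    int cells = 0;

    for (int i = 0; i < h; i++)
        for (int j = 0; j < w; j++)
            cells++;
    return cells;
}
```

Since nothing here ever has its address taken, the loop bounds are known to be fixed for the duration of the loop, and the optimizer is free to keep them wherever it likes.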

+1

Your second example is invalid in C. So you can see that the register keyword does change something (in C). It exists for exactly this purpose: to prohibit taking the address of a variable. So just do not take its name "register" literally; it is a misnomer. Stick to its definition instead.

As for C++ seeming to ignore register, well, they must have their reasons for that, but I find it rather sad to come across yet another of these subtle differences where code that is valid for one language is invalid for the other.

0
