This question, as well as some of the answers and several other discussions of the register keyword that I have seen, seem to implicitly assume that every local variable is mapped either to a specific register or to a specific memory location on the stack. This was generally true until 15-25 years ago, and it is true if you turn optimization off, but it is not at all true when standard optimization is performed. Optimizers now see locals as symbolic names that you use to describe the flow of data, not as values that have to be stored in specific places.
Note: here by "locals" I mean scalar variables of storage class auto (or "register") that are never used as an operand of "&". Compilers can sometimes also break auto structs, unions, or arrays up into separate "local" variables.
To illustrate this: suppose I write this at the top of a function:
int factor = 8;
.. and then the only use of factor is as a multiplier in a few places:
arr[i + factor*j] = arr[i - factor*k];
In this case - try it if you like - there will be no factor variable at all. Analysis of the code sees that factor is always 8, and so all the multiplies turn into <<3. If you had done the same thing in 1985 C, factor would have been given a slot on the stack, and there would have been actual multiplies, since compilers basically worked on one statement at a time and remembered nothing about the values of variables. Back then, programmers were more likely to write #define factor 8 to get better code in this situation, while keeping factor adjustable.
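To make that concrete, here is a minimal, self-contained version of the example; the function and parameter names are made up for illustration. With optimization enabled, you can expect no storage to be generated for factor, and the multiplies to be reduced to shifts (or folded into the addressing arithmetic, depending on the target):

void copy_scaled( int arr[], int i, int j, int k )
{
    int factor = 8;                            /* never materialized when optimizing      */
    arr[i + factor*j] = arr[i - factor*k];     /* equivalent to arr[i + (j<<3)] = arr[i - (k<<3)] */
}

Looking at the generated assembly is the easiest way to confirm this for your own compiler and target.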
If you use -O0 (optimization off), you really do get a variable for factor. This lets you, for example, step over the factor=8 statement, then change factor to 11 in the debugger and keep going. For that to work, the compiler cannot keep anything in registers between statements, except for variables that are assigned to specific registers, in which case the debugger is told about them. And it cannot "know" anything about the values of the variables, since the debugger may change them. In other words, you need the 1985 situation if you want to alter local variables while debugging.
Modern compilers usually compile a function as follows:
(1) When a local variable is assigned more than once in a function, the compiler creates different "versions" of the variable, so that each version is assigned in only one place. Every "read" of the variable refers to a specific version (see the sketch after this list).
(2) Each of these locals is assigned a "virtual" register. Intermediate calculation results are also assigned variables/registers; so
a = b*c + 2*k;
becomes something like
t1 = b*c; t2 = 2; t3 = k*t2; a = t1 + t3;
(3) The compiler then takes all of these operations and looks for common subexpressions, etc. Since each of the new registers is only ever written once, it is much easier to rearrange them while maintaining correctness. I won't even start on loop analysis.
(4) The compiler then tries to map all of these virtual registers onto actual registers in order to generate code. Since each virtual register has a limited lifetime, the actual registers can be reused heavily - 't1' above is only needed until the add that produces "a", so it can be held in the same register as "a". When there are not enough registers, some virtual registers can be allocated to memory - or - a value can be held in one register, stored to memory for a while, and loaded back into a (possibly) different register later. On a load-store machine, where only values in registers can be used in computations, this second strategy works out naturally.
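As a rough sketch of steps (1) and (2) - not the actual intermediate form of any particular compiler, and with illustrative names - a local that is assigned twice gets split up roughly like this:

/* source */
int x = a + b;
x = x * c;
result = x + d;

/* approximately what the optimizer works with: each version is written exactly once */
x_1 = a + b;            /* first "version" of x                       */
x_2 = x_1 * c;          /* second "version"                           */
result_1 = x_2 + d;     /* every read refers to one specific version  */

Each version then gets its own virtual register with its own (usually short) lifetime, which is what makes the heavy register reuse in step (4) possible.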
Given all of the above, this should be clear: it is easy to determine that the virtual register mapped to factor is the same thing as the constant 8, and so every multiplication by factor is a multiplication by 8. Even if factor is changed later, that creates a "new" variable, and it does not affect the earlier uses of factor.
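A sketch of that last point, with stride as a made-up variable:

int factor = 8;
arr[i + factor*j] = arr[i - factor*k];    /* uses the constant 8, i.e. shifts          */
factor = stride;                          /* from here on this is, in effect, a new    */
                                          /* variable; the line above is unaffected    */
arr[i + factor*j] = arr[i - factor*k];    /* this one really does multiply by stride   */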
Another implication: if you write
vara = varb;
.. it may or may not be the case that the code contains a corresponding copy. For instance,
int *resultp = ...;
int acc = arr[0] + arr[1];
int acc0 = acc;               /* save the initial value                      */
acc += more;                  /* acc gets a second "version" here            */
*resultp = acc;
resultp = ...;                /* reassigned; the old value is never used again */
*resultp = acc0;
In the example above, the two "versions" of acc (the initial one, and the one after adding more) can live in two different registers, and acc0 is then simply the same as the initial acc. So acc0 = acc does not require any register copy. Another point: resultp is assigned twice, and since the second assignment ignores the previous value, there are essentially two distinct resultp variables in the code, and that is easily determined by analysis.
The implication of all this: feel free to break complex expressions up into smaller ones, using extra locals for the intermediate results, if that makes the code easier to follow. There is essentially zero run-time penalty for doing so, since the optimizer sees the same thing either way.
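For example, these two fragments (the names are made up for illustration) can be expected to produce identical code under optimization:

/* one big expression */
out[n] = (a[n]*c0 + a[n-1]*c1 + a[n-2]*c2) >> shift;

/* the same thing with named intermediates; no extra cost with the optimizer on */
int t0  = a[n]   * c0;
int t1  = a[n-1] * c1;
int t2  = a[n-2] * c2;
int sum = t0 + t1 + t2;
out[n]  = sum >> shift;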
If you are interested in learning more, you can start here: http://en.wikipedia.org/wiki/Static_single_assignment_form
The purpose of this answer is (a) to give some idea of how modern compilers work, and (b) to point out that asking the compiler, if it would be so kind, to put a certain local variable into a register really does not make sense. Each "variable" may be seen by the optimizer as several variables, some of which may be heavily used in loops and others not. Some variables will vanish altogether: by being constant, for example; or by being a temporary used only in a swap; or by feeding a calculation whose result is never used. The compiler is equipped to use the same register for different things in different parts of the code, according to what is actually best on the machine you are compiling for.
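The swap temporary is a good illustration of a "variable" that vanishes. In a fragment like the following (made up for illustration), tmp typically never exists in the generated code, because the optimizer just tracks which register holds which value:

int tmp = x;      /* usually no storage and no move instruction for tmp;  */
x = y;            /* the compiler simply renames which register holds     */
y = tmp;          /* the old value of x                                   */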
The notion of hinting the compiler about which variables should be in registers assumes that each local variable is mapped to a register or to a memory location. That was true back when Kernighan and Ritchie designed the C language, but it is no longer true.
Regarding the restriction that you cannot take the address of a register variable: clearly there is no way to implement taking the address of a variable held in a register, but you might ask - since the compiler has the discretion to ignore "register" - why is this rule in place? Why can't the compiler simply ignore "register" in that case, if I do take the address (as happens in C++)?
Again, you have to go back to the old compilers. The original K+R compiler would parse a local variable declaration and then immediately decide whether to assign it to a register (and if so, which one). It would then proceed to compile the expressions, emitting assembler for each statement one at a time. If it later found that you were taking the address of a "register" variable that had been assigned to a register, there was no way to handle that, since by then the assignment was, generally speaking, irreversible. It was possible, however, to generate an error message and stop compiling.
Bottom line, it looks like "register" is essentially obsolete:
- C++ compilers ignore it completely.
- C compilers ignore it, except to enforce the restriction about taking the address with &
- and possibly don't ignore it at -O0, where it could actually result in allocation as requested. At -O0, though, you are not concerned about code speed.
So it is basically there for backward compatibility, and probably on the basis that some implementations may still use it for "hints". I never use it, and I write real-time DSP code and spend a fair amount of time looking at generated code and finding ways to make it faster. There are many ways to modify code to make it run faster, and knowing how compilers work is very helpful. It has been a long time since I last found that adding "register" was among those ways.
Addendum
Above, I excluded from my special definition of "locals" any variable to which & is applied (they are, of course, included in the usual meaning of the term).
Consider the code below:
void somefunc()
{
    int h, w;
    int i, j;
    extern int pitch;

    get_hw( &h, &w );                /* get shape of array                          */

    for( i = 0; i < h; i++ )         /* arr, get_hw, generate_func defined elsewhere */
        for( j = 0; j < w; j++ )
            arr[i*pitch + j] = generate_func( i, j );
}
It may look completely harmless. But if you are concerned about execution speed, consider this: the compiler passes the addresses of h and w to get_hw, and later calls generate_func. Suppose the compiler knows nothing about what is inside those functions (which is the usual case). The compiler must then assume that the call to generate_func may change h or w. That is a perfectly legal use of the pointer passed to get_hw: it could be stored somewhere and used later, as long as the scope containing h,w is still live, to read or write those variables.
Thus the compiler must keep h and w in memory on the stack, and it cannot determine anything in advance about how long the loop will run. So certain loop optimizations are out of reach, and the loop may be less efficient as a result (in this example there is a function call in the inner loop anyway, so it may not matter much, but consider the case where a function is only occasionally called in the inner loop, depending on some condition).
Another problem is that generate_func could change pitch, so i*pitch has to be recomputed every time through, rather than only when i changes.
It can be recoded as:
void somefunc()
{
    int h0, w0;
    int h, w;
    int i, j;
    extern int pitch;
    int apit = pitch;

    get_hw( &h0, &w0 );              /* get shape of array */
    h = h0;
    w = w0;

    for( i = 0; i < h; i++ )
        for( j = 0; j < w; j++ )
            arr[i*apit + j] = generate_func( i, j );
}
Now apit, h, and w are "safe" locals in the sense I defined above, and the compiler can be sure that no function call will change them. Assuming I am not modifying h, w, or pitch inside generate_func, the code has the same effect as before, but it may be more efficient.
Jens Gustedt has suggested using the "register" keyword as a way of tagging key variables so as to prohibit the use of & on them, e.g. by others maintaining the code (it will not affect the generated code, since the compiler can determine the absence of & without it). For my part, I always think carefully before applying & to any local scalar in a time-critical area of the code, and in my opinion using "register" to enforce this is a little cryptic, but I can see the point (unfortunately it does not work in C++, because the compiler simply ignores the "register" keyword).
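A small illustration of that tagging use, with made-up names; the point is only that a conforming C compiler must diagnose the &, while a C++ compiler, which ignores "register", will not:

register int count = 0;     /* marks count as "no & allowed"                    */
/* ... */
int *p = &count;            /* constraint violation in C: taking the address of */
                            /* a register variable; the compiler must complain, */
                            /* so nobody can quietly introduce aliasing of count */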
By the way, in terms of code efficiency, the best way to have a function return two values is with a struct:
struct hw {          // this is what get_hw returns
    int h, w;
};

void somefunc()
{
    int h, w;
    int i, j;
    struct hw hwval = get_hw();   // get shape of array
    h = hwval.h;
    w = hwval.w;
    ...
This may look cumbersome (and is cumbersome to write), but it will generate cleaner code than the previous examples. The struct hw will actually be returned in two registers (on most modern ABIs, anyway). And because of the way hwval is used, the optimizer effectively breaks it up into two "locals", hwval.h and hwval.w, and then determines that these are equivalent to h and w, so hwval essentially disappears from the code. No pointers have to be passed, and no function modifies another function's variables through a pointer; it is exactly like having two distinct scalar return values. This is much easier to do in C++11: with std::tie and std::tuple, you can use this approach with less verbosity (and without having to write a struct definition).