I still find this question a bit confusing, but let me see if I can rephrase the question in a form that I can answer. First, let me re-formulate the background of the question:
In C# 2.0, this code:
int x = 123; int y; if (x * 0 == 0) y = 345; Console.WriteLine(y);
was treated as though you had written
int x = 123; int y; if (true) y = 345; Console.WriteLine(y);
which, in turn, was treated as:
int x = 123; int y; y = 345; Console.WriteLine(y);
This is a legal program.
But in C# 3.0, we took a breaking change to prevent this. The compiler no longer treats the condition as "always true", despite the fact that you and I know that it always is. We now make this an illegal program, because the compiler reasons that it does not know that the body of the "if" is always executed, and therefore does not know that the local variable y is always assigned before it is used.
Why is the C# 3.0 behavior correct?
This is correct because the specification states that:
- A constant expression must contain only constants. x * 0 == 0 is not a constant expression because it contains a non-constant term, x.
- The consequence of an if is known to be always reachable only if the condition is a constant expression equal to true.
Therefore, this code should not classify the consequence of the conditional statement as always reachable, and therefore should not classify the local y as definitely assigned.
Why is it desirable that a constant expression contains only constants?
We want the C# language to be clearly understandable by its users and correctly implementable by compiler writers. Requiring the compiler to make every possible logical deduction about the values of expressions works against those goals. It should be easy to determine whether a given expression is a constant and, if so, what its value is. Put simply, the constant evaluation code needs to know how to do arithmetic, but it does not need to know facts about arithmetic manipulations. The constant evaluator needs to know how to multiply 2 * 1, but it does not need to know that "1 is the multiplicative identity on integers."
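To make the distinction concrete, here is a minimal sketch. The class and variable names are made up for illustration; only the behavior of constant expressions in an "if" condition comes from the discussion above.

```csharp
using System;

class ConstantConditionDemo   // hypothetical name, for illustration only
{
    static void Main()
    {
        const int one = 1;
        int y;

        // 2 * one == 2 contains only constants, so it is a constant expression
        // equal to true. The compiler only has to do the arithmetic to see that
        // the assignment is always reached, so y is definitely assigned.
        if (2 * one == 2)
            y = 345;

        Console.WriteLine(y);

        // By contrast, the following is expected to be rejected in C# 3.0 and
        // later: x is not a constant, so the condition is not a constant
        // expression and z is not definitely assigned.
        //
        //   int x = 123; int z;
        //   if (x * 0 == 0) z = 345;
        //   Console.WriteLine(z);
    }
}
```

The first condition needs nothing more than multiplication and comparison of known values; the commented-out one would need the compiler to know a fact about multiplication by zero.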
Now, it is possible that a compiler author decides that there are areas in which they can be clever, and thereby generate more optimal code. Compiler authors are permitted to do so, but not in a way that changes whether code is legal or illegal. They are only allowed to make optimizations that improve the compiler's output when it is given legal code.
How did the bug happen in C# 2.0?
What happened was that the compiler was written to run the arithmetic optimizer too early. The optimizer is the part that is supposed to be clever, and it should have run after the program was determined to be legal. Instead it ran before the program was determined to be legal, and therefore influenced the result.
This was a potential breaking change: although it brought the compiler into line with the specification, it also potentially turned working code into erroneous code. What motivated the change?
LINQ features, and specifically expression trees. If you wrote something like:
(int x)=>x * 0 == 0
and converted it to an expression tree, you would probably not expect the compiler to generate the expression tree for
(int x)=>true
You probably expected it to produce the expression tree for "multiply x by zero and compare the result to zero." Expression trees should preserve the logical structure of the expression in the body.
When I wrote the expression tree code, it was not yet clear whether the design committee would decide that
()=>2 + 3
should generate the expression tree for "add two to three" or the expression tree for "five". We decided on the latter: constants are folded before expression trees are generated, but arithmetic should not be run through the optimizer before expression trees are generated.
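A small sketch of how one might observe both decisions with the System.Linq.Expressions API. The class and variable names are invented for this example, and it assumes a C# 3.0 or later compiler behaving as described above.

```csharp
using System;
using System.Linq.Expressions;

class ExpressionTreeDemo   // hypothetical name, for illustration only
{
    static void Main()
    {
        // Not optimized: the tree should keep the "multiply, then compare" shape.
        Expression<Func<int, bool>> compare = (int x) => x * 0 == 0;
        var comparison = (BinaryExpression)compare.Body;
        Console.WriteLine(comparison.NodeType);       // node kind of the whole body
        Console.WriteLine(comparison.Left.NodeType);  // node kind of the left operand

        // Constant-folded: the body should be a single constant node.
        Expression<Func<int>> sum = () => 2 + 3;
        Console.WriteLine(sum.Body.NodeType);
        Console.WriteLine(((ConstantExpression)sum.Body).Value);
    }
}
```

If the compiler behaves as described, the first lambda's body is an Equal node whose left operand is a Multiply node, while the second lambda's body is a Constant node holding 5.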
So, let's now consider the dependencies that we just stated:
- Arithmetic optimization has to happen before codegen.
- Expression tree rewriting has to happen before arithmetic optimization.
- Constant folding has to happen before expression tree rewriting.
- Constant folding has to happen before flow analysis.
- Flow analysis has to happen before expression tree rewriting (because we need to know whether an expression tree uses an uninitialized local).
We have to find an order in which to do all this work that honors all of those dependencies. The C# 2.0 compiler did them in this order:
- constant folding and arithmetic optimization at the same time
- flow analysis
- codegen
Where can expression tree rewriting go in there? Nowhere! And this is clearly buggy, because flow analysis now takes into account facts deduced by the arithmetic optimizer. We decided to rework the compiler so that it did everything in this order:
- constant folding
- flow analysis
- expression tree rewriting
- arithmetic optimization
- codegen
This obviously necessitates the breaking change.
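As a sanity check, the reworked order can be derived mechanically from the five dependencies listed earlier. The following is only an illustrative sketch of a topological sort over the pass names; it is not how the compiler itself is organized, and the class name and pass-name strings are made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class PassOrderDemo   // hypothetical name, for illustration only
{
    static void Main()
    {
        // Each pair says: Before has to happen before After.
        var dependencies = new (string Before, string After)[]
        {
            ("arithmetic optimization", "codegen"),
            ("expression tree rewriting", "arithmetic optimization"),
            ("constant folding", "expression tree rewriting"),
            ("constant folding", "flow analysis"),
            ("flow analysis", "expression tree rewriting"),
        };

        var passes = dependencies
            .SelectMany(d => new[] { d.Before, d.After })
            .Distinct()
            .ToList();
        var order = new List<string>();

        // Simple topological sort: repeatedly pick a pass that has not run yet
        // and whose prerequisites have all run. First() throws if no such pass
        // exists, which would indicate a cycle in the dependencies.
        while (order.Count < passes.Count)
        {
            string next = passes.First(p =>
                !order.Contains(p) &&
                dependencies.Where(d => d.After == p)
                            .All(d => order.Contains(d.Before)));
            order.Add(next);
        }

        Console.WriteLine(string.Join(" -> ", order));
    }
}
```

With these five dependencies only one order is possible, and it is the reworked one: constant folding, flow analysis, expression tree rewriting, arithmetic optimization, codegen.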
Now, I did consider preserving the existing broken behavior by doing this:
- constant folding
- arithmetic optimization
- flow analysis
- arithmetic de-optimization
- expression tree rewriting
- arithmetic optimization again
- codegen
where the optimized arithmetic expression would contain a pointer back to its unoptimized form. We decided that this was too much complexity to take on in order to preserve a bug. We decided that it would be better to fix the bug instead, take the breaking change, and make the compiler architecture easier to understand.