How does the compiler automatically calculate covariance and contravariance?

Note that this is a question about compiler internals.

I just read [1] that when introducing variance for generic types, the C# team considered whether the compiler should automatically infer whether a type parameter is covariant or contravariant. Of course, that decision is now history, but I still wonder: how could it be done?

Would it take all the members (excluding constructors) and check whether each type parameter appears only in input positions or only in output positions?

[1] Jeffrey Richter, CLR via C#, 4th Edition, p. 281.

+5
2 answers

The link in the now-deleted answer was to my article explaining the exact rules for determining variance validity; that does not answer your question. The link you are actually looking for is my article on why the C# compiler team rejected the idea of computing variance without any syntax, which is here:

http://blogs.msdn.com/b/ericlippert/archive/2007/10/29/covariance-and-contravariance-in-c-part-seven-why-do-we-need-a-syntax-at-all.aspx

In short, the reasons for rejecting such a feature were:

  • The feature requires whole-program analysis. Not only is that expensive, it means that a small change to one type can cause surprising changes to the inferred variance of many distant types.
  • Variance is something you design into a type; the annotation is a statement about how you expect the type to be used by its consumers, not just today but forever. That expectation ought to be encoded in the program text.
  • There are many cases where the user's intent is genuinely hard to compute, and then what do you do? You have to resolve the ambiguity by requiring syntax anyway, so why not simply require it all the time? For example:

 interface I<V, W> { I<V, W> M(I<W, V> x); } 

As an exercise, work out all the possible legal variance annotations on V and W. Now, how should the compiler do that same calculation? What algorithm did you use? And second, given that the answer is ambiguous, how would the compiler resolve the ambiguity?
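For this one toy interface the search can actually be written down. The following Python sketch (an illustration of the idea only, not compiler code) encodes a simplified reading of the C# "output-safe"/"input-safe" validity rules for exactly this shape and enumerates all nine annotation choices:

```python
from itertools import product

# Variance markers: '+' = out (covariant), '-' = in (contravariant),
# 'o' = invariant.  For the interface
#     interface I<V, W> { I<V, W> M(I<W, V> x); }
# an annotation (v, w) is legal iff the return type I<V, W> is
# output-safe and the parameter type I<W, V> is input-safe.

def arg_ok(param_variance, arg_variance, required):
    """Check one type argument of a use of I<...>.

    required is '+' (the enclosing type must be output-safe) or '-'
    (input-safe).  An argument in a contravariant position has its
    requirement flipped; an argument in an invariant position must
    satisfy both requirements, i.e. be invariant itself.
    """
    if param_variance == 'o':
        return arg_variance == 'o'
    effective = required if param_variance == '+' else ('-' if required == '+' else '+')
    # A type parameter is output-safe unless declared 'in',
    # and input-safe unless declared 'out'.
    return arg_variance != ('-' if effective == '+' else '+')

def legal(v, w):
    ret_ok = arg_ok(v, v, '+') and arg_ok(w, w, '+')  # return type I<V, W>
    par_ok = arg_ok(v, w, '-') and arg_ok(w, v, '-')  # parameter type I<W, V>
    return ret_ok and par_ok

valid = [(v, w) for v, w in product('o+-', repeat=2) if legal(v, w)]
print(valid)  # -> [('o', 'o'), ('+', '-'), ('-', '+')]
```

Besides the trivial all-invariant answer, both (out V, in W) and (in V, out W) turn out to be legal, and neither is more general than the other: exactly the ambiguity being described.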

Now, I note that this still does not answer your question. You asked how it could be done, and all I have given you are the reasons why we decided not to try. There are many ways it could be done.

For example, take every generic type in the program and every type parameter of those generic types; suppose there are a hundred of them. Then there are only 3^100 possible assignments of invariant, in and out across all of them; try them all, see which ones are legal, and then apply a ranking function that picks the winner. The problem, of course, is that this takes longer to run than the age of the universe.

Now, we could apply a clever pruning algorithm: say "any assignment in which T is marked in but T is known to be used in an output position is invalid", and skip checking all of those cases. We would then have hundreds of such predicates to apply in order to determine the set of legal variance assignments. But as the example above shows, it is surprisingly tricky to determine whether something really is in an input or an output position, so this is probably a non-starter as well.

Ah, but that suggests that a predicate-based system is potentially a good technique. We could build an engine that generates such predicates and then feed them to a sophisticated SMT solver. There would be pathological cases requiring gazillions of computations, but modern SMT solvers handle typical cases quite well.

But again, all of this is far too much work for a feature that would provide almost no benefit to the user.

+8

In the example:

 interface I<T, S, R> { S M(T t, R r); T N(); } 

you can use the current C# compiler to check whether it is legal to put out (the covariance marker) in front of T , S and R respectively, and likewise for in (the contravariance marker). Since T is used both as a parameter type (the first parameter of method M ) and as a return type (of method N ), it can have neither out nor in (the current C# compiler will complain if you try either). S is used only as a return type, so it cannot have in . And R is used only as a parameter type, so it cannot have out .

The C# designers decided to let the programmer choose whether to use generic variance or not. Thus, in this example, there are four legal ways to write the interface I<,,> with variance markers:

 // 1
 interface I<T, S, R> { S M(T t, R r); T N(); }
 // 2
 interface I<T, out S, R> { S M(T t, R r); T N(); }
 // 3
 interface I<T, S, in R> { S M(T t, R r); T N(); }
 // 4
 interface I<T, out S, in R> { S M(T t, R r); T N(); }
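Because no type parameter here is nested inside another generic type, the constraints are simple and independent, and the four combinations can be confirmed mechanically. A small Python sketch (illustrative only, with the constraints written out by hand from the member signatures):

```python
from itertools import product

# Variance markers: '+' = out, '-' = in, 'o' = invariant.
# For  interface I<T, S, R> { S M(T t, R r); T N(); }  the rules reduce to:
#   * return types (S in M, T in N) must not be declared 'in'
#   * parameter types (T and R in M) must not be declared 'out'

def legal(t, s, r):
    return s != '-' and t != '-' and t != '+' and r != '+'

valid = [c for c in product('o+-', repeat=3) if legal(*c)]
print(len(valid))  # -> 4
print(valid)
# -> [('o', 'o', 'o'), ('o', 'o', '-'), ('o', '+', 'o'), ('o', '+', '-')]
```

The four survivors are exactly the four declarations listed above: T always invariant, S either unmarked or out, R either unmarked or in.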

Another option the designers had was to make this kind of interface automatically covariant in S and contravariant in R , giving the programmer a way to "opt out" of the inference instead. In that design, each type parameter would automatically receive the "best" possible variance, and the out and in keywords would not be needed in the common case.
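When the constraints really are independent per parameter, as they are in I<T, S, R> , that "best" variance can be computed with a trivial rule: mark a parameter out if it occurs only in output positions, in if it occurs only in input positions, and leave it invariant otherwise. A Python sketch of that heuristic (illustrative only; the usage sets are written out by hand here):

```python
# Usage positions for  interface I<T, S, R> { S M(T t, R r); T N(); }
#   T: used as a parameter and as a return type -> must stay invariant
#   S: used only as a return type               -> 'out' is allowed
#   R: used only as a parameter type            -> 'in' is allowed
used_as_input  = {'T': True,  'S': False, 'R': True}
used_as_output = {'T': True,  'S': True,  'R': False}

def best_marker(name):
    """Pick the most general marker a parameter independently admits."""
    if used_as_output[name] and not used_as_input[name]:
        return 'out'
    if used_as_input[name] and not used_as_output[name]:
        return 'in'
    return ''  # invariant: no marker

print({p: best_marker(p) for p in 'TSR'})
# -> {'T': '', 'S': 'out', 'R': 'in'}
```

The I<V, W> interface from the first answer shows why this per-parameter rule cannot work in general: there the legal choices for V and W depend on each other, so no single "best" assignment exists.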

Similarly for generic delegate types.

0
