What is LINQ actually compiled to?

Background

This question comes out of a recent conversation in the comments with another well-informed user about how LINQ is compiled. At first I generalized and said that LINQ was compiled into a for loop. Although that is not correct, my understanding from other questions like this one is that a LINQ query is compiled into a lambda with a loop inside it. That lambda is then called when the variable is enumerated for the first time (after which the results are stored). The other user said LINQ performs additional optimizations such as hashing. I could not find any documentation supporting or refuting this.

I know this seems like a very obscure point, but I always felt that if I do not understand how something works completely, it will be difficult to understand why I am not using it correctly.

Question

So let's look at the following very simple example:

var productNames = from p in products
                   where p.Id > 100 && p.Id < 5000
                   select p.ProductName;

What does this statement compile to in the CLR? What optimizations does LINQ provide over simply writing a function that processes the results manually? Is it just semantics, or is there more to it than that?

Explanation

It is clear that I am asking this question because I do not understand what the inside of the LINQ black box looks like. Although I understand that LINQ is complicated (and powerful), I am mostly looking for a basic understanding of either the CLR or a functional equivalent of a LINQ statement. There are great sites out there that explain how to write a LINQ statement, but very few seem to give any guidance on how those statements are actually compiled or run.

Side note: I absolutely read Jon Skeet's series on LINQ to Objects.

Side note 2: I should not have tagged this as LINQ to SQL. I understand how ORMs and micro-ORMs work. That is beside the point of this question.

2 answers

For LINQ to Objects, this is compiled into a set of calls to static methods:

 var productNames = from p in products
                    where p.Id > 100 && p.Id < 5000
                    select p.ProductName;

becomes:

 IEnumerable<string> productNames = products
     .Where(p => p.Id > 100 && p.Id < 5000)
     .Select(p => p.ProductName);

This uses the extension methods defined in Enumerable, so it is actually compiled into:

 IEnumerable<string> productNames = Enumerable.Select(
     Enumerable.Where(products, p => p.Id > 100 && p.Id < 5000),
     p => p.ProductName);

The lambda expressions passed to these methods are turned into methods by the compiler. The lambda in the where clause becomes a method that can be assigned to a Func<Product, Boolean>, and the one in the select clause becomes a method that can be assigned to a Func<Product, String>.
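The three forms above can be placed side by side to check that they behave the same. This is only a sketch: the Product class is assumed to have just the two properties used in the question, and IdInRange and GetName are hand-written stand-ins for the methods the compiler generates from the lambdas (the real ones get compiler-chosen names).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shape of Product; the question only shows Id and ProductName.
public class Product
{
    public int Id { get; set; }
    public string ProductName { get; set; }
}

public static class QueryTranslationDemo
{
    // Hand-written stand-ins for the compiler-generated lambda methods.
    private static bool IdInRange(Product p) { return p.Id > 100 && p.Id < 5000; }
    private static string GetName(Product p) { return p.ProductName; }

    // 1. Query syntax, as written in the question.
    public static IEnumerable<string> QuerySyntax(IEnumerable<Product> products)
    {
        return from p in products
               where p.Id > 100 && p.Id < 5000
               select p.ProductName;
    }

    // 2. The same query written with extension-method syntax.
    public static IEnumerable<string> MethodSyntax(IEnumerable<Product> products)
    {
        return products
            .Where(p => p.Id > 100 && p.Id < 5000)
            .Select(p => p.ProductName);
    }

    // 3. Plain static calls with explicit delegates built from named methods.
    public static IEnumerable<string> StaticCalls(IEnumerable<Product> products)
    {
        Func<Product, bool> predicate = IdInRange;
        Func<Product, string> selector = GetName;
        return Enumerable.Select(Enumerable.Where(products, predicate), selector);
    }
}
```

Calling any of the three methods on the same list should yield the same sequence of names, because they all end up as the same two calls on Enumerable.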

For a detailed explanation, see Jon Skeet's blog series: Reimplementing LINQ to Objects. It walks through the whole process, including the compiler transformations (from query syntax to method calls), how the methods are implemented, etc.

Note that LINQ to SQL and IQueryable<T> implementations work differently. An Expression<T> is generated from the lambda and passed to the query provider, which in turn transforms it in some way (how is up to the provider) into calls that, in the case of an ORM, usually run on the server.
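To make the difference concrete, here is a small sketch of what "an expression is generated from the lambda" means. It uses plain int values instead of Product to stay self-contained, and the class and member names (ExpressionTreeDemo, InRange, and so on) are made up for illustration. The in-memory AsQueryable() provider stands in for a real ORM provider here.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

public static class ExpressionTreeDemo
{
    // Assigning a lambda to Expression<Func<...>> makes the compiler emit a tree
    // describing the lambda, rather than a method containing its code.
    public static readonly Expression<Func<int, bool>> InRange =
        id => id > 100 && id < 5000;

    // The tree can be inspected: the body of the lambda above is an && node.
    public static ExpressionType BodyNodeType()
    {
        return InRange.Body.NodeType;
    }

    // The tree can also be compiled into a delegate and executed locally.
    public static bool Evaluate(int id)
    {
        Func<int, bool> compiled = InRange.Compile();
        return compiled(id);
    }

    // Queryable.Where does not run the predicate; it records a call to Where
    // in the query's expression tree for the provider to interpret later.
    public static string RootMethodName(int[] values)
    {
        IQueryable<int> query = values.AsQueryable().Where(InRange);
        var call = (MethodCallExpression)query.Expression;
        return call.Method.Name;
    }
}
```

A provider such as LINQ to SQL walks a tree like this and produces its own representation of the query; how it does that is up to the provider.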


For example, this method:

  private static IEnumerable<string> ProductNames(IEnumerable<Product> products)
  {
      var productNames = from p in products
                         where p.Id > 100 && p.Id < 5000
                         select p.ProductName;
      return productNames;
  }

Gets compiled to the following IL:

  .method private hidebysig static class [mscorlib]System.Collections.Generic.IEnumerable`1<string> ProductNames(class [mscorlib]System.Collections.Generic.IEnumerable`1<class ConsoleApplication3.Product> products) cil managed
  {
      .maxstack 3
      .locals init (
          [0] class [mscorlib]System.Collections.Generic.IEnumerable`1<string> enumerable,
          [1] class [mscorlib]System.Collections.Generic.IEnumerable`1<string> enumerable2)
      L_0000: nop
      L_0001: ldarg.0
      L_0002: ldsfld class [mscorlib]System.Func`2<class ConsoleApplication3.Product, bool> ConsoleApplication3.Program::CS$<>9__CachedAnonymousMethodDelegate3
      L_0007: dup
      L_0008: brtrue.s L_001d
      L_000a: pop
      L_000b: ldnull
      L_000c: ldftn bool ConsoleApplication3.Program::<ProductNames>b__2(class ConsoleApplication3.Product)
      L_0012: newobj instance void [mscorlib]System.Func`2<class ConsoleApplication3.Product, bool>::.ctor(object, native int)
      L_0017: dup
      L_0018: stsfld class [mscorlib]System.Func`2<class ConsoleApplication3.Product, bool> ConsoleApplication3.Program::CS$<>9__CachedAnonymousMethodDelegate3
      L_001d: call class [mscorlib]System.Collections.Generic.IEnumerable`1<!!0> [System.Core]System.Linq.Enumerable::Where<class ConsoleApplication3.Product>(class [mscorlib]System.Collections.Generic.IEnumerable`1<!!0>, class [mscorlib]System.Func`2<!!0, bool>)
      L_0022: ldsfld class [mscorlib]System.Func`2<class ConsoleApplication3.Product, string> ConsoleApplication3.Program::CS$<>9__CachedAnonymousMethodDelegate5
      L_0027: dup
      L_0028: brtrue.s L_003d
      L_002a: pop
      L_002b: ldnull
      L_002c: ldftn string ConsoleApplication3.Program::<ProductNames>b__4(class ConsoleApplication3.Product)
      L_0032: newobj instance void [mscorlib]System.Func`2<class ConsoleApplication3.Product, string>::.ctor(object, native int)
      L_0037: dup
      L_0038: stsfld class [mscorlib]System.Func`2<class ConsoleApplication3.Product, string> ConsoleApplication3.Program::CS$<>9__CachedAnonymousMethodDelegate5
      L_003d: call class [mscorlib]System.Collections.Generic.IEnumerable`1<!!1> [System.Core]System.Linq.Enumerable::Select<class ConsoleApplication3.Product, string>(class [mscorlib]System.Collections.Generic.IEnumerable`1<!!0>, class [mscorlib]System.Func`2<!!0, !!1>)
      L_0042: stloc.0
      L_0043: ldloc.0
      L_0044: stloc.1
      L_0045: br.s L_0047
      L_0047: ldloc.1
      L_0048: ret
  }

Note that these are ordinary call instructions for the method calls. The lambdas are converted into separate methods, such as:

 [CompilerGenerated]
 private static bool <ProductNames>b__2(Product p)
 {
     return ((p.Id > 100) && (p.Id < 0x1388));
 }
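The ldsfld / brtrue.s / ldftn / newobj / stsfld sequence in the IL is a lazily initialized, cached delegate. Written by hand in C#, it might look roughly like the sketch below. The names are hypothetical stand-ins for the compiler-generated ones, int is used instead of Product to keep the example self-contained, and the exact pattern varies between compiler versions (newer compilers emit a different caching scheme).

```csharp
using System;

public static class DelegateCachingDemo
{
    // Hypothetical stand-in for the CS$<>9__CachedAnonymousMethodDelegate3 field.
    private static Func<int, bool> cachedPredicate;

    // Hypothetical stand-in for the generated <ProductNames>b__2 method.
    private static bool IdInRange(int id)
    {
        return id > 100 && id < 5000;
    }

    public static Func<int, bool> GetPredicate()
    {
        // Load the cached field; if it is still null, create the delegate
        // from the method pointer and store it back for later calls.
        if (cachedPredicate == null)
        {
            cachedPredicate = new Func<int, bool>(IdInRange);
        }
        return cachedPredicate;
    }
}
```

The point of the cache is that a lambda which captures no variables only needs one delegate instance, so the delegate is allocated on the first call and reused afterwards.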

The query syntax is just syntactic sugar for the method syntax; it is effectively compiled to this:

 var productNames = Products().Where(p => p.Id > 100 && p.Id < 5000).Select(p => p.ProductName);

What these methods actually do depends on which flavor of LINQ you are using, e.g. LINQ to Objects (which chains in-memory handlers together) or LINQ to SQL (which converts the query into a SQL query), etc.
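For the LINQ to Objects case, the idea of chaining in-memory handlers can be sketched with iterator blocks. This is a simplified illustration, not the actual framework code: WhereSketch and SelectSketch are made-up names, and the real Enumerable methods add argument validation and specialized iterators.

```csharp
using System;
using System.Collections.Generic;

public static class MiniLinq
{
    // Simplified stand-in for Enumerable.Where: the loop body does not run
    // until the returned sequence is enumerated.
    public static IEnumerable<T> WhereSketch<T>(
        IEnumerable<T> source, Func<T, bool> predicate)
    {
        foreach (T item in source)
        {
            if (predicate(item))
            {
                yield return item;
            }
        }
    }

    // Simplified stand-in for Enumerable.Select: pulls items from the
    // previous stage one at a time and projects each of them.
    public static IEnumerable<TResult> SelectSketch<T, TResult>(
        IEnumerable<T> source, Func<T, TResult> selector)
    {
        foreach (T item in source)
        {
            yield return selector(item);
        }
    }
}
```

Building the query with these methods does no work at all. The predicate runs only when the result is enumerated, and it runs again on every enumeration, since nothing is stored unless the caller materializes the sequence with something like ToList().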
