Why write a custom LINQ provider?

Question

What is the advantage of writing a custom LINQ provider over writing a simple class that implements IEnumerable?

For example, this question shows Linq2Excel:

    var book = new ExcelQueryFactory(@"C:\Users.xls");
    var administrators = from x in book.Worksheet<User>()
                         where x.Role == "Administrator"
                         select x;

But what is the benefit over a “naive” implementation as IEnumerable?

+6

c# linq linq-to-sql ienumerable

Yaron naveh Dec 10 '10 at 16:26

4 answers

You do not need to write a LINQ provider if you only want to use LINQ-to-Objects (i.e. foreach-like) functionality, which mainly works against in-memory lists.

You do need to write a LINQ provider if you want to analyze the expression tree of a query in order to translate it into something else, such as SQL. The ExcelQueryFactory above seems to work with an OLEDB connection, for example. This probably means that it does not need to load the entire Excel file into memory when you query your data.
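To make the difference concrete, here is a minimal sketch of what "an expression tree you can parse" means. The same lambda is compiled to a delegate in one case and to a data structure in the other. The `User` class and the `Describe` helper are assumptions made up for this illustration; they are not part of Linq2Excel.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical row type, standing in for the User class in the question.
public class User
{
    public string Role { get; set; } = "";
}

public static class DelegateVersusExpression
{
    // Reads the pieces of a simple "member == constant" predicate.
    // A provider would use this kind of inspection to build SQL, an OLEDB query, etc.
    public static string Describe(Expression<Func<User, bool>> predicate)
    {
        var comparison = (BinaryExpression)predicate.Body;
        var member = (MemberExpression)comparison.Left;
        var constant = (ConstantExpression)comparison.Right;
        return member.Member.Name + " " + comparison.NodeType + " " + constant.Value;
    }

    public static void Main()
    {
        // LINQ-to-Objects receives compiled code: it can only be invoked per item.
        Func<User, bool> compiled = x => x.Role == "Administrator";
        Console.WriteLine(compiled(new User { Role = "Administrator" }));

        // An IQueryable provider receives a description of the code instead.
        Console.WriteLine(Describe(x => x.Role == "Administrator"));
    }
}
```

The delegate can only be called on each element in turn, whereas the expression exposes the member name, the operator, and the constant, which is the information a provider needs to push the filter down to the data source.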

+7

herzmeister Dec 10 '10 at 16:35

In general: performance. If you have some kind of index, you can answer a query much faster than is possible over a simple IEnumerable<T>.

Linq-To-Sql is a good example of this. There, the linq statement is translated into another form that the SQL server understands. The server then does the filtering, ordering, etc. using its indexes, and does not need to send the whole table to the client, which would then have to process it with linq-to-objects.

But there are simpler cases when this can be useful:

If you have a tree index over the Time property, then a range query such as .Where(x=>(x.Time>=now)&&(x.Time<=tomorrow)) can be optimized a lot, and does not need to iterate over each item in the enumerable.
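As a rough sketch of the lookup such a provider could perform once it has recognized the range pattern in the expression tree: the example below uses a sorted list with binary search as a stand-in for a tree index. `Event`, `TimeIndexedEvents`, and `Between` are hypothetical names invented for this illustration, and the step of recognizing the `Where` expression is left out.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Event
{
    public DateTime Time { get; set; }
    public string Name { get; set; } = "";
}

// Hypothetical container: keeps events sorted by Time so range lookups
// can skip straight to the first match instead of scanning everything.
public class TimeIndexedEvents
{
    private readonly List<Event> _sorted;

    public TimeIndexedEvents(IEnumerable<Event> events)
    {
        _sorted = events.OrderBy(e => e.Time).ToList();
    }

    // Index of the first event with Time >= value (binary search).
    private int LowerBound(DateTime value)
    {
        int lo = 0, hi = _sorted.Count;
        while (lo < hi)
        {
            int mid = lo + (hi - lo) / 2;
            if (_sorted[mid].Time < value) lo = mid + 1;
            else hi = mid;
        }
        return lo;
    }

    // Yields events with from <= Time <= to, touching only the matching range.
    public IEnumerable<Event> Between(DateTime from, DateTime to)
    {
        for (int i = LowerBound(from); i < _sorted.Count && _sorted[i].Time <= to; i++)
            yield return _sorted[i];
    }
}

public static class Program
{
    public static void Main()
    {
        var now = new DateTime(2010, 12, 10);
        var tomorrow = now.AddDays(1);
        var index = new TimeIndexedEvents(new[]
        {
            new Event { Time = now.AddDays(2), Name = "later" },
            new Event { Time = now.AddDays(-1), Name = "old" },
            new Event { Time = tomorrow, Name = "tomorrow" },
            new Event { Time = now, Name = "now" },
        });

        foreach (var e in index.Between(now, tomorrow))
            Console.WriteLine(e.Name);
    }
}
```

With LINQ-to-Objects the predicate is an opaque delegate, so every element has to be tested. A provider that can see the two comparisons on `Time` can call something like `Between` instead and visit only the matching items.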

+3

CodesInChaos Dec 10 '10 at 16:35

LINQ provides deferred execution wherever possible to improve performance.

IEnumerable<> and IQueryable<> support quite different implementations. IQueryable builds an expression tree dynamically and lets the provider turn it into a native query, which can give better performance than IEnumerable.

http://msdn.microsoft.com/en-us/vcsharp/ff963710.aspx

If we are not sure which of the two types to declare, we can use the var keyword and let the compiler infer the most suitable type.

+1

Elangesh Dec 10 '10 at 16:51

Keiths · Accepted Answer · 2010-12-10T16:49:50+0000

The purpose of a Linq provider is basically to “translate” Linq expression trees (which are built behind the scenes of a query) into the native query language of the data source. In cases where the data is already in memory, you do not need a Linq provider; Linq 2 Objects is just fine. However, if you are using Linq to talk to an external data store, such as a DBMS or the cloud, it is absolutely essential.

The basic premise of any querying structure is that the data source's engine should do as much of the work as possible and return only the data that the client needs. This is because the data source is assumed to know best how to manage the data it stores, and because transporting data over the network is relatively expensive in time and so should be minimized. Now, in reality, that second part is "return only the data requested by the client"; the server cannot read your program's mind and know what it really needs; it can only give what it is asked for. Here is where an intelligent Linq provider absolutely blows away a "naive" implementation. Using the IQueryable side of Linq, which generates expression trees, a Linq provider can translate the expression tree into, say, a SQL statement that the DBMS will use to return the records the client is asking for in the Linq statement. A naive implementation would have to retrieve ALL the records using some broad SQL statement in order to provide the client with an in-memory list of objects, and then all the work of filtering, grouping, sorting, etc. would be done by the client.

For example, let's say you were using Linq to retrieve a record from a table in a database by its primary key. A Linq provider could translate dataSource.Query<MyObject>().Where(x=>x.Id == 1234).FirstOrDefault() into "SELECT TOP 1 * from MyObjectTable WHERE Id = 1234". That returns zero or one record. A “naive” implementation would probably send the server the query “SELECT * FROM MyObjectTable”, and then use the IEnumerable side of Linq (which works on in-memory classes) to do the filtering. Given that you expect to get 0-1 results out of a table with 10 million records, which of these do you think would do the job faster (or even work at all, without running out of memory)?
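The translation step can be sketched in a few lines. This is a deliberately tiny illustration, not how any real provider is implemented: `NaiveSqlTranslator` and `ToSelect` are hypothetical names, it only understands comparisons between a property and an int or string constant, it ignores `FirstOrDefault` (so no TOP 1), and a real provider would emit parameters rather than inline values.

```csharp
using System;
using System.Globalization;
using System.Linq.Expressions;

public class MyObject
{
    public int Id { get; set; }
}

public static class NaiveSqlTranslator
{
    // Builds a SELECT statement whose WHERE clause mirrors the predicate.
    public static string ToSelect<T>(string table, Expression<Func<T, bool>> predicate)
    {
        return "SELECT * FROM " + table + " WHERE " + Translate(predicate.Body);
    }

    private static string Translate(Expression e)
    {
        switch (e)
        {
            case BinaryExpression b when b.NodeType == ExpressionType.AndAlso
                                      || b.NodeType == ExpressionType.OrElse:
                return "(" + Translate(b.Left) + " " + Operator(b.NodeType) + " " + Translate(b.Right) + ")";
            case BinaryExpression b:
                return Translate(b.Left) + " " + Operator(b.NodeType) + " " + Translate(b.Right);
            case MemberExpression m when m.Expression is ParameterExpression:
                return m.Member.Name; // assumes column name == property name
            case ConstantExpression c when c.Value is int i:
                return i.ToString(CultureInfo.InvariantCulture);
            case ConstantExpression c when c.Value is string s:
                return "'" + s.Replace("'", "''") + "'";
            default:
                throw new NotSupportedException("Cannot translate: " + e);
        }
    }

    private static string Operator(ExpressionType type)
    {
        switch (type)
        {
            case ExpressionType.Equal: return "=";
            case ExpressionType.NotEqual: return "<>";
            case ExpressionType.GreaterThan: return ">";
            case ExpressionType.GreaterThanOrEqual: return ">=";
            case ExpressionType.LessThan: return "<";
            case ExpressionType.LessThanOrEqual: return "<=";
            case ExpressionType.AndAlso: return "AND";
            case ExpressionType.OrElse: return "OR";
            default: throw new NotSupportedException("Operator not supported: " + type);
        }
    }

    public static void Main()
    {
        // prints: SELECT * FROM MyObjectTable WHERE Id = 1234
        Console.WriteLine(ToSelect<MyObject>("MyObjectTable", x => x.Id == 1234));
    }
}
```

Note that a predicate comparing against a local variable (for example `x => x.Id == id`) produces a closure access rather than a `ConstantExpression`, so this sketch would throw `NotSupportedException` for it; handling such cases is part of what makes a full provider a substantial piece of work.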
