Can you get DataReader-style streaming with Linq-to-SQL?

I have been using Linq-to-SQL for quite some time and it works great. Lately, however, I have been experimenting with using it to pull really large amounts of data, and I am running into some problems. (Of course, I understand that L2S may not be the best tool for this kind of processing, but I'm experimenting to find its limits.)

Here is some sample code:

    var buf = new StringBuilder();
    var dc = new DataContext(AppSettings.ConnectionString);
    var records = from a in dc.GetTable<MyReallyBigTable>()
                  where a.State == "OH"
                  select a;
    var i = 0;
    foreach (var record in records)
    {
        buf.AppendLine(record.ID.ToString());
        i += 1;
        if (i > 3)
        {
            break; // Takes forever...
        }
    }

As soon as I start iterating over the data, the query is executed, as expected. When I step through the code, I enter the loop immediately, which is what I was hoping for: it suggests that L2S uses a DataReader behind the scenes instead of pulling all the data first. However, once I hit the break, the query keeps executing and pulls all the remaining records. Here are my questions for the SO community:

1.) Is there a way to stop Linq-to-SQL from finishing a really large query partway through, as you can with a DataReader?

2.) If you are executing a large Linq-to-SQL query, is there a way to prevent the DataContext from filling up with change-tracking information for every object returned? Basically, instead of filling up memory, can I run a large query with short object lifecycles, the way you can with DataReader techniques?

I am fine if this isn't built into the DataContext itself and requires extending it with some customization. I just want to use the simplicity and power of Linq for large queries in nightly processing tasks, rather than relying on T-SQL for everything.

1 answer

1.) Is there a way to stop Linq-to-SQL from finishing a really large query partway through, as you can with a DataReader?

Not quite. Once the query is finally executed, the underlying SQL statement returns a result set of all matching records. Execution is deferred up to that point, but not during traversal of the results.

In your example, you could simply use records.Take(3), but I understand that your actual logic for stopping the process may live outside SQL or may not translate easily.
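For the simple case in the question, the row limit can be made part of the query before it is enumerated. The following is a minimal sketch, not the asker's actual code: the `TakeSketch.FirstIds` helper and the plain `MyReallyBigTable` class shape are assumptions made for illustration. It accepts any `IQueryable<MyReallyBigTable>`, so it can be tried against an in-memory collection as well as against `dc.GetTable<MyReallyBigTable>()`.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity shape; in a real project this would be the mapped
// LINQ to SQL class for MyReallyBigTable.
public class MyReallyBigTable
{
    public int ID { get; set; }
    public string State { get; set; }
}

public static class TakeSketch
{
    // Hypothetical helper: composes the filter and the row limit before the
    // query is enumerated, so the limit is part of the query itself rather
    // than a break inside the consuming loop.
    public static List<int> FirstIds(
        IQueryable<MyReallyBigTable> source, string state, int count)
    {
        return source
            .Where(a => a.State == state)
            .Select(a => a.ID)
            .Take(count)
            .ToList(); // the query is executed here
    }
}
```

Called as `TakeSketch.FirstIds(dc.GetTable<MyReallyBigTable>(), "OH", 3)`, the `Take` call should be translated into a row limit (such as `TOP`) in the generated SQL, so the server returns only the requested rows. Without an `OrderBy`, which rows come back is up to the database.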

You can use a combined approach: build a strongly typed LINQ query and then execute it with plain ADO.NET. The downside is that you lose the mapping to the class and must process the SqlDataReader results by hand. An example of this is shown below:

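    var query = from c in dc.Customers
                where c.ID < 15
                select c;

    // Get the command LINQ to SQL would run, then execute it with ADO.NET.
    using (var command = dc.GetCommand(query))
    {
        command.Connection.Open();
        using (var reader = command.ExecuteReader())
        {
            int i = 0;
            while (reader.Read())
            {
                // Map each row to the entity by hand.
                Customer c = new Customer();
                c.ID = reader.GetInt32(reader.GetOrdinal("ID"));
                c.Name = reader.GetString(reader.GetOrdinal("Name"));
                Console.WriteLine("{0}: {1}", c.ID, c.Name);
                i++;
                if (i > 3)
                {
                    // Cancel first so closing the reader does not have to
                    // drain the remaining rows.
                    command.Cancel();
                    break;
                }
            }
        }
    }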

2.) If you are executing a large Linq-to-SQL query, is there a way to prevent the DataContext from filling up with change-tracking information for every object returned?

If a particular query is intended for read-only use, you can turn off object tracking for better performance by setting the DataContext.ObjectTrackingEnabled property to false:

    using (var dc = new MyDataContext())
    {
        dc.ObjectTrackingEnabled = false;
        // do stuff
    }

You can also read this MSDN topic: How to: Retrieve Information As Read-Only (LINQ to SQL).
