Passing a large list of integers to a LINQ query

I have a LINQ query that fails with the following error: "The incoming tabular data stream (TDS) remote procedure call (RPC) protocol stream is incorrect. Too many parameters were provided in this RPC request. The maximum is 2100."

All I need is to count the clients that have a birth date and whose ID is in my list. My list of client IDs can be huge (millions of entries).

Here is the query:

    List<int> allClients = GetClientIDs();
    int total = context.Clients
        .Where(x => allClients.Contains(x.ClientID) && x.BirthDate != null)
        .Count();

When the query is rewritten like this

    int total = context.Clients
        .Count(x => allClients.Contains(x.ClientID) && x.BirthDate != null);

it produces the same error.

I also tried a different approach, but it eats up all the memory:

    List<int> allClients = GetClientIDs();
    total = (from x in allClients.AsQueryable()
             join y in context.Clients on x equals y.ClientID
             where y.BirthDate != null
             select x).Count();

5 answers

Well, as Geert Arnold mentioned earlier, running the query in chunks solves the problem, but it looks ugly:

    List<int> allClients = GetClientIDs();
    int total = 0;
    const int sqlLimit = 2000; // stay below SQL Server's 2100-parameter limit

    // Step through the list one chunk at a time; stepping by offset avoids
    // a wasted final query when the count is an exact multiple of sqlLimit.
    for (int offset = 0; offset < allClients.Count; offset += sqlLimit)
    {
        List<int> tempList = allClients.Skip(offset).Take(sqlLimit).ToList();
        int thisTotal = context.Clients
            .Count(x => tempList.Contains(x.ClientID) && x.BirthDate != null);
        total += thisTotal;
    }

We ran into this problem at work. The issue is that list.Contains() generates a WHERE column IN (val1, val2, ... valN) clause, so you are limited by how many values you can put there. What we ended up doing was processing the list in batches, much like you did.

However, I think I can offer cleaner and more elegant code for this. The following is an extension method that sits alongside the other LINQ methods you normally use:

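    // Must be declared inside a static class, like any extension method.
    public static IEnumerable<IEnumerable<T>> BulkForEach<T>(this IEnumerable<T> list, int size = 1000)
    {
        // Materialize once so the source is not re-enumerated for every chunk.
        List<T> items = list.ToList();
        for (int index = 0; index < items.Count; index += size)
        {
            // The last chunk may hold fewer than 'size' items.
            yield return items.GetRange(index, Math.Min(size, items.Count - index));
        }
    }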

Then you use it as follows:

    foreach (var item in list.BulkForEach())
    {
        // Do logic here. item is an IEnumerable<T> (in your case, IEnumerable<int>).
    }

EDIT
Or, if you prefer, you can make it behave like a normal List.ForEach() as follows:

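    // Must be declared inside a static class, like any extension method.
    public static void BulkForEach<T>(this IEnumerable<T> list, Action<IEnumerable<T>> action, int size = 1000)
    {
        // Materialize once so the source is not re-enumerated for every chunk.
        List<T> items = list.ToList();
        for (int index = 0; index < items.Count; index += size)
        {
            // The last chunk may hold fewer than 'size' items.
            action(items.GetRange(index, Math.Min(size, items.Count - index)));
        }
    }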

Used as follows:

    list.BulkForEach(p => { /* Do logic */ });

As stated above, your query probably translates into:

    select count(1)
    from Clients
    where ClientID = @id1
       or ClientID = @id2
    -- and so on up to the number of ids returned by GetClientIDs.

You will need to modify your query so that you do not pass that many parameters to it.

To see the generated SQL, you can set the Log property (if you are using LINQ to SQL, it lives on the data context, e.g. context.Log = Console.Out), which writes each generated command to the console when it is executed.

EDIT:

A possible alternative to chunking would be to send the IDs to the server as a delimited string and create a UDF in your database that converts that string into a list.

    var clientIds = string.Join(",", allClients);
    var total = (from client in context.Clients
                 join id in context.udf_SplitString(clientIds)
                     on client.ClientID equals id.Id
                 select client).Count();

There are many examples on Google of UDFs that split strings.

Another alternative, and probably the fastest at query time, is to bulk-insert your numbers from a CSV file into a temporary table in your database and then run a join query against it.

Running the query in chunks means many round trips between your client and the database. If the list of IDs you are interested in is static or rarely changes, I recommend the temporary table approach.
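
As a rough sketch of what the temporary-table approach might look like from C#, assuming SQL Server and the System.Data.SqlClient provider: the method name CountClientsWithBirthDate, the temp table name #ClientIds and the connectionString parameter are made up for illustration, and the IDs are loaded from memory with SqlBulkCopy rather than from a CSV file. The Clients table and its ClientID and BirthDate columns are taken from the question.

```csharp
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;

public static class ClientQueries
{
    // Hypothetical helper: bulk-loads the IDs into a session-scoped temp table,
    // then lets the server do the join and the count in a single query.
    public static int CountClientsWithBirthDate(string connectionString, IEnumerable<int> clientIds)
    {
        // Stage the IDs in a DataTable (this holds the whole list in memory).
        var staged = new DataTable();
        staged.Columns.Add("ClientID", typeof(int));
        foreach (int id in clientIds)
        {
            staged.Rows.Add(id);
        }

        using (var connection = new SqlConnection(connectionString))
        {
            connection.Open();

            // The temp table lives only as long as this connection.
            using (var create = new SqlCommand(
                "CREATE TABLE #ClientIds (ClientID int NOT NULL)", connection))
            {
                create.ExecuteNonQuery();
            }

            using (var bulk = new SqlBulkCopy(connection))
            {
                bulk.DestinationTableName = "#ClientIds";
                bulk.WriteToServer(staged);
            }

            // DISTINCT guards against duplicate IDs in the input list.
            using (var count = new SqlCommand(
                "SELECT COUNT(DISTINCT c.ClientID) FROM Clients c " +
                "JOIN #ClientIds i ON i.ClientID = c.ClientID " +
                "WHERE c.BirthDate IS NOT NULL", connection))
            {
                return (int)count.ExecuteScalar();
            }
        }
    }
}
```

No query parameters are involved here, so the 2100-parameter limit does not apply, and the whole count takes one join instead of one round trip per chunk.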

If you do not mind moving the work from the database to the application server and you have the memory, try this. Note that AsEnumerable() pulls every row of Clients to the client before filtering.

    int total = context.Clients
        .AsEnumerable()
        .Where(x => allClients.Contains(x.ClientID) && x.BirthDate != null)
        .Count();
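
One thing that may matter with millions of IDs: List<int>.Contains is a linear scan, so the in-memory filter does one full pass over the list per client row. A minimal sketch of the same filter using a HashSet<int> for constant-time lookups follows; the Client class here is a hypothetical stand-in for your entity type, and ClientCounter.CountWithBirthDate is an illustrative name, not an existing API.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the entity class behind context.Clients.
public sealed class Client
{
    public int ClientID { get; set; }
    public DateTime? BirthDate { get; set; }
}

public static class ClientCounter
{
    // Counts clients whose ID is in allClients and whose BirthDate is set.
    public static int CountWithBirthDate(IEnumerable<Client> clients, IEnumerable<int> allClients)
    {
        // Build the set once; each Contains call is then O(1) on average.
        var wanted = new HashSet<int>(allClients);
        return clients.Count(c => wanted.Contains(c.ClientID) && c.BirthDate != null);
    }
}
```

It would be called as ClientCounter.CountWithBirthDate(context.Clients.AsEnumerable(), allClients). The whole table is still streamed to the application, so this only helps the lookup cost, not the data transfer.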
