Running the same linq query on multiple IQueryable in parallel?

Situation: I have a List<IQueryable<MyDataStructure>>. I want to run one LINQ query on each of them in parallel and then join the results.

Question: How to create a linq query that I can pass as a parameter?

Code example:

Here is some simplified code. Firstly, I have a collection of IQueryable<string>:

    public List<IQueryable<string>> GetQueries()
    {
        var set1 = (new List<string> { "hello", "hey" }).AsQueryable();
        var set2 = (new List<string> { "cat", "dog", "house" }).AsQueryable();
        var set3 = (new List<string> { "cat", "dog", "house" }).AsQueryable();
        var set4 = (new List<string> { "hello", "hey" }).AsQueryable();
        var sets = new List<IQueryable<string>> { set1, set2, set3, set4 };
        return sets;
    }

I would like to find all the words that start with the letter "h". With one IQueryable<string> this is easy:

 query.Where(x => x.StartsWith("h")).ToList() 

But I want to run the same query with all IQueryable<string> objects in parallel, and then combine the results. Here is one way to do this:

    var result = new ConcurrentBag<string>();
    Parallel.ForEach(queries, query =>
    {
        var partOfResult = query.Where(x => x.StartsWith("h")).ToList();
        foreach (var word in partOfResult)
        {
            result.Add(word);
        }
    });
    Console.WriteLine(result.Count);

But I want a more general solution, so that I can define the LINQ operation separately and pass it to the method as a parameter. Something like this:

    var query = Where(x => x.FirstName.StartsWith("d") && x.IsRemoved == false)
        .Select(x => x.FirstName)
        .OrderBy(x => x.FirstName);
    var queries = GetQueries();
    var result = Run(queries, query);

But I am at a loss on how to do this. Any ideas?

2 answers

So, the first thing you wanted was a way to take a sequence of queries, execute all of them, and then get a flattened list of results. It is quite simple:

    public static IEnumerable<T> Foo<T>(IEnumerable<IQueryable<T>> queries)
    {
        return queries.AsParallel()
            .Select(query => query.ToList())
            .SelectMany(results => results);
    }

Each query is executed by calling ToList on it. The queries run in parallel thanks to AsParallel, and the results are then flattened into a single sequence through SelectMany.
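As a rough illustration of that method, here is a minimal runnable sketch using two of the string sets from the question. The names RunAll and ParallelQueryDemo are hypothetical, chosen only for this example. Note that AsParallel does not promise any particular ordering of the combined results.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ParallelQueryDemo
{
    // Same shape as Foo in the answer; the name RunAll is hypothetical.
    public static IEnumerable<T> RunAll<T>(IEnumerable<IQueryable<T>> queries)
    {
        return queries.AsParallel()
                      .Select(query => query.ToList())   // executes each query
                      .SelectMany(results => results);   // flattens the lists
    }

    public static void Main()
    {
        var sets = new List<IQueryable<string>>
        {
            new List<string> { "hello", "hey" }.AsQueryable(),
            new List<string> { "cat", "dog", "house" }.AsQueryable(),
        };

        // Materialize once; the order of the words may differ between runs.
        var words = RunAll(sets).ToList();
        Console.WriteLine(words.Count); // 5
    }
}
```

If the order of the sources matters, PLINQ's AsOrdered can be added after AsParallel, at some cost in throughput.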

The other thing you wanted was to apply a few query operators to each query in the sequence. This does not need to be parallelized: because of deferred execution, calls to Where, OrderBy, etc. only build up the query and take almost no time. It can be done with a plain Select:
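To make the deferred-execution point concrete, here is a small sketch that counts how often a predicate actually runs. The class DeferredExecutionDemo and the helper StartsWithH are hypothetical names introduced for this illustration; the helper exists because an expression-tree lambda cannot contain an increment directly.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    // Counts how many times the predicate has actually been evaluated.
    public static int Evaluations;

    // Hypothetical helper: increments the counter, then applies the filter.
    public static bool StartsWithH(string word)
    {
        Evaluations++;
        return word.StartsWith("h");
    }

    public static void Main()
    {
        var source = new List<string> { "hello", "cat", "house" }.AsQueryable();

        // Composing the query only builds an expression tree; nothing runs yet.
        var query = source.Where(x => StartsWithH(x)).OrderBy(x => x);
        Console.WriteLine(Evaluations); // 0

        // ToList enumerates the query, so the predicate runs once per element.
        var result = query.ToList();
        Console.WriteLine(Evaluations);  // 3
        Console.WriteLine(result.Count); // 2
    }
}
```

This is why only the ToList step is worth running in parallel: it is the point where the work actually happens.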

    // After Select the elements are plain strings, so the final OrderBy sorts by the value itself.
    var queries = GetQueries().Select(query =>
        query.Where(x => x.FirstName.StartsWith("d") && !x.IsRemoved)
             .Select(x => x.FirstName)
             .OrderBy(name => name));
    var results = Foo(queries);

Personally, I don't see the need to combine these two methods, since they are quite different concepts. If you do want a single method that does both, here it is:

    public static IEnumerable<TResult> Bar<TSource, TResult>(
        IEnumerable<IQueryable<TSource>> queries,
        Func<IQueryable<TSource>, IQueryable<TResult>> selector)
    {
        return queries.Select(selector)
            .AsParallel()
            .Select(query => query.ToList())
            .SelectMany(results => results);
    }

Feel free to turn Foo or Bar into extension methods (by adding this to the first parameter) if you want. Either way, you should rename them to something more descriptive if you intend to use them.
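For the original question (defining the LINQ operation separately and passing it as a parameter), a call might look roughly like the sketch below. It reuses the four string sets from the question; the names QueryRunner and Run are hypothetical, and the method body follows the combined method above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class QueryRunner
{
    // Applies the same query shape to every source, then executes them in parallel.
    public static IEnumerable<TResult> Run<TSource, TResult>(
        IEnumerable<IQueryable<TSource>> queries,
        Func<IQueryable<TSource>, IQueryable<TResult>> selector)
    {
        return queries.Select(selector)
                      .AsParallel()
                      .Select(query => query.ToList())
                      .SelectMany(results => results);
    }

    public static void Main()
    {
        var queries = new List<IQueryable<string>>
        {
            new List<string> { "hello", "hey" }.AsQueryable(),
            new List<string> { "cat", "dog", "house" }.AsQueryable(),
            new List<string> { "cat", "dog", "house" }.AsQueryable(),
            new List<string> { "hello", "hey" }.AsQueryable(),
        };

        // The LINQ operation is defined once, separately from the data.
        Func<IQueryable<string>, IQueryable<string>> query =
            q => q.Where(x => x.StartsWith("h")).OrderBy(x => x);

        // Sort the combined result so the printed order is stable.
        var result = Run(queries, query).OrderBy(w => w, StringComparer.Ordinal).ToList();
        Console.WriteLine(string.Join(", ", result));
        // hello, hello, hey, hey, house, house
    }
}
```

The per-source OrderBy only sorts within each source, so the combined sequence still needs its own ordering if a global order matters.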


First, given the current implementation, there is no reason to use IQueryable<T>: you could just use IEnumerable<T>.

Then you can write a method that accepts IEnumerable<IEnumerable<T>> and Func<IEnumerable<T>, IEnumerable<U>> to build the result:

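    IEnumerable<IEnumerable<U>> QueryMultiple<T, U>(
        IEnumerable<IEnumerable<T>> inputs,
        Func<IEnumerable<T>, IEnumerable<U>> mapping)
    {
        // ToList forces each mapped sequence to be evaluated inside the parallel step;
        // without it, mapping(i) would only build a lazy sequence.
        return inputs.AsParallel().Select(i => mapping(i).ToList());
    }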

Then you can use this as:

    void Run()
    {
        IEnumerable<IEnumerable<YourType>> inputs = GetYourObjects();
        // The query projects to FirstName, so the result element type is string.
        Func<IEnumerable<YourType>, IEnumerable<string>> query = i =>
            i.Where(x => x.FirstName.StartsWith("d") && x.IsRemoved == false)
             .Select(x => x.FirstName)
             .OrderBy(name => name);
        var results = QueryMultiple(inputs, query);
    }
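Since the question also asks for the results to be combined, here is a hedged, self-contained sketch of this approach that flattens the nested result with SelectMany. The Person class, the sample names, and the QueryMultipleDemo class are assumptions made up for the example. This variant of QueryMultiple calls ToList inside the parallel step so that each input is actually evaluated there, and the sketch uses a case-insensitive StartsWith overload because the sample names are capitalized.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical element type with the two members the query relies on.
public class Person
{
    public string FirstName { get; set; }
    public bool IsRemoved { get; set; }
}

public static class QueryMultipleDemo
{
    public static IEnumerable<IEnumerable<U>> QueryMultiple<T, U>(
        IEnumerable<IEnumerable<T>> inputs,
        Func<IEnumerable<T>, IEnumerable<U>> mapping)
    {
        // ToList materializes each mapped sequence on the worker thread.
        return inputs.AsParallel().Select(i => mapping(i).ToList());
    }

    public static List<string> FindNames(IEnumerable<IEnumerable<Person>> inputs)
    {
        Func<IEnumerable<Person>, IEnumerable<string>> query = people =>
            people.Where(p => p.FirstName.StartsWith("d", StringComparison.OrdinalIgnoreCase)
                              && !p.IsRemoved)
                  .Select(p => p.FirstName);

        // Flatten the per-input results, then sort for a stable combined order.
        return QueryMultiple(inputs, query)
            .SelectMany(names => names)
            .OrderBy(name => name, StringComparer.Ordinal)
            .ToList();
    }

    public static void Main()
    {
        var inputs = new List<List<Person>>
        {
            new List<Person>
            {
                new Person { FirstName = "Dana", IsRemoved = false },
                new Person { FirstName = "Dave", IsRemoved = true },
            },
            new List<Person>
            {
                new Person { FirstName = "Alice", IsRemoved = false },
                new Person { FirstName = "Dmitri", IsRemoved = false },
            },
        };

        Console.WriteLine(string.Join(", ", FindNames(inputs))); // Dana, Dmitri
    }
}
```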