Task.StartNew() vs Parallel.ForEach: multiple web requests scenario

I have read all the related questions on SO, but I am still a little confused about the best approach for my scenario, where several web service calls are fired.

I have an aggregator service that accepts input, parses it and translates it into multiple web requests, makes the web requests (unrelated, so they can be run in parallel), and combines the responses into a result that is sent back to the caller. The following code is in use right now:

    list.ForEach((object obj) =>
    {
        tasks.Add(Task.Factory.StartNew((object state) =>
        {
            this.ProcessRequest(obj);
        }, obj, CancellationToken.None, TaskCreationOptions.AttachedToParent, TaskScheduler.Default));
    });

    await Task.WhenAll(tasks);

The await Task.WhenAll(tasks) call comes from Scott Hanselman's post, which states:

"The best solution in terms of scalability, says Stephen, is to use asynchronous I/O. When you call out across the network, there is no reason (other than convenience) to block threads while waiting for the response to come back."

The existing code seems to consume too many threads, and CPU time climbs to 100% under load, which makes me wonder.

The other alternative is to use Parallel.ForEach, which uses a partitioner but also "blocks" the call, which is fine for my scenario.

Given that all this is "async I/O" work and not "CPU-bound" work, and that the web requests are not long-running (they return in 3 seconds at most), I am inclined to believe that the existing code is good enough. But will it provide better throughput than Parallel.ForEach? Parallel.ForEach probably uses the minimum number of tasks because of partitioning, and therefore makes optimal use of threads (?). I tried Parallel.ForEach in some local tests, and it does not seem to be any better.

The goal is to reduce CPU time and increase throughput, and therefore improve scalability. Is there a better approach to processing the web requests in parallel?

I would appreciate any input, thanks.

EDIT: The ProcessRequest method shown in the sample code does indeed use HttpClient and its asynchronous methods to send requests (PostAsync, GetAsync, PutAsync).

+7
multithreading c# parallel-processing task-parallel-library parallel.foreach
3 answers

"makes the web requests (unrelated, so they can be run in parallel)"

What you actually want is to call them concurrently, not in parallel. That is, "at the same time", not "using multiple threads".

"The existing code seems to consume too many threads"

Yes, I think so too. :)

"Given that all this is "async I/O" work and not "CPU-bound" work"

Then it should all be done asynchronously, without task parallelism or any other parallel code.

As Antii noted, you should make your asynchronous code asynchronous:

 public async Task ProcessRequestAsync(...); 

Then what you want to do is consume it using asynchronous concurrency (Task.WhenAll), not parallel concurrency (StartNew/Run/Parallel):

 await Task.WhenAll(list.Select(x => ProcessRequestAsync(x))); 
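To make the one-liner above concrete, here is a minimal, self-contained sketch. The Aggregator class and the body of ProcessRequestAsync are hypothetical stand-ins (the question does not show the real method); Task.Delay simulates an awaited network call so the example runs offline.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class Aggregator
{
    // Hypothetical stand-in for the real request method: Task.Delay simulates
    // waiting for a network response without blocking a thread.
    public static async Task<string> ProcessRequestAsync(int id)
    {
        await Task.Delay(100);
        return $"response-{id}";
    }

    // Starts every request, then awaits them all together.
    // No StartNew, Task.Run or Parallel.ForEach is involved.
    public static async Task<string[]> ProcessAllAsync(IEnumerable<int> ids)
    {
        IEnumerable<Task<string>> tasks = ids.Select(id => ProcessRequestAsync(id));

        // WhenAll returns the results in the same order as the input tasks.
        return await Task.WhenAll(tasks);
    }
}
```

A caller would write something like var responses = await Aggregator.ProcessAllAsync(ids);. Because the simulated requests overlap, the batch should take roughly as long as the slowest request rather than the sum of all of them.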
+5

If you are CPU-bound (and you are: "CPU time climbs to 100%"), you need to reduce CPU usage. Async I/O does nothing to help with that. If anything, it causes a little more CPU usage (not noticeable here).

Profile the application to find out what takes so much CPU time, and optimize that code.

The way you initiate parallelism (Parallel, Task, async I/O) does nothing for the efficiency of the parallel action itself. The network does not get faster because you call it asynchronously; it is still the same hardware. Nor does it use less CPU.

Determine the optimal degree of parallelism experimentally and choose a parallelism technique suited to that degree. If it is a few dozen, then threads are perfectly fine. If it is in the hundreds, seriously consider async I/O.
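If you go the async route, one common way to experiment with the degree of parallelism is to cap the number of requests in flight with a SemaphoreSlim. The sketch below is only an illustration: the Throttled class, RunAsync and the maxConcurrency parameter are hypothetical names, not something from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class Throttled
{
    // Runs `work` for every item while allowing at most `maxConcurrency`
    // operations to be in flight at any moment.
    public static async Task<TResult[]> RunAsync<TItem, TResult>(
        IEnumerable<TItem> items, int maxConcurrency, Func<TItem, Task<TResult>> work)
    {
        using (var gate = new SemaphoreSlim(maxConcurrency))
        {
            List<Task<TResult>> tasks = items.Select(async item =>
            {
                await gate.WaitAsync();      // asynchronous wait; no thread is blocked
                try
                {
                    return await work(item);
                }
                finally
                {
                    gate.Release();          // free the slot for the next item
                }
            }).ToList();                     // materialize so each task starts exactly once

            return await Task.WhenAll(tasks);
        }
    }
}
```

Calling it with different limits, for example await Throttled.RunAsync(list, 20, x => ProcessRequestAsync(x)), and measuring throughput at each setting is one way to find the value that works best for a given workload.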

+3

Wrapping synchronous calls inside Task.Factory.StartNew gives you none of the benefits of async. You should use proper async functions for better scalability. Notice how Scott Hanselman writes async functions in the post you refer to.

For example:

    public async Task<bool> ValidateUrlAsync(string url)
    {
        // Awaiting GetResponseAsync frees the thread while the request is in flight.
        using (var response = (HttpWebResponse)await WebRequest.Create(url).GetResponseAsync())
            return response.StatusCode == HttpStatusCode.OK;
    }

Check out http://blogs.msdn.com/b/pfxteam/archive/2012/03/24/10287244.aspx

So your ProcessRequest method should be implemented as async, for example:

 public async Task<bool> ProcessRequestAsync(...) 

Then you can simply write:

 tasks.Add(this.ProcessRequestAsync(obj)) 

If you start the work with Task.Factory.StartNew, it does not behave asynchronously, even if your ProcessRequest method makes asynchronous calls internally. If you do want to use Task.Factory, you must make the lambda async as well, and call Unwrap() on the result so that the stored task completes only when the awaited work has finished:

    tasks.Add(Task.Factory.StartNew(async (object state) =>
    {
        await this.ProcessRequestAsync(obj);
    }, obj, CancellationToken.None, TaskCreationOptions.AttachedToParent, TaskScheduler.Default).Unwrap());
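One caveat when passing an async lambda to StartNew: the call returns a Task<Task>, and the outer task completes as soon as the lambda reaches its first incomplete await, not when the work finishes. Awaiting the outer tasks with Task.WhenAll can therefore return too early. The sketch below shows the difference; StartNewPitfall, StartWrapped and StartUnwrapped are hypothetical names chosen for illustration.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class StartNewPitfall
{
    // Returns the raw StartNew result. The outer Task<Task> completes once the
    // async delegate has returned its inner task, which happens at the first
    // incomplete await, long before the asynchronous work is done.
    public static Task<Task> StartWrapped(Func<Task> work) =>
        Task.Factory.StartNew(work, CancellationToken.None,
            TaskCreationOptions.DenyChildAttach, TaskScheduler.Default);

    // Unwrap() produces a proxy task that completes only when the inner task does,
    // which is what Task.WhenAll needs in order to wait for the real work.
    public static Task StartUnwrapped(Func<Task> work) =>
        StartWrapped(work).Unwrap();
}
```

Task.Run performs this unwrapping automatically for async delegates, but for I/O-bound work the simplest option is usually to call the async method directly, as in tasks.Add(this.ProcessRequestAsync(obj)).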
0