Clogging asynchronous tasks

I recently started trying to mass-scrape a website for archiving purposes, and I thought it would be a good idea to have several web requests running asynchronously to speed things up (10,000,000 pages is definitely a lot to archive). So I ventured into the harsh mistress that is parallelism. Three minutes in, I began to wonder why the tasks I create (via Task.Factory.StartNew) are "clogging".

Annoyed and intrigued, I decided to check this to make sure that it was not just the result of circumstances, so I created a new console project in VS2012 and created this:

    static void Main(string[] args)
    {
        for (int i = 0; i < 10; i++)
        {
            int i2 = i + 1;
            Stopwatch t = new Stopwatch();
            t.Start();
            Task.Factory.StartNew(() =>
            {
                t.Stop();
                // Note that the other tasks might manage to write their lines
                // between these colour changes, messing up the colours.
                Console.ForegroundColor = ConsoleColor.Green;
                Console.WriteLine("Task " + i2 + " started after " + t.Elapsed.Seconds + "." + t.Elapsed.Milliseconds + "s");
                Thread.Sleep(5000);
                Console.ForegroundColor = ConsoleColor.Yellow;
                Console.WriteLine("Task " + i2 + " finished");
            });
        }
        Console.ReadKey();
    }

When run, it produced this result:

Test results

As you can see, the first four tasks start in quick succession, with start times of ~0.27. After that, however, each task takes dramatically longer to start than the one before it.

Why is this happening and what can I do to fix or circumvent this limitation?

+7
c# asynchronous parallel-processing task
2 answers

Tasks (by default) run on the thread pool, which is just what it sounds like: a pool of threads. The thread pool is optimized for a lot of situations, but throwing a Thread.Sleep in there probably throws a wrench into most of them. Also, Task.Factory.StartNew is generally a bad idea to use, because people don't understand how it works. Try this instead:

    static void Main(string[] args)
    {
        for (int i = 0; i < 10; i++)
        {
            int i2 = i + 1;
            Stopwatch t = new Stopwatch();
            t.Start();
            Task.Run(async () =>
            {
                t.Stop();
                // Note that the other tasks might manage to write their lines
                // between these colour changes, messing up the colours.
                Console.ForegroundColor = ConsoleColor.Green;
                Console.WriteLine("Task " + i2 + " started after " + t.Elapsed.Seconds + "." + t.Elapsed.Milliseconds + "s");
                await Task.Delay(5000);
                Console.ForegroundColor = ConsoleColor.Yellow;
                Console.WriteLine("Task " + i2 + " finished");
            });
        }
        Console.ReadKey();
    }

Additional explanations:

The thread pool has a limited number of threads at its disposal. That number changes depending on certain conditions, but in general it holds true. For this reason, you should never do anything blocking on the thread pool (if you want to achieve parallelism, that is). Thread.Sleep is a perfect example of a blocking API, but so are most web request APIs, unless you use their newer asynchronous versions.
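To illustrate the point about non-blocking web requests, here is a minimal sketch of how many downloads could be kept in flight without tying up pool threads. The names AsyncCrawlSketch, FetchAllAsync and maxConcurrent, and the placeholder URL, are assumptions made up for this example; the fetch function is passed in so that HttpClient.GetStringAsync (or anything else that returns a Task<string>) can be plugged in.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

static class AsyncCrawlSketch
{
    // Downloads every URL with the supplied async fetch function, allowing at
    // most maxConcurrent requests in flight. While a request is pending, no
    // thread is blocked: "await" hands the thread back to the pool.
    public static async Task<string[]> FetchAllAsync(
        IEnumerable<string> urls, Func<string, Task<string>> fetch, int maxConcurrent)
    {
        var gate = new SemaphoreSlim(maxConcurrent);
        var downloads = urls.Select(async url =>
        {
            await gate.WaitAsync();          // asynchronous wait, not a blocking one
            try { return await fetch(url); }
            finally { gate.Release(); }
        }).ToList();

        // Results come back in the same order as the input URLs.
        return await Task.WhenAll(downloads);
    }

    static void Main()
    {
        var client = new HttpClient();
        var urls = new[] { "http://example.com/" }; // placeholder URL (assumption)
        string[] pages = FetchAllAsync(urls, client.GetStringAsync, 4).Result;
        Console.WriteLine("Downloaded " + pages.Length + " page(s)");
    }
}
```

This sketch creates one task per URL up front, which is fine for a few thousand URLs; for millions of pages you would probably want to feed the URLs in batches instead.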

So the problem in your original crawler is probably the same as in the sample you posted. You are blocking all the thread pool threads, so the pool is forced to spin up new threads, which it does only slowly, and you end up with the clogging.
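If you want to see the effect for yourself, a rough sketch like the following may help. It prints the pool's minimum worker thread count (by default the number of processors) and then measures how long each blocking task sat in the queue before it started. The names PoolStarvationDemo and MeasureStartDelays are made up for this example, and the exact delays will vary by machine and runtime version.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

static class PoolStarvationDemo
{
    // Starts `count` tasks that each block for `blockMs` milliseconds and
    // returns how long each one waited in the queue before it began running.
    public static long[] MeasureStartDelays(int count, int blockMs)
    {
        var delays = new long[count];
        var tasks = new Task[count];
        for (int i = 0; i < count; i++)
        {
            int index = i; // copy the loop variable for the closure
            Stopwatch queued = Stopwatch.StartNew();
            tasks[i] = Task.Run(() =>
            {
                delays[index] = queued.ElapsedMilliseconds;
                Thread.Sleep(blockMs); // blocks a pool thread for the whole duration
            });
        }
        Task.WaitAll(tasks);
        return delays;
    }

    static void Main()
    {
        int minWorkers, minIo;
        ThreadPool.GetMinThreads(out minWorkers, out minIo);
        Console.WriteLine("Minimum worker threads: " + minWorkers);

        // Queue more blocking tasks than the pool has ready-made threads.
        long[] delays = MeasureStartDelays(minWorkers * 3, 2000);
        for (int i = 0; i < delays.Length; i++)
            Console.WriteLine("Task " + (i + 1) + " waited " + delays[i] + " ms");
    }
}
```

On a typical machine, roughly the first "minimum worker threads" tasks should start almost immediately and the rest should trail in as the pool adds threads, which would match the four quick starts in the question if it was run on a four-core machine.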

Extra goodies

By the way, using Task.Run in this way also makes it easy to rewrite the code so that you know when it is done. By keeping a reference to every started task and waiting for all of them at the end (this does not hinder parallelism), you can reliably find out when all tasks have completed. The following shows how to do this:

    static void Main(string[] args)
    {
        var tasks = new List<Task>();
        for (int i = 0; i < 10; i++)
        {
            int i2 = i + 1;
            Stopwatch t = new Stopwatch();
            t.Start();
            tasks.Add(Task.Run(async () =>
            {
                t.Stop();
                // Note that the other tasks might manage to write their lines
                // between these colour changes, messing up the colours.
                Console.ForegroundColor = ConsoleColor.Green;
                Console.WriteLine("Task " + i2 + " started after " + t.Elapsed.Seconds + "." + t.Elapsed.Milliseconds + "s");
                await Task.Delay(5000);
                Console.ForegroundColor = ConsoleColor.Yellow;
                Console.WriteLine("Task " + i2 + " finished");
            }));
        }
        Task.WaitAll(tasks.ToArray());
        Console.WriteLine("All tasks completed");
        Console.ReadKey();
    }

Note: this code has not been tested.

More details

Learn more about Task.Factory.StartNew and why it should be avoided: http://blog.stephencleary.com/2013/08/startnew-is-dangerous.html
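One of the pitfalls can be shown in a few lines. When Task.Factory.StartNew is given an async lambda, it returns a Task<Task>, and the outer task completes as soon as the lambda reaches its first await. The sketch below (the class and method names are made up for this example) uses a TaskCompletionSource to hold the lambda at its await so the effect is easy to observe.

```csharp
using System;
using System.Threading.Tasks;

static class StartNewPitfallDemo
{
    // Returns true if the task handed back by StartNew is already complete
    // while the async lambda is still suspended at its await.
    public static bool OuterCompletesEarly()
    {
        var gate = new TaskCompletionSource<bool>();

        Task<Task> outer = Task.Factory.StartNew(async () =>
        {
            await gate.Task; // the lambda is suspended here until the gate opens
        });

        outer.Wait(); // returns once the lambda has reached its first await
        bool early = !outer.Result.IsCompleted; // the inner task is still pending

        gate.SetResult(true);   // let the lambda finish
        outer.Unwrap().Wait();  // Unwrap() gives a task that covers the whole lambda
        return early;
    }

    static void Main()
    {
        Console.WriteLine("Outer task completed early: " + OuterCompletesEarly());
    }
}
```

Task.Run unwraps the inner task for you, which is one reason it is the safer default for async lambdas.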

+9

I think this is because you have exhausted all available threads in the thread pool. Try running your tasks using TaskCreationOptions.LongRunning. More details here.
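As a rough sketch of what that suggestion could look like: the LongRunning option is a hint to the scheduler that the work will block for a while, and the default scheduler usually responds by giving the task its own thread instead of a pool thread. The names LongRunningDemo and CountPoolThreadsUsed are made up for this example.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class LongRunningDemo
{
    // Starts `count` blocking jobs, each hinted as long-running, and reports
    // how many of them ended up running on a thread-pool thread.
    public static int CountPoolThreadsUsed(int count, int blockMs)
    {
        int onPool = 0;
        var tasks = new Task[count];
        for (int i = 0; i < count; i++)
        {
            tasks[i] = Task.Factory.StartNew(() =>
            {
                if (Thread.CurrentThread.IsThreadPoolThread)
                    Interlocked.Increment(ref onPool);
                Thread.Sleep(blockMs); // blocking work, as in the question
            }, TaskCreationOptions.LongRunning);
        }
        Task.WaitAll(tasks);
        return onPool;
    }

    static void Main()
    {
        int onPool = CountPoolThreadsUsed(10, 1000);
        Console.WriteLine("Jobs that ran on pool threads: " + onPool);
    }
}
```

Keep in mind that this trades pool starvation for one dedicated thread per job, so it does not scale to thousands of simultaneous requests the way asynchronous waiting does.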

Another problem is that you are using Thread.Sleep. It blocks the current thread, which is a waste of resources. Try waiting asynchronously with await Task.Delay instead. You will need to make your lambda async for that.

    // Note: with an async lambda, StartNew returns a Task<Task>;
    // call Unwrap() on it if you need to wait for the whole lambda.
    Task.Factory.StartNew(async () =>
    {
        t.Stop();
        // Note that the other tasks might manage to write their lines
        // between these colour changes, messing up the colours.
        Console.ForegroundColor = ConsoleColor.Green;
        Console.WriteLine("Task " + i2 + " started after " + t.Elapsed.Seconds + "." + t.Elapsed.Milliseconds + "s");
        await Task.Delay(5000);
        Console.ForegroundColor = ConsoleColor.Yellow;
        Console.WriteLine("Task " + i2 + " finished");
    });
+1
