The default options for Parallel.ForEach work well when the task is CPU-bound and scales linearly. If you have a quad-core processor and no other processes running, Parallel.ForEach uses all four cores. If some other process on your computer is using one full core, Parallel.ForEach uses about three cores.
But if the task is not CPU-bound, Parallel.ForEach keeps starting tasks in an attempt to keep all the cores busy. No matter how many tasks run in parallel, there is always unused CPU capacity, so it keeps creating more tasks.
How can you tell whether your task is CPU-bound? Hopefully just by inspecting it. If you are factoring prime numbers, it is obvious. But other cases are not so obvious. An empirical way to find out is to limit the maximum degree of parallelism with ParallelOptions.MaxDegreeOfParallelism and observe how your program behaves. If your task is CPU-bound, you should see a pattern like this on a quad-core system:
ParallelOptions.MaxDegreeOfParallelism = 1: uses one full core, or 25% CPU utilization.
ParallelOptions.MaxDegreeOfParallelism = 2: uses two cores, or 50% CPU utilization.
ParallelOptions.MaxDegreeOfParallelism = 4: uses all cores, or 100% CPU utilization.
If it behaves like this, you can use the default Parallel.ForEach options and get good results. CPU utilization that scales linearly with the degree of parallelism means the tasks are being scheduled well.
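The empirical check described above can be sketched roughly as follows. This is only an illustration under stated assumptions: CountPrimes is a hypothetical CPU-bound work item, MeasureBusyCores is a made-up helper name, and the item count and limit are arbitrary. It divides the CPU time consumed by the process by the elapsed wall-clock time, which approximates how many cores were kept busy.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

static class CpuScalingProbe
{
    // Hypothetical CPU-bound work item: counts primes below `limit` by trial division.
    public static int CountPrimes(int limit)
    {
        int count = 0;
        for (int n = 2; n < limit; n++)
        {
            bool isPrime = true;
            for (int d = 2; (long)d * d <= n; d++)
            {
                if (n % d == 0) { isPrime = false; break; }
            }
            if (isPrime) count++;
        }
        return count;
    }

    // Approximate number of cores kept busy: process CPU time / wall-clock time.
    public static double MeasureBusyCores(int maxDegree)
    {
        Process process = Process.GetCurrentProcess();
        process.Refresh();
        TimeSpan cpuBefore = process.TotalProcessorTime;
        Stopwatch wall = Stopwatch.StartNew();

        Parallel.ForEach(
            Enumerable.Range(0, 64),
            new ParallelOptions { MaxDegreeOfParallelism = maxDegree },
            item => { CountPrimes(200000); });

        wall.Stop();
        process.Refresh();
        TimeSpan cpuUsed = process.TotalProcessorTime - cpuBefore;
        return cpuUsed.TotalMilliseconds / wall.Elapsed.TotalMilliseconds;
    }

    static void Main()
    {
        // Assumes a quad-core machine, as in the text; adjust the list for other core counts.
        foreach (int degree in new[] { 1, 2, 4 })
        {
            Console.WriteLine(
                $"MaxDegreeOfParallelism = {degree}: ~{MeasureBusyCores(degree):F1} cores busy");
        }
    }
}
```

If the reported figure stays close to the configured degree, the work is probably CPU-bound. If it plateaus well below that, something other than the CPU is likely the bottleneck.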
But if I run the sample application on my Intel i7, I get about 20% CPU utilization no matter what maximum degree of parallelism I set. Why is this? So much memory is being allocated that the garbage collector is blocking threads. The application is resource-bound, and the resource is memory.
Similarly, an I/O-bound task that runs long queries against a database server will never be able to make effective use of all the CPU resources available on the local computer. In cases like these, the task scheduler cannot “know when to stop” starting new tasks.
If your task is not CPU-bound, or CPU utilization does not scale linearly with the maximum degree of parallelism, you should tell Parallel.ForEach not to start too many tasks at once. The easiest way is to specify a number that allows enough parallelism to overlap the I/O-bound tasks, but not so much that you overwhelm the local computer's resources or overload any remote servers. Finding the best value takes some trial and error:
static void Main(string[] args)
{
    // Cap the loop at four concurrent tasks instead of using the default (unlimited).
    Parallel.ForEach(
        CreateData(),
        new ParallelOptions { MaxDegreeOfParallelism = 4 },
        (data) =>
        {
            data[0] = 1;
        });
}
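One way to go about the trial and error is to time the same batch at several limits and compare. The sketch below is an assumption-laden illustration, not the sample application: SimulatedQuery is a hypothetical stand-in that just sleeps for 50 ms to imitate a blocking database call, and the candidate limits and batch size are arbitrary.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

static class DegreeSweep
{
    // Hypothetical stand-in for a blocking I/O call such as a slow database query.
    public static void SimulatedQuery(int item)
    {
        Thread.Sleep(50);
    }

    // Runs the whole batch with the given limit and returns the elapsed wall-clock time.
    public static TimeSpan TimeBatch(int itemCount, int maxDegree)
    {
        Stopwatch wall = Stopwatch.StartNew();
        Parallel.ForEach(
            Enumerable.Range(0, itemCount),
            new ParallelOptions { MaxDegreeOfParallelism = maxDegree },
            item => SimulatedQuery(item));
        wall.Stop();
        return wall.Elapsed;
    }

    static void Main()
    {
        // Candidate limits are arbitrary; pick a range that makes sense for your workload.
        foreach (int degree in new[] { 1, 2, 4, 8, 16 })
        {
            TimeSpan elapsed = TimeBatch(64, degree);
            Console.WriteLine(
                $"MaxDegreeOfParallelism = {degree}: {elapsed.TotalMilliseconds:F0} ms");
        }
    }
}
```

With a real workload, replace SimulatedQuery with the actual operation and watch the remote server as well as the local timings. The limit at which elapsed time stops improving is usually a reasonable starting point.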