Gamecat's answer is good as far as the abstract task class is concerned, but I think calling DoExecute() for a task in the calling thread (as the article itself does too) is a bad idea. I would always queue tasks to be executed by background threads, unless threading has been disabled completely, and here's why.
Consider the following (contrived) case, where you need to execute three independent CPU-bound procedures:
Procedure1_WhichTakes200ms; Procedure2_WhichTakes400ms; Procedure3_WhichTakes200ms;
To make the best use of your dual-core system you want to execute them in two threads. You would limit the number of background threads to one, so that together with the main thread you have as many threads as there are cores.
Now the first procedure will be executed in the worker thread, and it will finish after 200 milliseconds. The second procedure will start immediately and will be executed in the main thread, since the single configured worker thread is already occupied, and it will finish after 400 milliseconds. Only then will the last procedure be handed to the worker thread, which by that point has already been idle for 200 milliseconds, and it will finish after another 200 milliseconds. The total execution time is 600 milliseconds, and for 2/3 of that time only one of the two threads was actually doing meaningful work.
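The timeline above can be checked with a small deterministic simulation. This is only a sketch in Python rather than the language of the article; the function name `caller_runs_makespan` and its parameters are made up for illustration, and it assumes that handing a task to an idle worker costs no time.

```python
def caller_runs_makespan(durations, workers=1):
    """Simulate a pool where the submitting (main) thread runs a task
    itself whenever no worker thread is idle at submission time.

    durations: task run times in milliseconds, in submission order.
    Returns the time at which the last task finishes.
    """
    worker_free_at = [0] * workers  # when each worker becomes idle again
    now = 0                         # clock of the main (submitting) thread
    finish = 0
    for d in durations:
        # Pick the worker that becomes idle first.
        idx = min(range(workers), key=lambda i: worker_free_at[i])
        if worker_free_at[idx] <= now:
            # An idle worker exists: hand the task over, main thread moves on.
            worker_free_at[idx] = now + d
            finish = max(finish, now + d)
        else:
            # All workers busy: the main thread executes the task inline
            # and cannot submit anything else until it is done.
            now += d
            finish = max(finish, now)
    return finish


print(caller_runs_makespan([200, 400, 200], workers=1))  # prints 600
```

The total work is 800 ms, so two cores could finish it in 400 ms; the extra 200 ms is the time the worker sits idle while the main thread is stuck inside the second procedure.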
You could reorder the procedures (tasks), but in real life it is probably impossible to know in advance how long each task will take.
Now consider the usual way to employ a thread pool. For the setup above you would limit the number of threads in the pool to 2 (the number of cores), use the main thread only for scheduling the tasks into the pool, and then wait for all tasks to complete. With the above sequence of queued tasks, the first worker thread will take the first task and the second worker thread will take the second task. After 200 milliseconds the first task will be finished, and the first worker thread will take the third task from the queue, which is then empty. After 400 milliseconds both the second and the third task will be finished, and the main thread will be unblocked. The total execution time is 400 milliseconds, with 100% load on both cores for the whole time.
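The same kind of sketch works for the queued variant. Again this is Python used purely for illustration, `pool_makespan` is a hypothetical name, and the model assumes that an idle worker picks up the next queued task without any delay.

```python
import heapq


def pool_makespan(durations, workers=2):
    """Simulate a fixed-size pool fed from a FIFO queue: whichever worker
    becomes idle first takes the next queued task. The main thread only
    enqueues the tasks and then waits.

    durations: task run times in milliseconds, in queue order.
    Returns the time at which the last task finishes.
    """
    free_at = [0] * workers  # min-heap of times at which workers go idle
    heapq.heapify(free_at)
    finish = 0
    for d in durations:
        start = heapq.heappop(free_at)  # earliest idle worker takes the task
        end = start + d
        heapq.heappush(free_at, end)
        finish = max(finish, end)
    return finish


print(pool_makespan([200, 400, 200], workers=2))  # prints 400
```

Because a worker pulls the next task the moment it goes idle, no core waits for the submitting thread, which is exactly what the run-in-caller approach cannot guarantee.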
At least for CPU-bound threads it is vitally important to always have work queued up for the OS scheduler. Calling DoExecute() in the main thread interferes with that, and should not be done.