How are Reactive Framework, PLINQ, TPL, and Parallel Extensions related to each other?

At least since the release of .NET 4.0, Microsoft seems to have put a lot of effort into supporting parallel and asynchronous programming, and a lot of APIs and libraries have emerged around it. The following fancy names in particular are mentioned everywhere lately:

  • Reactive Framework
  • PLINQ (Parallel LINQ),
  • TPL (Task Parallel Library) and
  • Parallel Extensions.

They all seem to be Microsoft products, and they all seem to target asynchronous or parallel programming scenarios for .NET. But it is not clear what each of them actually is and how they relate to each other. Some of them may even be the same thing.

In a few words, can anyone set the record straight on what is what?

+62
plinq system.reactive task-parallel-library parallel-extensions
Jan 26 '10 at 9:08
2 answers

PLINQ (Parallel LINQ) is simply a new way to write regular LINQ queries so that they run in parallel. In other words, the framework automatically takes care of running your query across multiple threads so that it finishes faster (i.e. by using multiple processor cores).

For example, let's say that you have a bunch of strings and you want to get all the ones that start with the letter "A". You could write your query like this:

    var words = new[] { "Apple", "Banana", "Coconut", "Anvil" };
    // Where filters the words; Select would only project each word to true/false.
    var myWords = words.Where(s => s.StartsWith("A"));

And that works fine. If you had 50,000 words to search through, though, you could take advantage of the fact that each test is independent and split the work across multiple cores:

    // AsParallel() opts the query into PLINQ; the filter itself is unchanged.
    var myWords = words.AsParallel().Where(s => s.StartsWith("A"));

This is all you need to do to turn a regular query into a parallel one that runs on multiple cores. Pretty neat.
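As a rough, self-contained sketch of the query above: the class name `PlinqSketch` and the helper `StartingWith` are made up for illustration, and the `AsOrdered()` call is an addition that is not in the answer. It is there because PLINQ does not promise to keep the source order unless asked to.

```csharp
using System;
using System.Linq;

public static class PlinqSketch
{
    // Returns the words that start with the given prefix, filtered in parallel.
    public static string[] StartingWith(string[] words, string prefix)
    {
        return words
            .AsParallel()
            .AsOrdered() // assumption: we want results in source order
            .Where(w => w.StartsWith(prefix, StringComparison.Ordinal))
            .ToArray();
    }

    public static void Main()
    {
        var words = new[] { "Apple", "Banana", "Coconut", "Anvil" };
        // Prints: Apple, Anvil
        Console.WriteLine(string.Join(", ", StartingWith(words, "A")));
    }
}
```

For four words the parallel version is almost certainly slower than the sequential one because of the partitioning overhead; the payoff only shows up with large inputs or expensive predicates.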




TPL (Task Parallel Library) is a sort of complement to PLINQ, and together they make up Parallel Extensions. While PLINQ is largely based on a functional programming style with no side effects, side effects are precisely what TPL is for. If you want to actually do work in parallel, as opposed to just searching or selecting things in parallel, you use TPL.

From a user's point of view, TPL is essentially the Parallel class, which exposes overloads of For, ForEach, and Invoke (the library also contains the Task type that these are built on). Invoke is a bit like queuing up tasks in the ThreadPool, but a bit simpler to use. IMO, the more interesting bits are For and ForEach. For example, let's say you have a whole bunch of files that you want to compress. You could write the regular sequential version:

    string[] fileNames = (...);
    foreach (string fileName in fileNames)
    {
        byte[] data = File.ReadAllBytes(fileName);
        byte[] compressedData = Compress(data);
        string outputFileName = Path.ChangeExtension(fileName, ".zip");
        File.WriteAllBytes(outputFileName, compressedData);
    }

Again, each iteration of this compression is completely independent of any other. We can speed it up by running several of them at once:

    Parallel.ForEach(fileNames, fileName =>
    {
        byte[] data = File.ReadAllBytes(fileName);
        byte[] compressedData = Compress(data);
        string outputFileName = Path.ChangeExtension(fileName, ".zip");
        File.WriteAllBytes(outputFileName, compressedData);
    });

And again, that is all it takes to parallelize this operation. Now, when we run our CompressFiles method (or whatever we decide to call it), it will use multiple CPU cores and will probably finish in half or a quarter of the time.

The advantage of this over just throwing everything into the ThreadPool is that it actually runs synchronously: the call does not return until every iteration has finished. If you used the ThreadPool instead (or just plain Thread instances), you would have to come up with a way of finding out when all of the tasks are finished, and although that is not terribly complicated, it is something that a lot of people tend to get wrong or at least have trouble with. When you use the Parallel class, you do not really have to think about it; the multi-threading aspect is hidden from you and handled behind the scenes.
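To make the difference concrete, here is a minimal sketch that does the same trivial piece of work both ways. The class and method names (`BlockingSketch`, `SumWithThreadPool`, `SumWithParallel`) are invented for this illustration, and summing integers merely stands in for real work such as compressing files. In the ThreadPool version the caller has to track completion itself (a `CountdownEvent` is one way to do it); `Parallel.ForEach` simply does not return until all iterations are done.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class BlockingSketch
{
    // Manual approach: queue work items, then wait until all have signalled.
    public static int SumWithThreadPool(int[] items)
    {
        int total = 0;
        using (var done = new CountdownEvent(items.Length))
        {
            foreach (int item in items)
            {
                int captured = item; // private copy for the closure
                ThreadPool.QueueUserWorkItem(_ =>
                {
                    Interlocked.Add(ref total, captured);
                    done.Signal(); // forgetting this would hang the caller
                });
            }
            done.Wait(); // the completion tracking we had to write ourselves
        }
        return total;
    }

    // Parallel approach: the call blocks until every iteration has run.
    public static int SumWithParallel(int[] items)
    {
        int total = 0;
        Parallel.ForEach(items, item => Interlocked.Add(ref total, item));
        return total;
    }

    public static void Main()
    {
        int[] items = Enumerable.Range(1, 100).ToArray();
        Console.WriteLine(SumWithThreadPool(items)); // 5050
        Console.WriteLine(SumWithParallel(items));   // 5050
    }
}
```

Both versions still need `Interlocked.Add` because the iterations share one variable; the Parallel class hides the thread management, not the need to protect shared state.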




Reactive Extensions (Rx) are a completely different beast. They are a different way of thinking about event handling. There is really a lot of material to cover here, but to make a long story short, instead of wiring up event handlers to events, Rx lets you treat sequences of events as ... well, sequences (IObservable<T>, the push-based counterpart of IEnumerable<T>). You can process events in an iterative fashion, instead of having them fire asynchronously at arbitrary times and having to keep saving state in order to detect a series of events happening in a particular order.

One of the coolest examples of Rx I have found is here . Skip down to the "Linq to IObservable" section, where it implements a drag-and-drop handler, which is normally a pain in WPF, in just 4 lines of code. Rx gives you composition of events, something you do not really have with regular event handlers, and code snippets like these are also easy to refactor into behavior classes that you can plug in anywhere.
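The idea of composing events as sequences can be sketched without the Rx library itself, using only the `IObservable<T>` and `IObserver<T>` interfaces that ship in the base class library since .NET 4.0. Everything else below (`EventStream`, `ActionObserver`, `ActionDisposable`, `Filter`, `Demo`) is hypothetical scaffolding written for this illustration; real Rx supplies far richer and more careful versions of these pieces, and this sketch is neither thread-safe nor a substitute for them.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: wraps a delegate as an observer.
public sealed class ActionObserver<T> : IObserver<T>
{
    private readonly Action<T> onNext;
    public ActionObserver(Action<T> onNext) { this.onNext = onNext; }
    public void OnNext(T value) { onNext(value); }
    public void OnError(Exception error) { }
    public void OnCompleted() { }
}

// Hypothetical helper: runs a delegate when the subscription is disposed.
public sealed class ActionDisposable : IDisposable
{
    private readonly Action dispose;
    public ActionDisposable(Action dispose) { this.dispose = dispose; }
    public void Dispose() { dispose(); }
}

// Hypothetical push-based source: values are pushed to whoever subscribed.
public sealed class EventStream<T> : IObservable<T>
{
    private readonly List<IObserver<T>> observers = new List<IObserver<T>>();

    public IDisposable Subscribe(IObserver<T> observer)
    {
        observers.Add(observer);
        return new ActionDisposable(() => observers.Remove(observer));
    }

    public void Publish(T value)
    {
        // Copy first so observers may unsubscribe while being notified.
        foreach (var o in observers.ToArray()) o.OnNext(value);
    }
}

public static class RxSketch
{
    // Hand-written stand-in for a filtering operator over an event sequence.
    public static EventStream<T> Filter<T>(this IObservable<T> source, Func<T, bool> predicate)
    {
        var result = new EventStream<T>();
        source.Subscribe(new ActionObserver<T>(v => { if (predicate(v)) result.Publish(v); }));
        return result;
    }

    public static List<int> Demo()
    {
        var clicks = new EventStream<int>(); // pretend each int is a mouse X position
        var seen = new List<int>();
        using (clicks.Filter(x => x > 100).Subscribe(new ActionObserver<int>(seen.Add)))
        {
            clicks.Publish(50);
            clicks.Publish(150);
            clicks.Publish(200);
        }
        clicks.Publish(300); // subscription disposed, so this is not recorded
        return seen;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Demo())); // 150, 200
    }
}
```

The point is that the filter is declared once, as a query over the stream, rather than being re-implemented as state checks inside an event handler.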




And that's it. These are some of the cooler features available in .NET 4.0. There are several more, of course, but these were the ones you asked about!

+93
Jan 30 '10 at 3:31

I like Aaronaught's answer, but I would say that Rx and TPL solve different problems. Part of what the TPL team added is threading primitives and significant enhancements to the building blocks of the runtime, such as the ThreadPool. And everything you list is built on top of these primitives and runtime features.

But TPL and Rx solve two different problems. TPL works best when the program or algorithm is "pulling and queuing." Rx excels when the program or algorithm needs to "react" to data from a stream (for example, mouse input, or a stream of related messages received from an endpoint such as WCF).

You would need the "unit of work" concept from TPL to do work such as processing the file system, iterating over a collection, or walking a hierarchy such as an org chart. In each of these cases, the programmer can reason about the overall amount of work, the work can be broken down into chunks of a certain size (Tasks), and in the case of computations over a hierarchy, the tasks can be "chained" together. So certain types of work lend themselves to the "Task Hierarchy" model of TPL, and benefit from plumbing enhancements such as cancellation (see the Channel 9 video on CancellationTokenSource). TPL also has lots of knobs for specialized domains, such as near-real-time data processing.
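A small sketch of the two ideas mentioned here, chaining units of work and cancelling them, using the .NET 4.0 Task API. The names `TaskSketch` and `SumAsync` are invented for the example, and summing numbers is only a placeholder for a real unit of work.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class TaskSketch
{
    // One unit of work: sums 1..n on a pool thread, honouring cancellation.
    public static Task<long> SumAsync(int n, CancellationToken token)
    {
        return Task.Factory.StartNew(() =>
        {
            long sum = 0;
            for (int i = 1; i <= n; i++)
            {
                token.ThrowIfCancellationRequested();
                sum += i;
            }
            return sum;
        }, token);
    }

    public static void Main()
    {
        using (var cts = new CancellationTokenSource())
        {
            // "Chain" a second task that runs only if the first one succeeds.
            Task<string> report = SumAsync(1000, cts.Token)
                .ContinueWith(t => "sum = " + t.Result,
                              TaskContinuationOptions.OnlyOnRanToCompletion);
            Console.WriteLine(report.Result); // sum = 500500
        }

        using (var cts = new CancellationTokenSource())
        {
            cts.Cancel(); // cancel before the work has a chance to start
            Task<long> cancelled = SumAsync(1000, cts.Token);
            try { cancelled.Wait(); }
            catch (AggregateException) { /* wraps a TaskCanceledException */ }
            Console.WriteLine(cancelled.IsCanceled); // True
        }
    }
}
```

Passing the token both into the loop and to `StartNew` is what lets the task end up in the Canceled state rather than Faulted.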

Rx will be what most developers should end up using. It is how WPF applications can "react" to external messages, such as external data (a stream of IM messages to an IM client) or external input (like the mouse drag example linked from Aaronaught's answer). Under the covers, Rx uses threading primitives from TPL / BCL, thread-safe collections from TPL / BCL, and runtime objects such as the ThreadPool. In my mind, Rx is the "highest level" of programming for expressing your intentions.

Whether the average developer can wrap their head around the set of intentions you can express with Rx remains to be seen. :)

But I think that over the next couple of years TPL vs. Rx will be the next debate, like LINQ-to-SQL vs. Entity Framework. There are two flavors of API in the same domain that are specialized for different scenarios but overlap in a lot of ways. In the case of TPL and Rx, though, they are actually aware of each other, and there are built-in adapters for composing applications that use both frameworks together (for example, feeding results from a PLINQ loop into an IObservable Rx stream). For people who have not done any parallel programming, there is a ton of learning to do to get up to speed.

Update: I have been using both TPL and RxNet in my regular job for the past 6 months (of the 18 months since my original answer). My thoughts on choosing TPL and/or RxNet in a middle-tier WCF service (an enterprise LOB service): http://yzorgsoft.blogspot.com/2011/09/middle-tier-tpl-andor-rxnet.html

+28
02 Feb '10 at


