TPL Dataflow vs BlockingCollection

Question


I understand that BlockingCollection is best suited to the producer/consumer pattern. However, when should I use an ActionBlock from the TPL Dataflow library?

My initial understanding is that BlockingCollection suits I/O operations, while ActionBlock suits CPU-intensive operations. But I feel that this is not the whole story... Any additional insight?

+6

.net task-parallel-library tpl-dataflow data-synchronization

Andrew Jan 16 '14 at 13:44


2 answers

i3arnon · Answer 1 · 2014-01-16T15:10:05+0000

TPL Dataflow is better suited to an actor-based design. That means that if you want to chain producers and consumers together, it is much easier with TDF.

Another big plus for TPL Dataflow is that it was built with async in mind. You can produce and consume both synchronously and asynchronously (and mix the two), which is very useful. (I mostly produce synchronously and consume with a non-blocking async method.)

You can also easily set a bounded capacity and a degree of parallelism.
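As a rough illustration of the points above, here is a minimal sketch of an ActionBlock with a bounded capacity, a degree of parallelism, and an async consumer. The class name, the `SumAsync` helper, and the numbers (capacity 10, parallelism 4, a 10 ms delay) are made up for the example, and it assumes the System.Threading.Tasks.Dataflow NuGet package is referenced.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow; // assumes the System.Threading.Tasks.Dataflow package

public static class BoundedActionBlockDemo
{
    // Hypothetical helper: sends 1..count into an ActionBlock and
    // returns the sum accumulated by the block's handler.
    public static async Task<int> SumAsync(int count)
    {
        int sum = 0;

        var options = new ExecutionDataflowBlockOptions
        {
            BoundedCapacity = 10,       // the block refuses new items beyond this limit
            MaxDegreeOfParallelism = 4  // up to 4 items are handled concurrently
        };

        // The consumer is an async delegate, so waiting does not tie up a thread.
        var consumer = new ActionBlock<int>(async item =>
        {
            await Task.Delay(10);            // stand-in for real async work
            Interlocked.Add(ref sum, item);  // handlers may run in parallel
        }, options);

        for (int i = 1; i <= count; i++)
        {
            // SendAsync waits asynchronously while the block is full;
            // Post would return false instead of waiting.
            await consumer.SendAsync(i);
        }

        consumer.Complete();        // signal that no more input is coming
        await consumer.Completion;  // wait until every queued item is handled
        return sum;
    }

    public static async Task Main()
    {
        Console.WriteLine(await SumAsync(100)); // prints 5050
    }
}
```

The producer here is async too, but a synchronous producer could call `Post` instead and check its return value.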

TL;DR: BlockingCollection is a simple and general tool. TPL Dataflow is much more robust, but can be overkill or a poor fit for specific problems.

bornfromanegg · Answer 2 · 2014-02-09T11:51:22+0000

I'm not sure whether the reuse of the word "Block" is causing confusion. These are different things.

You are right that BlockingCollection is well suited to the producer-consumer situation, since it blocks any attempt to read from it until data is available. However, BlockingCollection is not part of TPL Dataflow. It was introduced in .NET 4.0 as one of the new thread-safe collection types.
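For comparison, a minimal producer-consumer sketch with BlockingCollection might look like the following. The class name, the `Run` helper, and the capacity of 5 are only illustrative choices.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class BlockingCollectionDemo
{
    // Hypothetical helper: one producer adds 0..count-1, one consumer collects them.
    public static List<int> Run(int count)
    {
        var received = new List<int>();

        // With a bounded capacity, Add blocks the producer while 5 items are waiting.
        using (var queue = new BlockingCollection<int>(boundedCapacity: 5))
        {
            var producer = Task.Run(() =>
            {
                for (int i = 0; i < count; i++)
                    queue.Add(i);
                queue.CompleteAdding(); // lets the consumer's loop finish
            });

            var consumer = Task.Run(() =>
            {
                // Blocks the thread until an item arrives or adding is complete.
                foreach (int item in queue.GetConsumingEnumerable())
                    received.Add(item);
            });

            Task.WaitAll(producer, consumer);
        }

        return received;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", Run(5))); // prints 0,1,2,3,4
    }
}
```

Note that the consumer holds a thread while it waits, which is the main practical difference from an async Dataflow block.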

An ActionBlock, however, is a "Block" type defined by TPL Dataflow, and it can be used to perform an action. "Block" in this sense refers to its use as a building block of a dataflow.

The dataflows defined in TPL Dataflow are made up of blocks, and there are three main types. From the documentation:

The TPL Dataflow Library consists of dataflow blocks, which are data structures that buffer and process data. The TPL defines three kinds of dataflow blocks: source blocks, target blocks, and propagator blocks. A source block acts as a source of data and can be read from. A target block acts as a receiver of data and can be written to. A propagator block acts as both a source block and a target block, and can be read from and written to. The TPL defines the System.Threading.Tasks.Dataflow.ISourceBlock<TOutput> interface to represent sources, System.Threading.Tasks.Dataflow.ITargetBlock<TInput> to represent targets, and System.Threading.Tasks.Dataflow.IPropagatorBlock<TInput, TOutput> to represent propagators. IPropagatorBlock<TInput, TOutput> inherits from both ISourceBlock<TOutput> and ITargetBlock<TInput>. The TPL Dataflow Library provides several predefined dataflow block types that implement the ISourceBlock<TOutput>, ITargetBlock<TInput>, and IPropagatorBlock<TInput, TOutput> interfaces. These dataflow block types are described in this document in the section "Predefined Dataflow Block Types".

ActionBlock is an ITargetBlock type that takes an input, performs an action on it, and then stops: it produces no output for another block to consume.
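To make the block kinds concrete, here is a small sketch that links a propagator (a TransformBlock) to a target (an ActionBlock). The class name, the `RunAsync` helper, and the squaring step are invented for illustration, and it assumes the System.Threading.Tasks.Dataflow NuGet package is referenced.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow; // assumes the System.Threading.Tasks.Dataflow package

public static class PipelineDemo
{
    // Hypothetical helper: pushes numbers through a two-block pipeline
    // and returns the strings that reach the final block.
    public static async Task<List<string>> RunAsync(IEnumerable<int> inputs)
    {
        var results = new List<string>();

        // Propagator: a target for ints and a source of strings.
        IPropagatorBlock<int, string> square =
            new TransformBlock<int, string>(n => $"{n} squared is {n * n}");

        // Target only: consumes strings and emits nothing.
        // The default MaxDegreeOfParallelism is 1, so List.Add is safe here.
        ITargetBlock<string> collect = new ActionBlock<string>(s => results.Add(s));

        // Forward output from the propagator to the target, including completion.
        square.LinkTo(collect, new DataflowLinkOptions { PropagateCompletion = true });

        foreach (int n in inputs)
            square.Post(n); // the block is unbounded, so Post accepts every item

        square.Complete();         // no more input for the first block
        await collect.Completion;  // completes once the last item is handled
        return results;
    }

    public static async Task Main()
    {
        foreach (string line in await RunAsync(new[] { 1, 2, 3 }))
            Console.WriteLine(line);
    }
}
```

Adding another stage is a matter of inserting one more block and one more `LinkTo` call, which is where Dataflow starts to pay off over a hand-rolled queue.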

To answer your first question, I would say that you can use BlockingCollection when your process is simple. You should use TPL Dataflow when your process becomes more complex, in which case you probably won't need a BlockingCollection.

There are examples of the producer-consumer problem using BlockingCollection here: http://blogs.msdn.com/b/csharpfaq/archive/2010/08/12/blocking-collection-and-the-producer-consumer-problem.aspx?Redirected=true and here: http://programmerfindings.blogspot.co.uk/2012/07/producer-consumer-problem-using-tpl-and.html

Neither of them uses Dataflow. There is an example of using Dataflow here:

http://msdn.microsoft.com/en-us/library/hh228601(v=vs.110).aspx

I also highly recommend reading the TPL Dataflow documentation if you are implementing anything complicated:

http://msdn.microsoft.com/en-us/library/hh228601(v=vs.110).aspx