What local file I/O patterns does dispatch_io address?

I'm a big fan of Grand Central Dispatch, and recently I've been looking at the dispatch_io_* family of calls. It's easy to see how this API is useful for network I/O (slow, high latency). However, the existence of dispatch_io_create_with_path implies it is also meant for use against the local file system. (Yes, I know a path can also point to a resource on a remote network file system, but still...) Playing with it, I noticed that going through dispatch_io_* seems to incur significant overhead compared to simple blocking I/O calls (read, write, etc.). This slowdown appears to come mostly from the kernel synchronization primitives used as blocks are marshaled between queues. In the sample workload I experimented with (heavily I/O bound), the slowdown could be as bad as 10x. At first glance, it looks like dispatch_io will never be a win for small-granularity I/O.
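
For concreteness, here is a minimal sketch (not my actual benchmark) of what I mean by reading a local file through dispatch_io_create_with_path; the path handling, queue choice, and semaphore-based waiting are purely illustrative:

    #include <dispatch/dispatch.h>
    #include <fcntl.h>
    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    static void read_with_dispatch_io(const char *path) {
        dispatch_queue_t queue = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
        dispatch_semaphore_t done = dispatch_semaphore_create(0);

        // The cleanup handler runs once the channel is closed or fails to open.
        dispatch_io_t channel = dispatch_io_create_with_path(
            DISPATCH_IO_STREAM, path, O_RDONLY, 0, queue,
            ^(int error) { if (error) fprintf(stderr, "channel error: %d\n", error); });
        if (!channel) return;

        // Read until EOF; the handler is invoked once per delivered chunk.
        dispatch_io_read(channel, 0, SIZE_MAX, queue,
            ^(bool finished, dispatch_data_t data, int error) {
                if (data) printf("got %zu bytes\n", dispatch_data_get_size(data));
                if (finished || error) {
                    dispatch_io_close(channel, 0);
                    dispatch_semaphore_signal(done);
                }
            });

        dispatch_semaphore_wait(done, DISPATCH_TIME_FOREVER);
        // (dispatch_release calls omitted for brevity; unnecessary under ARC.)
    }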

I believe that in the common case of a single machine with a single, local, physical storage volume, I/O requests are effectively serialized at the device level. From there, I found myself with these two thoughts:

  • If your workload is CPU bound, then by definition you can already read and write data faster than you can process it.
  • If your workload is I/O bound (in this scenario, against a single, local, physical volume), using dispatch_io can't make your disk move data any faster.

From there, I thought that maybe the sweet spot for this API is somewhere in the middle: a workload that oscillates between being CPU bound and I/O bound. But at this point I've thought myself into a corner, so I decided to ask StackOverflow.

I will accept the first answer that describes a "real world" workload under these constraints (i.e., a single machine, one local physical disk) for which using dispatch_io delivers a significant performance improvement.

1 answer

The primary use case for dispatch I/O against the local file system is asynchronous I/O on large files, or on many files being read/written at the same time (especially when the file contents can be processed incrementally).
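
For illustration, a minimal sketch of that pattern: many files read concurrently, one channel each, with every chunk processed as it arrives. process_chunk() is a hypothetical stand-in for whatever incremental work (hashing, parsing, etc.) the real workload would do:

    #include <dispatch/dispatch.h>
    #include <fcntl.h>
    #include <stdbool.h>
    #include <stdint.h>

    // Hypothetical per-chunk worker; not part of any real API.
    extern void process_chunk(const char *path, dispatch_data_t chunk);

    static void read_many_files(const char *paths[], size_t count) {
        dispatch_queue_t queue = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
        dispatch_group_t group = dispatch_group_create();

        for (size_t i = 0; i < count; i++) {
            const char *path = paths[i];
            dispatch_io_t channel = dispatch_io_create_with_path(
                DISPATCH_IO_STREAM, path, O_RDONLY, 0, queue, ^(int error) {});
            if (!channel) continue;

            dispatch_group_enter(group);
            dispatch_io_read(channel, 0, SIZE_MAX, queue,
                ^(bool finished, dispatch_data_t data, int error) {
                    if (data) process_chunk(path, data);  // CPU work overlaps in-flight I/O
                    if (finished || error) {
                        dispatch_io_close(channel, 0);
                        dispatch_group_leave(group);
                    }
                });
        }
        dispatch_group_wait(group, DISPATCH_TIME_FOREVER);
    }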

Dispatch I/O against local file systems is optimized for throughput over latency (for example, it performs chunking and advisory read-ahead of the underlying I/O system calls to improve pipelining and throughput along the I/O path through the kernel and driver).
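
As a side note (not claimed in the original answer), the chunk sizes a channel delivers can be influenced with the water-mark APIs; the values below are arbitrary and only illustrate the knobs:

    #include <dispatch/dispatch.h>

    // Ask an existing channel to buffer at least 64 KB and at most 1 MB per
    // handler invocation, trading a little latency for fewer, larger chunks.
    static void tune_channel_chunking(dispatch_io_t channel) {
        dispatch_io_set_low_water(channel, 64 * 1024);
        dispatch_io_set_high_water(channel, 1024 * 1024);
    }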

Because it executes the I/O system calls asynchronously on a background thread, dispatch I/O will never beat the latency of reading a small file with a blocking system call, especially when there is no other I/O activity going on.
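
(For comparison, the blocking baseline being referred to is just a plain open()/read() loop along these lines; the buffer handling is simplified for illustration:)

    #include <fcntl.h>
    #include <unistd.h>

    // Synchronously read up to buflen bytes of a file into buf; for a single
    // small file with no other I/O in flight, this has lower latency than
    // routing the same read through a dispatch_io channel.
    static ssize_t read_file_blocking(const char *path, char *buf, size_t buflen) {
        int fd = open(path, O_RDONLY);
        if (fd < 0) return -1;
        size_t total = 0;
        ssize_t n = 0;
        while (total < buflen && (n = read(fd, buf + total, buflen - total)) > 0)
            total += (size_t)n;
        close(fd);
        return (n < 0) ? -1 : (ssize_t)total;
    }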

The GCD session from WWDC11 goes into detail on dispatch I/O and includes an example comparing the throughput improvements it achieves over plain read() system calls when reading many files of various sizes.

