From a programmer's point of view, blocking I/O is easier to use than non-blocking I/O. You simply call the read/write function, and when it returns, you are done. With non-blocking I/O, you need to check whether you can read/write, then read/write, and then check the return values. If not everything was read or written, you need a mechanism to try again later, when the read or write can complete.
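That extra bookkeeping can be seen in a minimal sketch using a pipe (the pipe and the `outcome`/`data` variables are just for illustration): a non-blocking read on an empty descriptor does not wait, it fails immediately, and the caller must handle that case.

```python
import os

# Create a pipe and switch the read end to non-blocking mode.
r, w = os.pipe()
os.set_blocking(r, False)

# Nothing has been written yet: a blocking read would sit here forever,
# while a non-blocking read fails immediately with BlockingIOError.
try:
    os.read(r, 1024)
    outcome = "got data"
except BlockingIOError:
    outcome = "would block"

# Once data is available, the same non-blocking read succeeds.
os.write(w, b"hello")
data = os.read(r, 1024)

os.close(r)
os.close(w)
print(outcome, data)  # would block b'hello'
```

With a blocking descriptor, the whole `try`/retry dance disappears, which is exactly why blocking I/O is the easier API.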
In terms of performance: non-blocking I/O on a single thread is no faster than blocking I/O on a single thread. The speed of an I/O operation is determined by the device (for example, a hard disk) that is read from or written to, not by whether the caller waits (blocking) or does not wait (non-blocking). Also, if you call a blocking I/O function, the OS can do the blocking/waiting quite efficiently. If you need to implement the waiting in your application yourself, you can do it almost as well as the OS, but you can also do it worse.
So why do programmers make their lives harder and implement non-blocking I/O? Because, and this is the key point, their program has more to do than a single I/O operation. With blocking I/O, you have to wait until that I/O is complete. With non-blocking I/O, you can perform some computation until the I/O has completed. And of course, during non-blocking I/O, you can also start other I/O operations (blocking or non-blocking).
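A rough sketch of that overlap, using a pipe as a stand-in for a slow device (the delayed writer thread and `work_units` counter are illustrative assumptions, not part of any real API): the main thread polls for readiness and spends every "not ready yet" moment doing computation instead of sleeping inside a blocking call.

```python
import os
import select
import threading

r, w = os.pipe()
os.set_blocking(r, False)

# A stand-in for a slow device: another thread delivers the data
# only after a short delay.
threading.Timer(0.05, os.write, args=(w, b"payload")).start()

work_units = 0
data = None
while data is None:
    # Poll with timeout 0: "is there data right now?" -- never block.
    readable, _, _ = select.select([r], [], [], 0)
    if readable:
        data = os.read(r, 4096)
    else:
        work_units += 1  # interleave a slice of computation while waiting

os.close(r)
os.close(w)
print(data)
```

With a blocking read, all the iterations counted in `work_units` would instead have been spent idle inside the read call.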
Another approach to non-blocking I/O is to throw more threads with blocking I/O at the problem, but as the SO answer you linked states, threads come with a cost. That cost is higher than the cost of (OS-supported) non-blocking I/O.
If you have an application with massive I/O but only low CPU usage, such as a web server with many clients, then use a few threads with non-blocking I/O. With blocking I/O you would need many threads -> high cost, so use only a few threads -> which requires non-blocking I/O.
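This is the pattern behind `select`/`epoll`-based servers. A tiny sketch with Python's `selectors` module, simulating "many clients" with in-process socket pairs (a real server would `accept()` connections from a listening socket instead): one thread, one loop, serving whichever connection happens to be ready.

```python
import selectors
import socket

sel = selectors.DefaultSelector()

# Simulate several clients with in-process socket pairs.
pairs = [socket.socketpair() for _ in range(3)]
for server_side, _ in pairs:
    server_side.setblocking(False)          # non-blocking on the server side
    sel.register(server_side, selectors.EVENT_READ)

for i, (_, client_side) in enumerate(pairs):
    client_side.sendall(b"request %d" % i)

# Single thread: handle whichever connection is ready, in any order.
replies = 0
while replies < len(pairs):
    for key, _ in sel.select():
        request = key.fileobj.recv(4096)
        key.fileobj.sendall(b"echo: " + request)
        replies += 1

responses = sorted(c.recv(4096) for _, c in pairs)
print(responses)

sel.close()
for s, c in pairs:
    s.close()
    c.close()
```

The same three clients served with blocking I/O would need either three threads or a fixed serving order; the readiness loop needs neither.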
If you have a CPU-intensive application, such as a program that reads a file, performs intensive calculations on the complete data, and writes the result to another file, then 99% of the time is spent in the CPU part. So create multiple threads (for example, one per core) and do the computation in parallel. For the I/O, you will most likely stick with blocking I/O, because it is easier to implement and because the program has nothing else to do while it waits anyway.
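A sketch of that shape (file name and `crunch` function are made up for the example): plain blocking reads, because nothing useful could run during them anyway, and a worker pool for the CPU-bound part. Note that for pure-Python number crunching a process pool would sidestep the GIL; a thread pool is used here only to keep the sketch self-contained.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

# Blocking I/O for the input: the program cannot proceed without the data.
path = os.path.join(tempfile.mkdtemp(), "input.txt")
with open(path, "w") as f:
    f.write("\n".join(str(n) for n in range(1000)))

with open(path) as f:
    numbers = [int(line) for line in f]

# The CPU-bound part is split across a pool of workers, e.g. one per core.
def crunch(chunk):
    return sum(n * n for n in chunk)

chunks = [numbers[i::4] for i in range(4)]
with ThreadPoolExecutor(max_workers=4) as pool:
    total = sum(pool.map(crunch, chunks))

print(total)
```

The structure, not the arithmetic, is the point: the I/O stays simple and blocking while only the computation is parallelized.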
If you have an application that is both CPU-intensive and I/O-intensive, you would also use multiple threads together with non-blocking I/O. Think of a web server with many clients and web-page requests where heavy computation is done in a CGI script: while waiting for I/O on one connection, the program can compute the result for another connection. Or think of a program that reads a large file and does intensive calculations on chunks of it (for example, computing an average, or adding 1 to all values). In that case you can use non-blocking reads, and while waiting for the next read to finish, you can already compute on the data that is available. If the result file is only a small condensed value (for example, an average), you can use a blocking write for the result. If the result file is as large as the input file, as with "all values +1", you can write the results with a non-blocking write, and while the write is in progress, you can already compute on the next chunk.
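The read-compute overlap for the "all values +1" case can be sketched with a background reader thread feeding a bounded queue, a thread-based stand-in for true non-blocking reads (the chunk lists and `produce_chunks` helper are hypothetical; a real program would read blocks of a file):

```python
import queue
import threading

# Stand-in for a chunked input source; a real program would read
# fixed-size blocks of a large file here.
def produce_chunks(out, chunks):
    for chunk in chunks:
        out.put(chunk)      # "a read has completed"
    out.put(None)           # end of input

chunks = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
q = queue.Queue(maxsize=1)  # at most one chunk "in flight" at a time
threading.Thread(target=produce_chunks, args=(q, chunks)).start()

# The main thread computes on each chunk while the reader fetches
# the next one in the background.
results = []
while (chunk := q.get()) is not None:
    results.append([n + 1 for n in chunk])  # "all values +1"

print(results)  # [[2, 3, 4], [5, 6, 7], [8, 9, 10]]
```

The same double-buffering idea applies on the output side: hand a finished chunk to a writer while computing the next one.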