You need to be careful here. It is very easy to get completely fictitious test results from such code, results that are never reproduced in real use. The culprit is the file system cache: it caches the data that you read from the file. The trouble starts when you run your test again and again while tweaking the code and looking for improvements.
The second and every subsequent time the test runs, the data no longer comes off the disk. It is still present in the cache, so only a memory-to-memory copy is needed to get it into your buffer. That is very fast, a microsecond or so of overhead plus the time it takes to copy, which runs at 5 gigabytes per second or better on a modern machine.
Your test will now show that you spend a lot of time allocating the buffer and processing the data, relative to the time spent reading it.
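A minimal sketch of the effect, assuming a sufficiently large file exists at a hypothetical path "bigfile.bin": timing the same read repeatedly will usually show the first pass much slower than the rest, because the later passes are served from the file system cache rather than the disk.

```python
import time

# Hypothetical test file; assumed to exist and to be large enough
# (hundreds of MB) for the timing difference to be visible.
PATH = "bigfile.bin"

for run in range(3):
    start = time.perf_counter()
    with open(PATH, "rb") as f:
        data = f.read()          # allocate a buffer and read the whole file
    elapsed = time.perf_counter() - start
    print(f"run {run}: {len(data)} bytes in {elapsed:.3f} s "
          f"({len(data) / elapsed / 1e6:.0f} MB/s)")
```

Only the first run resembles what a user with a cold cache will see; benchmarking against the later runs measures the memory copy, not the disk.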
That is rarely reproduced in real use. The data will not yet be in the cache; now the disk drive has to seek to the data (many milliseconds) and the data has to be read off the drive (a few dozen megabytes per second, at best). Reading the data now takes a good three or four orders of magnitude longer. If you manage to make the processing step twice as fast, your program will only run about 0.05% faster. Give or take.
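The rough arithmetic behind that 0.05% figure, assuming the cold read takes about 1000x as long as the processing step (three orders of magnitude; the exact ratio depends on the disk and the data size):

```python
# Arbitrary time units; the 1000:1 ratio is an assumption, not a measurement.
process = 1.0                 # time spent processing the data
read = 1000.0                 # cold read from disk, ~3 orders of magnitude slower
before = read + process
after = read + process / 2    # processing step made twice as fast
print(f"overall speedup: {(1 - after / before) * 100:.2f}%")   # ~0.05%
```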
Hans Passant