I think you pretty much hit the nail on the head.
Parallel operations as a whole are always constrained by the point at which you run out of resources to run them on, and even before you reach that point you get diminishing returns from each additional parallel thread.
Jeff Atwood wrote an interesting tweet, which I'll add here later, showing diminishing returns from adding more threads on multi-threaded processors. Of course, that is not exactly the same situation. But it is worth looking at, because even if you had 100 files on 100 hard drives, somewhere along the way the IO gets funneled down a single channel, which will reduce the gain in read speed.
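To put rough numbers on the shared-channel point, here is a minimal sketch in Python. It is a toy model, not a benchmark: the function names and the figures (100 MB/s per disk, 600 MB/s for the shared channel) are assumptions made up for illustration, and real hardware has more bottlenecks than this.

```python
# Toy model: each disk can deliver per_disk_mbps on its own, but all
# reads eventually pass through one shared channel capped at channel_mbps.
# Both figures are hypothetical, chosen only to show the shape of the curve.

def aggregate_throughput(readers, per_disk_mbps=100.0, channel_mbps=600.0):
    """Total read rate (MB/s) with `readers` disks read in parallel."""
    return min(readers * per_disk_mbps, channel_mbps)

def speedup(readers, per_disk_mbps=100.0, channel_mbps=600.0):
    """Speedup relative to reading from a single disk."""
    return aggregate_throughput(readers, per_disk_mbps, channel_mbps) / per_disk_mbps

if __name__ == "__main__":
    for n in (1, 2, 4, 8, 16, 100):
        print(f"{n:>3} parallel readers -> {speedup(n):.1f}x")
```

With these made-up numbers the speedup grows linearly up to 6 readers and then stays flat at 6x, so reader 100 buys nothing that reader 6 did not already give you.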
What I'm basically trying to say is that just running something in parallel does not mean it will go faster; you have to consider how the parallel processes actually work.
msarchet