It is probably disk seek time that is the limiting factor (this is one of the most common bottlenecks when running Make, which usually involves lots of small files). Naive file system designs keep a directory entry that points to the disk blocks holding the file's data, and that guarantees a minimum of one seek per file.
If you are using Windows, I would switch to NTFS, which stores small files in the directory entry itself (-> saves one disk seek per file). We also use disk compression (more computation, but CPUs are cheap and fast, and less space on disk -> less read time); this may not be relevant if your files are all small. There may be an equivalent Linux file system, if that is where you are.
Yes, you should launch a bunch of threads to read the files:
forall filename in list: fork( open filename, process file, close filename)
You may need to throttle this to avoid running out of threads, but I would shoot for hundreds, not 2 or 3. If you do this, you are telling the OS that it can read lots of places on the disk, and it can order the multiple requests by disk placement (elevator algorithm), which will also help minimize head movement.
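As a rough sketch of the throttled version of that loop, here is one way it might look in Python, using a thread pool whose size caps the number of outstanding reads. The names process_file and process_all and the default of 200 workers are my own assumptions for illustration, and "processing" here is just counting bytes; substitute whatever work you actually do per file.

```python
from concurrent.futures import ThreadPoolExecutor


def process_file(path):
    """Open, process, and close one file (placeholder work: count its bytes)."""
    with open(path, "rb") as f:
        return len(f.read())


def process_all(paths, max_workers=200):
    """Read many files concurrently, with at most max_workers threads in flight.

    Returns a dict mapping each path to the result of process_file.
    """
    paths = list(paths)
    if not paths:
        return {}
    # Never start more threads than there are files to read.
    workers = min(max_workers, len(paths))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so results line up with paths.
        results = list(pool.map(process_file, paths))
    return dict(zip(paths, results))
```

The pool size is the throttle: it bounds how many reads are pending at once while still giving the OS many requests to reorder. The right number depends on your disk and OS, so treat 200 as a starting point to measure against, not a recommendation.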
Ira Baxter