I have the following problem. The data set is split into about 10k small files (roughly 8-16 KiB each). Depending on user input, I have to load them as quickly as possible and process them. More precisely, each data packet can consist of anywhere from 100 to 100k files, and there are about 1,000 data packets, though most of them are on the smaller end.
Currently I use a thread pool: on every file access, the next free thread opens the file, reads it, and returns the data prepared for display. Since the number of files will keep growing, I am not happy with this approach, especially as it will most likely end up somewhere around 100k files or more (deploying that many files will be fun, of course ;)).
So the idea is to combine all the tiny files belonging to one data packet into a single large file and read from that. I can guarantee it will be read-only, but I do not know how many threads will access the same file concurrently (I do know the maximum). This would give me about 1,000 files of reasonable size, and I can easily add new data packets.
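The packing step itself is straightforward: concatenate the tiny files and keep an index of (offset, length) per logical file, so each later access is one positional read. A minimal sketch (the function names and the in-memory index format are my own, hypothetical choices, not anything from the question; `os.pread` is the POSIX call, the Windows equivalent would be `ReadFile` with an `OVERLAPPED` offset):

```python
import io
import os
import tempfile

def pack_files(blobs):
    """Concatenate many small blobs into one archive body and build an
    index mapping file id -> (offset, length). Hypothetical format."""
    index = {}
    body = io.BytesIO()
    for file_id, data in blobs.items():
        index[file_id] = (body.tell(), len(data))
        body.write(data)
    return body.getvalue(), index

def read_record(path, index, file_id):
    """Fetch one logical file from the packed archive with a single
    positional read; no shared file-position state is involved."""
    offset, length = index[file_id]
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.pread(fd, length, offset)  # POSIX-only in Python
    finally:
        os.close(fd)

# Pack ten fake "tiny files" and read one back.
blobs = {i: bytes([i]) * 100 for i in range(10)}
payload, index = pack_files(blobs)
path = os.path.join(tempfile.mkdtemp(), "packet.bin")
with open(path, "wb") as f:
    f.write(payload)
assert read_record(path, index, 7) == bytes([7]) * 100
```

In practice the index would be persisted at the head or tail of the packed file so it can be rebuilt on startup.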
The question is: how can I allow 1..N threads to read efficiently from a single file in this scenario? I can use asynchronous I/O on Windows, but it tends to complete synchronously for reads below 64k. Memory-mapping the file is not an option, as the expected total size is >1.6 GiB and I still need to be able to run on x86 (unless I can efficiently map a small window, read it, and unmap it again; but in my experience memory mapping carried quite a bit of overhead compared to a single read).
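One pattern worth noting here: positional reads (`pread` on POSIX, `ReadFile` with an `OVERLAPPED` offset on Windows) take the offset as an argument instead of using the shared file position, so any number of threads can read through a single descriptor without locking around the read. A small sketch of that, assuming fixed-size records purely for the demo:

```python
import os
import tempfile
import threading

RECORD = 4096

# Build a demo file of 64 fixed-size records.
path = os.path.join(tempfile.mkdtemp(), "shared.bin")
with open(path, "wb") as f:
    for i in range(64):
        f.write(bytes([i]) * RECORD)

fd = os.open(path, os.O_RDONLY)   # ONE shared handle for all threads
results = {}
lock = threading.Lock()           # protects the results dict, not the I/O

def worker(record_no):
    # pread carries its own offset, so threads never race on the
    # file position and need no lock around the read itself.
    data = os.pread(fd, RECORD, record_no * RECORD)
    with lock:
        results[record_no] = data

threads = [threading.Thread(target=worker, args=(i,)) for i in range(64)]
for t in threads:
    t.start()
for t in threads:
    t.join()
os.close(fd)

assert all(results[i] == bytes([i]) * RECORD for i in range(64))
```

This sidesteps both the handle-per-thread explosion and the seek/read race that a shared file position would cause.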
I thought about opening each data packet N times and handing each thread a handle in round-robin fashion, but the problem is that this can end up at (number of packet files) x (maximum thread count) open handles (easily 8-16k), and I would either need to synchronize every access to a data packet or resort to lock-free magic to grab the next free file descriptor.
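If per-thread handles are still wanted (e.g. on a platform without positional reads), the handle count can be capped with a small blocking pool per packet instead of one handle per (packet, thread) pair. A sketch, assuming a made-up `HandlePool` class (not an existing API); the blocking `Queue` is exactly the per-packet synchronization the question worries about, but it only engages when all K handles are busy:

```python
import os
import queue
import tempfile

class HandlePool:
    """Blocking pool of K descriptors for one packet file, instead of
    (packets x threads) open handles. Hypothetical sketch."""
    def __init__(self, path, k):
        self._q = queue.Queue()
        for _ in range(k):
            self._q.put(os.open(path, os.O_RDONLY))

    def pread(self, length, offset):
        fd = self._q.get()          # blocks only while all K handles are busy
        try:
            return os.pread(fd, length, offset)
        finally:
            self._q.put(fd)         # hand the descriptor back

    def close(self):
        while not self._q.empty():
            os.close(self._q.get())

# Demo: 2 handles serve any number of callers.
path = os.path.join(tempfile.mkdtemp(), "packet.bin")
with open(path, "wb") as f:
    f.write(b"A" * 1000 + b"B" * 1000)

pool = HandlePool(path, k=2)
assert pool.pread(1000, 1000) == b"B" * 1000
pool.close()
```

With K around the number of I/O threads actually in flight per packet, the total handle count stays bounded by the thread count rather than by packets times threads.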
This does not seem like a novel problem (I imagine any database engine has a similar structure: M tables (data packets) with N rows (files, in my case), where you want to allow as many threads as possible to read rows concurrently). So what is the recommended practice here? BTW, it has to work on Windows and Linux, so portable approaches are welcome (or at least approaches that work on both platforms, even if they use different underlying APIs, as long as they can be wrapped).
[EDIT] It's not about throughput, it's about hiding latency. I read maybe 100 of these tiny files per second, so I'm at about 1 MiB/s at best. My main cost is seek time (since my access pattern is not predictable), and I want to hide it by issuing the reads in the background while showing stale data to the user. The question is how to allow multiple threads to issue I/O requests across multiple files, possibly with more than one thread hitting the same file.
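The latency-hiding part described in this edit can be decoupled from the file layout: submit the read to a small I/O thread pool, return the cached (stale) data immediately, and swap in the fresh data when the future completes. A minimal sketch, assuming a hypothetical `cache`/`request` setup and a `time.sleep` standing in for the ~70 ms seek+read:

```python
from concurrent.futures import ThreadPoolExecutor
import time

cache = {"packet-42": b"old data"}        # last known contents, shown immediately
pool = ThreadPoolExecutor(max_workers=4)  # I/O threads that absorb seek latency

def slow_read(key):
    time.sleep(0.05)                      # stand-in for a ~70 ms disk seek + read
    return b"fresh data"

def request(key):
    # Return stale data at once; the refresh arrives in the background.
    future = pool.submit(slow_read, key)
    future.add_done_callback(lambda f: cache.__setitem__(key, f.result()))
    return cache[key]

assert request("packet-42") == b"old data"  # caller never blocks on the read
pool.shutdown(wait=True)                    # by now the callback has run
assert cache["packet-42"] == b"fresh data"
```

The caller-facing latency is then a dictionary lookup, and the 70 ms reads only determine how stale the displayed data can get.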
It really is not a problem if one of the requests takes 70 ms or so to complete, but I cannot afford a read that blocks the caller.