You should also profile your code to see where the bottleneck actually is. Maybe it is in the kernel, maybe you are at your hardware's limit. Until you profile and find out, you are stumbling around in the dark.
EDIT:
Well, a more thorough answer then. According to the Boost.Iostreams documentation, basic_file_source is just a wrapper around std::filebuf, which in turn is built on std::streambuf. To quote the documentation:
CopyConstructible and Assignable wrapper for a std::basic_filebuf opened in read-only mode.
std::streambuf provides a pubsetbuf method (maybe not the best link, but the first one Google turned up), which you apparently can use to control the buffer size.
For example:
    #include <fstream>

    int main()
    {
        char buf[4096];
        std::ifstream f;
        // pubsetbuf() must be called before the file is opened to take effect
        f.rdbuf()->pubsetbuf(buf, sizeof buf);
        f.open("/tmp/large_file", std::ios::binary);

        // Read the whole file in 1 KiB chunks and throw the data away
        char rbuf[1024];
        while (f.read(rbuf, sizeof rbuf))
            ;
        return 0;
    }
In my test (optimization turned off, though) I actually got worse performance with a 4096-byte buffer than with a 16-byte buffer, but YMMV; a good example of why you should always profile first :)
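If you want to run that kind of comparison yourself, a rough, quick-and-dirty sketch could look like the following. The time_read helper, the /tmp/large_file path and the buffer sizes tried are just placeholders for whatever you actually measure:

    #include <chrono>
    #include <cstddef>
    #include <fstream>
    #include <iostream>
    #include <vector>

    // Time how long it takes to read the whole file with a given filebuf buffer size.
    double time_read(std::size_t buf_size)
    {
        std::vector<char> buf(buf_size);
        std::ifstream f;
        f.rdbuf()->pubsetbuf(buf.data(), buf.size());   // before open(), as above
        f.open("/tmp/large_file", std::ios::binary);

        char chunk[1024];
        auto start = std::chrono::steady_clock::now();
        while (f.read(chunk, sizeof chunk))
            ;                                           // read and discard
        auto stop = std::chrono::steady_clock::now();
        return std::chrono::duration<double>(stop - start).count();
    }

    int main()
    {
        for (std::size_t size : {16, 4096, 65536})
            std::cout << size << " bytes: " << time_read(size) << " s\n";
    }

Keep the caveat about hot and cold disk caches (see the end of this answer) in mind when interpreting the numbers.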
But, as you say, basic_file_source does not provide any means of accessing this, since it hides the underlying filebuf in its private parts.
If you think this is wrong, you can:
- Lobby the Boost developers to expose such functionality, using the mailing list or Trac (the bug tracker).
- Create your own filebuf wrapper that sets the buffer size. There's a section in the tutorial that explains writing custom sources, which might be a good starting point (see the sketch after this list).
- Write your own source that does whatever caching you like.
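To illustrate the custom-source option, here is a minimal sketch following the Source concept from that tutorial. The class name buffered_file_source, the 64 KiB buffer size and the /tmp/large_file path are made up for the example; assume Boost.Iostreams is available:

    #include <boost/iostreams/categories.hpp>  // boost::iostreams::source_tag
    #include <boost/iostreams/stream.hpp>
    #include <cstddef>
    #include <fstream>
    #include <ios>
    #include <memory>
    #include <string>
    #include <vector>

    // A Source that reads through a std::filebuf whose buffer size we choose.
    class buffered_file_source {
    public:
        typedef char char_type;
        typedef boost::iostreams::source_tag category;

        buffered_file_source(const std::string& path, std::size_t buf_size)
            : buf_(new std::vector<char>(buf_size)), file_(new std::filebuf)
        {
            // Set the buffer before opening, just like with std::ifstream
            file_->pubsetbuf(buf_->data(), static_cast<std::streamsize>(buf_->size()));
            file_->open(path.c_str(), std::ios::in | std::ios::binary);
        }

        std::streamsize read(char* s, std::streamsize n)
        {
            std::streamsize got = file_->sgetn(s, n);
            return got > 0 ? got : -1;   // -1 signals end of sequence
        }

    private:
        // shared_ptr keeps the device cheap to copy, as Boost.Iostreams expects
        std::shared_ptr<std::vector<char>> buf_;
        std::shared_ptr<std::filebuf> file_;
    };

    int main()
    {
        buffered_file_source src("/tmp/large_file", 1 << 16);
        boost::iostreams::stream<buffered_file_source> in(src);

        char chunk[1024];
        while (in) {
            in.read(chunk, sizeof chunk);
            std::streamsize got = in.gcount();
            if (got <= 0) break;
            // got bytes of chunk are valid here
        }
        return 0;
    }

The shared_ptr members are there because Boost.Iostreams copies devices around, so all copies must refer to the same underlying filebuf.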
Remember that your hard drive, as well as the kernel, already does caching and buffering when reading files, so I don't think you'll squeeze out much more performance by adding yet another layer of caching.
And finally, a word about profiling. There are lots of powerful profiling tools for Linux, and I don't even know half of them by name, but one example is iotop, which is kind of neat because it is very simple to use. It works much like top, but shows disk-related metrics instead. For example:
    Total DISK READ: 31.23 M/s | Total DISK WRITE: 109.36 K/s
      TID  PRIO  USER       DISK READ   DISK WRITE   SWAPIN      IO>    COMMAND
    19502  be/4  staffan   31.23 M/s     0.00 B/s    0.00 %   91.93 %   ./apa
tells me that my program spends over 90% of its time waiting for I/O, i.e. it is I/O bound. If you need something more powerful, I'm sure Google can help you.
And remember that benchmarking against a hot or a cold disk cache greatly affects the results (on Linux you can drop the page cache between runs by writing to /proc/sys/vm/drop_caches, as root).