C++ ifstream::read is slow due to memcpy

Recently I decided to optimize some file reading I was doing, because, as everyone says, reading a large chunk of data into a buffer and then working with it is faster than doing lots of small reads. My code is certainly much faster now, but after some profiling, memcpy appears to be taking up a lot of time.

The gist of my code is:

    ifstream file("some huge file");
    char buffer[0x1000000];
    for (yada yada) {
        int size = some arbitrary size, usually around a megabyte;
        file.read(buffer, size);
        // Do stuff with buffer
    }

I use Visual Studio 11, and after profiling my code, it says that ifstream::read() ends up calling xsgetn(), which copies from the stream's internal buffer to my buffer. This operation takes more than 80% of the time! In second place comes uflow(), which takes 10% of the time.

Is there any way around this copying? Can I somehow tell ifstream to buffer the amount I need directly into my buffer? Does the C-style FILE* also use such an internal buffer?

UPDATE: Since people are telling me to use cstdio, I ran a test.

EDIT: Unfortunately, the old code was full of bugs (it didn't even read the whole file!). You can see it here: http://pastebin.com/4dGEQ6S7

Here is my new benchmark:

    #include <windows.h>
    #include <cstdio>
    #include <cstdlib>
    #include <ctime>
    #include <fstream>
    #include <iostream>
    #include <string>
    using namespace std;

    const int MAX = 0x10000;
    char buf[MAX];
    string fpath = "largefile";

    int main() {
        {   // Test 1: ifstream::read
            clock_t start = clock();
            ifstream file(fpath, ios::binary);
            while (!file.eof()) {
                file.read(buf, MAX);
            }
            clock_t end = clock();
            cout << end - start << endl;
        }
        {   // Test 2: C-style fread
            clock_t start = clock();
            FILE* file = fopen(fpath.c_str(), "rb");
            setvbuf(file, NULL, _IOFBF, 1024);
            while (!feof(file)) {
                fread(buf, 0x1, MAX, file);
            }
            fclose(file);
            clock_t end = clock();
            cout << end - start << endl;
        }
        {   // Test 3: Win32 ReadFile
            clock_t start = clock();
            // OPEN_EXISTING: never create the file if it is missing
            HANDLE file = CreateFileA(fpath.c_str(), GENERIC_READ, FILE_SHARE_READ,
                                      NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
            while (true) {
                DWORD used;
                ReadFile(file, buf, MAX, &used, NULL);
                if (used < MAX) break;  // short read: end of file
            }
            CloseHandle(file);
            clock_t end = clock();
            cout << end - start << endl;
        }
        system("PAUSE");
    }

Time (clock() ticks, in the order of the tests above):
ifstream::read: 185
fread: 80
ReadFile: 78

Well, it looks like using C-style fread is faster than ifstream::read. Also, using the Windows ReadFile gives only a slight, negligible advantage (I looked at the code, and fread is basically a wrapper around ReadFile). It looks like I'll be switching to fread after all.

Man, it is confusing to write a benchmark that actually tests this stuff correctly.

CONCLUSION: Using <cstdio> is faster than <fstream>. fstream is slower because C++ streams have their own internal buffer. This results in an extra copy whenever you read or write, and that copying accounts for all of the extra time taken by fstream. Even more shocking is that the extra time is longer than the time taken to actually read the file.

+8
3 answers

If you want to speed up file I/O, I suggest you use good ol' <cstdio>, because it can outperform the C++ streams by a wide margin.
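For illustration, here is a minimal sketch of a chunked read loop with fread that checks return values instead of polling feof(). The helper name count_bytes_fread and its parameters are hypothetical and not taken from the question; real code would do its processing where the comment indicates.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical helper: reads the file at `path` in chunks of `chunk_size`
// bytes and returns the total number of bytes read, or -1 if it cannot be opened.
long long count_bytes_fread(const char* path, std::size_t chunk_size) {
    assert(chunk_size > 0);
    std::FILE* file = std::fopen(path, "rb");
    if (!file) return -1;

    std::vector<char> buffer(chunk_size);
    long long total = 0;
    std::size_t got;
    // fread returns the number of bytes actually read; 0 means EOF or error.
    while ((got = std::fread(buffer.data(), 1, buffer.size(), file)) > 0) {
        total += static_cast<long long>(got);
        // Process buffer[0 .. got) here.
    }
    std::fclose(file);
    return total;
}
```

Driving the loop from fread's return value avoids one extra failed read at the end of the file and also stops cleanly on a read error.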

+3

Can I somehow tell ifstream to buffer the size I need into my buffer?

Yes, this is what pubsetbuf() is for.
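A minimal sketch of how that could look, assuming the buffer is installed before the file is opened (the effect of pubsetbuf() on a file stream is largely implementation-defined, and some standard libraries ignore it after open()). The helper name count_bytes_with_pubsetbuf and its parameters are hypothetical. Note that this only controls which buffer the stream uses internally; whether it removes the copy into the caller's buffer depends on the implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <vector>

// Hypothetical helper: reads the file through an ifstream whose internal
// buffer has been replaced with one of `stream_buf_size` bytes.
// Returns the total number of bytes read, or -1 if the file cannot be opened.
long long count_bytes_with_pubsetbuf(const char* path,
                                     std::size_t stream_buf_size,
                                     std::size_t chunk_size) {
    assert(stream_buf_size > 0 && chunk_size > 0);
    // Declared before the stream so that it outlives the stream.
    std::vector<char> stream_buf(stream_buf_size);

    std::ifstream file;
    // Install the buffer before open(); later calls may be ignored.
    file.rdbuf()->pubsetbuf(stream_buf.data(),
                            static_cast<std::streamsize>(stream_buf.size()));
    file.open(path, std::ios::binary);
    if (!file) return -1;

    std::vector<char> chunk(chunk_size);
    long long total = 0;
    // read() fails on the final short chunk, but gcount() still reports its size.
    while (file.read(chunk.data(), static_cast<std::streamsize>(chunk.size())) ||
           file.gcount() > 0) {
        total += file.gcount();
        // Process chunk[0 .. gcount()) here.
    }
    return total;
}
```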

But if you are concerned about the copying while reading a file, consider memory mapping as well; Boost has a portable implementation.

+5

It has been shown several times that the fastest way to read data on Linux systems is mmap(). I don't know about Windows. In any case, it avoids this extra buffering.
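As a rough sketch of the mmap() approach on a POSIX system (it will not compile with Visual Studio), the file's pages are mapped into the address space and read in place, so no user-space copy into a separate buffer is needed. The helper name sum_bytes_mmap is hypothetical, and summing the bytes merely stands in for real processing.

```cpp
#include <cassert>
#include <cstddef>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical helper: maps the file read-only and sums its bytes in place.
// Returns -1 on failure.
long long sum_bytes_mmap(const char* path) {
    assert(path != nullptr);
    int fd = open(path, O_RDONLY);
    if (fd < 0) return -1;

    struct stat st;
    if (fstat(fd, &st) != 0) { close(fd); return -1; }
    if (st.st_size == 0) { close(fd); return 0; }  // mmap rejects a zero length

    std::size_t len = static_cast<std::size_t>(st.st_size);
    void* addr = mmap(nullptr, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);  // the mapping remains valid after the descriptor is closed
    if (addr == MAP_FAILED) return -1;

    const unsigned char* data = static_cast<const unsigned char*>(addr);
    long long sum = 0;
    for (std::size_t i = 0; i < len; ++i) {
        sum += data[i];  // reads come straight from the mapped pages
    }
    munmap(addr, len);
    return sum;
}
```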

fopen(), fread(), and fwrite() (FILE*) are somewhat higher-level and may use a buffer of their own, while the open(), read(), and write() functions are low-level, and the only buffering you may get comes from the OS kernel.
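To make the low-level variant concrete, here is a minimal POSIX sketch in which read() fills the caller's buffer directly, with no library-level buffer in between. The helper name count_bytes_posix_read is hypothetical, and for brevity the sketch does not retry reads interrupted by a signal (EINTR).

```cpp
#include <cassert>
#include <cstddef>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>
#include <vector>

// Hypothetical helper: reads the file with open()/read() in chunks of
// `chunk_size` bytes. Returns the total number of bytes read, or -1 on error.
long long count_bytes_posix_read(const char* path, std::size_t chunk_size) {
    assert(chunk_size > 0);
    int fd = open(path, O_RDONLY);
    if (fd < 0) return -1;

    std::vector<char> buffer(chunk_size);
    long long total = 0;
    ssize_t got;
    // read() returns the byte count, 0 at end of file, or -1 on error.
    while ((got = read(fd, buffer.data(), buffer.size())) > 0) {
        total += got;
        // Process buffer[0 .. got) here.
    }
    close(fd);
    return got < 0 ? -1 : total;
}
```

Because every call is a system call, small chunk sizes get expensive quickly; a large buffer keeps the number of calls low.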

+1