If the conversion is the bottleneck (which is quite possible), you should start by trying the different possibilities in the standard. Logically, one would expect them to perform very similarly, but in practice, they don't always:
You've already determined that std::ifstream is too slow.
Converting your memory mapped data to an std::istringstream is almost certainly not a good solution; you'll first have to create a string, which will copy all of the data.
Writing your own streambuf to read directly from the memory, without copying (or using the deprecated std::istrstream), might be a solution, although if the problem really is the conversion, this still uses the same conversion routines.
You can always try fscanf, or scanf on your memory mapped stream. Depending on the implementation, they might be faster than the various istream implementations.
Probably faster than any of these is to use strtod. There is no need to tokenize for this: strtod skips leading white space (including '\n'), and has an out parameter where it puts the address of the first character not read. The end condition is a bit tricky; your loop should probably look something like this:
    char* begin;    // Set to point to the mmap'ed data...
                    // You'll also have to arrange for a '\0'
                    // to follow the data. This is probably
                    // the most difficult issue.
    char* end;
    errno = 0;
    double tmp = strtod( begin, &end );
    while ( errno == 0 && end != begin ) {
        // do whatever with tmp...
        begin = end;
        tmp = strtod( begin, &end );
    }
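For the custom streambuf option in the list above, a minimal sketch might look like the following. The names MemoryStreambuf and readDoubles are made up for illustration, and the code assumes the mapped region is only read, never written. It avoids the copy that std::istringstream needs, but as noted it still goes through the same istream conversion routines, so it will not help if the conversion itself is the bottleneck.

```cpp
#include <cassert>
#include <istream>
#include <streambuf>
#include <vector>

// Hypothetical helper (not from the original answer): a read-only streambuf
// that exposes an existing memory range as its get area, without copying it.
class MemoryStreambuf : public std::streambuf {
public:
    MemoryStreambuf(char const* begin, char const* end) {
        assert(begin <= end);
        // setg() wants non-const pointers; this class defines no put area
        // and does not override pbackfail, so nothing writes through them.
        char* b = const_cast<char*>(begin);
        setg(b, b, const_cast<char*>(end));
    }
};

// Read every whitespace-separated double in [begin, end) using operator>>.
std::vector<double> readDoubles(char const* begin, char const* end) {
    MemoryStreambuf buffer(begin, end);
    std::istream in(&buffer);
    std::vector<double> values;
    double tmp;
    while (in >> tmp) {
        values.push_back(tmp);
    }
    return values;
}
```

Unlike the strtod loop, this does not need a trailing '\0' after the data, since the end of the range is passed explicitly.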
If none of these are fast enough, you'll have to consider the actual data. It probably has some sort of additional constraints, which means that you can potentially write a conversion routine which is faster than the more general ones; e.g. strtod has to handle both fixed and scientific notation, and it has to be 100% accurate even if there are 17 significant digits. It also has to be locale specific. All of this is added complexity, which means added code to execute. But beware: writing an efficient and correct conversion routine, even for a restricted set of input, is non-trivial; you really do have to know what you are doing.
EDIT:
Just out of curiosity, I ran some tests. In addition to the solutions above, I wrote a simple custom converter, which only handles fixed point (no scientific notation), with at most five digits after the decimal point, and where the value before the decimal point must fit in an int:
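    double
    convert( char const* source, char const** endPtr )
    {
        char* end;
        // Integer part; must fit in an int.
        int left = strtol( source, &end, 10 );
        double results = left;
        if ( *end == '.' ) {
            char* start = end + 1;
            // Fractional digits, scaled by how many were read (at most five).
            // Note: the fraction is always added, so this assumes
            // non-negative input, as in the test data.
            int right = strtol( start, &end, 10 );
            static double const fracMult[]
                = { 0.0, 0.1, 0.01, 0.001, 0.0001, 0.00001 };
            results += right * fracMult[ end - start ];
        }
        if ( endPtr != nullptr ) {
            *endPtr = end;
        }
        return results;
    }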
(If you actually use this, you should definitely add some error handling. This was just knocked together quickly for experimental purposes, to read the test file I had generated, and nothing else.)
The interface deliberately mirrors that of strtod, to simplify coding.
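As a rough idea of what the missing error handling could look like, here is a hedged sketch. The name convertChecked and the helper fail are invented for this example, and the failure convention (return 0.0 and leave the end pointer at the start of the input, as strtod does when nothing is converted) is an assumption about what a caller would want. It parses the digits by hand instead of calling strtol, so that a sign or white space after the decimal point is not silently accepted, and it also applies a leading sign to the whole value.

```cpp
#include <cassert>
#include <cctype>
#include <climits>

namespace {
// Report "no conversion" the way strtod does: end pointer == start of input.
double fail(char const* source, char const** endPtr) {
    if (endPtr != nullptr) *endPtr = source;
    return 0.0;
}
}

// Hypothetical checked variant: optional sign, at least one integer digit
// (value must fit in an int), optional '.' followed by at most five digits.
double convertChecked(char const* source, char const** endPtr) {
    assert(source != nullptr);
    static double const fracMult[] = {1.0, 0.1, 0.01, 0.001, 0.0001, 0.00001};
    char const* p = source;
    while (std::isspace(static_cast<unsigned char>(*p))) ++p;
    bool const negative = (*p == '-');
    if (*p == '-' || *p == '+') ++p;

    char const* const intStart = p;
    long long left = 0;
    for (; std::isdigit(static_cast<unsigned char>(*p)); ++p) {
        left = left * 10 + (*p - '0');
        if (left > INT_MAX) return fail(source, endPtr);   // integer part too large
    }
    if (p == intStart) return fail(source, endPtr);         // no digits at all

    double result = static_cast<double>(left);
    if (*p == '.') {
        char const* const fracStart = ++p;
        long right = 0;
        for (; std::isdigit(static_cast<unsigned char>(*p)); ++p) {
            if (p - fracStart >= 5) return fail(source, endPtr);  // sixth digit
            right = right * 10 + (*p - '0');
        }
        result += right * fracMult[p - fracStart];
    }
    if (endPtr != nullptr) *endPtr = p;
    return negative ? -result : result;
}
```

The extra checks cost a few comparisons per character, so some slowdown relative to the unchecked version is to be expected; how much would have to be measured.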
I ran the benchmarks in two environments (on different machines, so the absolute values of any times aren't relevant). I got the following results:
Under Windows 7, compiled with VC 11 (/O2):
    Testing Using fstream directly (5 iterations)...
        6.3528e+006 microseconds per iteration
    Testing Using fscan directly (5 iterations)...
        685800 microseconds per iteration
    Testing Using strtod (5 iterations)...
        597000 microseconds per iteration
    Testing Using manual (5 iterations)...
        269600 microseconds per iteration
Under Linux 2.6.18, compiled with g++ 4.4.2 (-O2, IIRC):
    Testing Using fstream directly (5 iterations)...
        784000 microseconds per iteration
    Testing Using fscanf directly (5 iterations)...
        526000 microseconds per iteration
    Testing Using strtod (5 iterations)...
        382000 microseconds per iteration
    Testing Using strtof (5 iterations)...
        360000 microseconds per iteration
    Testing Using manual (5 iterations)...
        186000 microseconds per iteration
In all cases, I'm reading 554000 lines, each with 3 randomly generated floating-point values in the range [0...10000).
The most striking thing is the enormous difference between fstream and fscanf under Windows (and the relatively small difference between fscanf and strtod). The second thing is just how much the simple custom conversion function gains, on both platforms. The necessary error handling would slow it down a little, but the difference is still significant. I expected some improvement, since it doesn't handle a lot of things the standard conversion routines do (scientific format, very, very small numbers, Inf and NaN, i18n, etc.), but not this much.
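For anyone wanting to repeat this kind of comparison, a timing harness could be sketched roughly as follows. This is not the harness that produced the numbers above; the names microsecondsPerIteration and sumWithStrtod are made up, and the workload shown (summing the values of a '\0'-terminated buffer with strtod) is only a stand-in for whichever parser is being measured.

```cpp
#include <cassert>
#include <chrono>
#include <cstdlib>

// Hypothetical harness: run parse() the given number of times and return
// the average wall-clock time per run, in microseconds.
template <typename Parser>
double microsecondsPerIteration(Parser parse, int iterations) {
    assert(iterations > 0);
    auto const start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        parse();
    }
    auto const elapsed = std::chrono::steady_clock::now() - start;
    return std::chrono::duration<double, std::micro>(elapsed).count() / iterations;
}

// Example workload: sum every number in a '\0'-terminated buffer with strtod.
// The loop stops as soon as strtod converts nothing (end == text).
double sumWithStrtod(char const* text) {
    double sum = 0.0;
    char* end;
    for (double v = std::strtod(text, &end); end != text;
         v = std::strtod(text, &end)) {
        sum += v;
        text = end;
    }
    return sum;
}
```

A call might look like microsecondsPerIteration([&] { total = sumWithStrtod(data); }, 5), where data and total are the caller's own variables. Storing the result somewhere observable matters, since otherwise the optimizer may discard the work being timed.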