Equivalent of a Python generator in C++ for buffered reads

Guido van Rossum demonstrates the simplicity of Python in this article, using the following function to buffer reads of a file of unknown length:

    def intsfromfile(f):
        while True:
            a = array.array('i')
            a.fromstring(f.read(4000))
            if not a:
                break
            for x in a:
                yield x

I need to do the same thing in C++ for speed reasons! I have many files containing sorted lists of unsigned 64-bit integers that I need to merge. I found this nice code snippet for merging vectors.

I am stuck on how to make an ifstream for a file of unknown length present itself as a vector that I can happily iterate over until the end of the file is reached. Any suggestions? Am I barking up the right tree with istreambuf_iterator?

c++ python algorithm file io
Jan 13 2018-11-11T00:
1 answer

To disguise an ifstream (or really, any input stream) as something that acts like an iterator, use the istream_iterator or istreambuf_iterator class template. The former is useful for files where formatting matters. For example, a file full of whitespace-separated integers can be passed to the iterator-range constructor of vector as follows:

    #include <fstream>
    #include <vector>
    #include <iterator> // needed for istream_iterator

    using namespace std;

    int main(int argc, char** argv)
    {
        ifstream infile("my-file.txt");

        // It isn't customary to declare these as standalone variables,
        // but see below for why it's necessary when initializing
        // containers.
        istream_iterator<int> infile_begin(infile);
        istream_iterator<int> infile_end;

        vector<int> my_ints(infile_begin, infile_end);

        // You can also do stuff with the istream_iterator objects directly.
        // Careful! If you run this program as is, this won't work because we
        // already used up the input stream when filling the vector.
        int total = 0;
        while (infile_begin != infile_end) {
            total += *infile_begin;
            ++infile_begin;
        }

        return 0;
    }

istreambuf_iterator is used to read files one character at a time, disregarding input formatting. That is, it returns every character, including spaces, newlines, and so on. Depending on your application, this may be more appropriate.
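To make the difference concrete, here is a minimal sketch that runs both iterator types over the same text. The helper names read_formatted and read_raw are made up for this illustration, and an istringstream stands in for a file:

```cpp
#include <iterator>
#include <sstream>
#include <string>

// Formatted extraction (operator>>): whitespace is skipped by default.
std::string read_formatted(const std::string& text) {
    std::istringstream in(text);
    std::istream_iterator<char> first(in), last;
    return std::string(first, last);
}

// Straight from the stream buffer: every character comes through untouched.
std::string read_raw(const std::string& text) {
    std::istringstream in(text);
    std::istreambuf_iterator<char> first(in), last;
    return std::string(first, last);
}
```

Given the input "a b\nc", read_formatted drops the space and the newline, while read_raw returns the text unchanged. For binary data you want the second behaviour, since a byte that happens to look like whitespace must not be skipped.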

Note: Scott Meyers explains in Effective STL why the separate variable declarations for the istream_iterator objects are needed above. Normally, you would write something like this:

    ifstream infile("my-file.txt");
    vector<int> my_ints(istream_iterator<int>(infile), istream_iterator<int>());

However, C++ actually parses the second line in an incredibly bizarre way (Meyers calls this the "most vexing parse"). It sees a declaration of a function named my_ints that takes two parameters and returns a vector<int>. The first parameter is of type istream_iterator<int> and is named infile (the parentheses are ignored). The second parameter is an unnamed function pointer that takes zero arguments (because of the parentheses) and returns an object of type istream_iterator<int>.

Pretty cool, but also pretty aggravating if you're not watching out for it.
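If you would rather not declare the iterators separately, two common workarounds are sketched below: an extra pair of parentheses around the first argument, or C++11 brace initialization of the temporaries. The function names read_ints and read_ints_braced are hypothetical, chosen only for this example:

```cpp
#include <istream>
#include <iterator>
#include <sstream>
#include <vector>

// Reads every whitespace-separated int from a stream.
std::vector<int> read_ints(std::istream& in) {
    // The extra parentheses around the first argument make it an expression,
    // so the line can no longer be parsed as a function declaration.
    std::vector<int> values((std::istream_iterator<int>(in)),
                            std::istream_iterator<int>());
    return values;
}

// Same result using C++11 braces to construct the iterator temporaries.
std::vector<int> read_ints_braced(std::istream& in) {
    std::vector<int> values(std::istream_iterator<int>{in},
                            std::istream_iterator<int>{});
    return values;
}
```

Both functions build the vector in a single statement; which form you pick is mostly a matter of taste and of which language standard your compiler supports.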




EDIT

Here is an example that uses istreambuf_iterator to read in a file of 64-bit numbers laid out end to end:

    #include <fstream>
    #include <vector>
    #include <algorithm>
    #include <iterator>

    using namespace std;

    int main(int argc, char** argv)
    {
        // Open in binary mode so no newline translation touches the raw bytes.
        ifstream input("my-file.txt", ios::binary);
        istreambuf_iterator<char> input_begin(input);
        istreambuf_iterator<char> input_end;

        // Fill a char vector with input file contents:
        vector<char> char_input(input_begin, input_end);
        input.close();

        // Convert it to an array of unsigned long with a cast.
        // This assumes unsigned long is 64 bits wide on your platform.
        // data() (C++11) is used instead of &char_input[0] so that an
        // empty file does not index into an empty vector.
        unsigned long* converted = reinterpret_cast<unsigned long*>(char_input.data());
        size_t num_long_elements = char_input.size() * sizeof(char) / sizeof(unsigned long);

        // Put that information into a vector:
        vector<unsigned long> long_input(converted, converted + num_long_elements);

        return 0;
    }

Now, I personally don't like this solution (it uses reinterpret_cast and exposes the char_input array), but I'm not familiar enough with istreambuf_iterator to comfortably use one templated over 64-bit characters, which would make this much easier.

Jan 13 2018-11-23T00: