Using std::vector as a low-level buffer

The use case is a familiar one: calling read() directly into a std::vector, while accounting for reallocation.

The size of the input file is unknown, so the buffer is reallocated by doubling whenever the file turns out to be larger than the buffer. Here is my code:

    #include <vector>
    #include <fstream>
    #include <iostream>

    int main() {
        const size_t initSize = 1;
        std::vector<char> buf(initSize); // sizes buf to initSize, so &buf[0] below is valid
        std::ifstream ifile("D:\\Pictures\\input.jpg", std::ios_base::in | std::ios_base::binary);
        if (ifile) {
            size_t bufLen = 0;
            for (buf.reserve(1024); !ifile.eof(); buf.reserve(buf.capacity() << 1)) {
                std::cout << buf.capacity() << std::endl;
                ifile.read(&buf[0] + bufLen, buf.capacity() - bufLen);
                bufLen += ifile.gcount();
            }
            std::ofstream ofile("rebuild.jpg", std::ios_base::out | std::ios_base::binary);
            if (ofile) {
                ofile.write(&buf[0], bufLen);
            }
        }
    }

The program prints the vector capacities exactly as expected, and writes an output file of the same size as the input, BUT only the first initSize bytes match the input; everything after that offset is zeros...

Using &buf[bufLen] in read() is definitely undefined behavior, but &buf[0] + bufLen yields the correct address to write to, because the standard guarantees contiguous storage, right? (Assuming initSize != 0. Note that std::vector<char> buf(initSize); sizes buf to initSize. And yes, if initSize == 0, I get a fatal runtime error in my environment.) Did I miss something? Is this also UB? Does the standard say anything about using std::vector this way?

PS: Yes, I know I could determine the file size first and allocate a buffer of exactly that size, but in my project the input files can be expected to ALWAYS be smaller than a certain SIZE, so I can set initSize to SIZE, avoid the overhead (e.g., computing the file size), and rely on reallocation only as "exception handling". And yes, I know I could replace reserve() with resize() and capacity() with size() and make things work with a little overhead (zeroing the buffer on every resize), but I still want to avoid every unnecessary operation; just paranoid...

Update 1:

In fact, we can deduce from the standard that &buf[0] + bufLen yields the correct address; consider:

    std::vector<char> buf(128);
    buf.reserve(512);
    char* bufPtr0 = &buf[0], *bufPtrOutofRange = &buf[0] + 200;
    buf.resize(256);
    std::cout << "standard guarantees no reallocation" << std::endl;
    char* bufPtr1 = &buf[0], *bufInRange = &buf[200];
    if (bufPtr0 == bufPtr1)
        std::cout << "so bufPtr0 == bufPtr1" << std::endl;
    std::cout << "and 200 < buf.size(), standard guarantees bufInRange == bufPtr1 + 200" << std::endl;
    if (bufInRange == bufPtrOutofRange)
        std::cout << "finally we have: bufInRange == bufPtrOutofRange" << std::endl;

Output:

    standard guarantees no reallocation
    so bufPtr0 == bufPtr1
    and 200 < buf.size(), standard guarantees bufInRange == bufPtr1 + 200
    finally we have: bufInRange == bufPtrOutofRange

And here 200 can be replaced by any i with buf.size() <= i < buf.capacity(), and the same conclusion follows.

Update 2:

Yes, I missed something... But the problem is neither contiguity (see Update 1) nor the write to memory itself. Today I had time to investigate: the program computed the correct address and wrote the data into the reserved memory, but at the next reserve(), buf was reallocated and ONLY the elements in the range [0, buf.size()) were copied to the new storage. So that is the answer to the whole riddle...

Final note. If you do not need any reallocation after filling the buffer with data, you can indeed use reserve()/capacity() instead of resize()/size(); but if you do, use the latter.

Example:

    const size_t initSize = 32;
    std::vector<char> buf(initSize);
    buf.reserve(1024*100); // reserve enough space for file reading
    std::ifstream ifile("D:\\Pictures\\input.jpg", std::ios_base::in | std::ios_base::binary);
    if (ifile) {
        ifile.read(&buf[0], buf.capacity()); // ok. the whole file is read into buf
        std::ofstream ofile("rebuld.jpg", std::ios_base::out | std::ios_base::binary);
        if (ofile) {
            ofile.write(&buf[0], ifile.gcount()); // rebuld.jpg is identical to input.jpg
        }
    }
    buf.reserve(1024*200); // horror! probably always loses all data in buf past offset initSize

PS: I did not find any authoritative source (the standard, TC++PL, etc.) that clearly confirms or refutes the conclusion above. But on every implementation I have at hand (VC++, g++, ICC), the example above works fine.

And here is another example, quoted from "TC++PL, 4e", p. 1041, where the first line in the function uses reserve() rather than resize():

    void fill(istream& in, string& s, int max)
        // use s as target for low-level input (simplified)
    {
        s.reserve(max);      // make sure there is enough allocated space
        in.read(&s[0], max);
        const int n = in.gcount(); // number of characters read
        s.resize(n);
        s.shrink_to_fit();   // discard excess capacity
    }
1 answer

reserve does not actually add elements to the vector; it only guarantees that no reallocation will occur as the vector grows up to that capacity. Instead of reserve you should use resize, and then do a final resize once you know how many bytes you actually read.

Edit: All that reserve guarantees is that iterators and pointers are not invalidated as the vector grows up to capacity(). The contents of the reserved bytes are not guaranteed to be preserved unless they are within size().

