Background:
I use Google protobuf, and I would like to read/write a few gigabytes of protobuf-marshalled data to a file using C++. Since it is recommended to keep each protobuf message under 1 MB, I decided that a binary stream written to the file as illustrated below would work. Each offset field holds the number of bytes up to the next offset, i.e. the size of the blob that follows, until the end of the file is reached. That way each protobuf stays under 1 MB, and I can concatenate them to my heart's content.
[int32 offset] [protobuf blob 1] [int32 offset] [protobuf blob 2] ... [eof]
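For concreteness, the write side of this format looks roughly like the sketch below; example::MyMessage and example.pb.h are placeholders for my generated type, and the real code is in glob.cpp:

    #include <cassert>
    #include <cstdint>
    #include <fstream>
    #include <string>
    #include "example.pb.h"  // placeholder for the generated protobuf header

    // Append one length-prefixed record: [int32 offset][protobuf blob].
    void append_blob(std::ofstream& out, const example::MyMessage& msg) {
        std::string blob;
        msg.SerializeToString(&blob);
        assert(blob.size() < (1u << 20));  // keep each blob under ~1 MB
        std::int32_t size = static_cast<std::int32_t>(blob.size());
        out.write(reinterpret_cast<const char*>(&size), sizeof(size));
        out.write(blob.data(), blob.size());
    }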
I have a working implementation on GitHub:
src/glob.hpp
src/glob.cpp
test/readglob.cpp
test/writeglob.cpp
But I feel like I've written bad code, and I'd be grateful for tips on how to improve it. Thus,
Questions:
- I use reinterpret_cast<char*> to read/write 32-bit integers to and from the binary fstream. Since I am using protobuf, I am making the assumption that all machines are little-endian. I also assert that int really is 4 bytes. Is there a better way to read/write a 32-bit integer to a binary fstream given these two limiting assumptions? (See the first sketch after this list.)
- When reading from the fstream, I create a temporary fixed-length char buffer and then pass that buffer to the protobuf library for decoding with ParseFromArray, since ParseFromIstream would consume the entire stream. I would much rather just tell the library to read at most the next N bytes from the fstream, but there doesn't seem to be such functionality in protobuf. What would be the most idiomatic way to hand a function no more than N bytes of an fstream? Or is my design backwards enough that I should consider a completely different approach? (See the second sketch after this list.)
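On the first question, one direction (not necessarily the best one) is to pick a fixed byte order for the prefix and assemble it with shifts, so that neither the host's endianness nor sizeof(int) matters; a minimal sketch with my own write_u32/read_u32 names:

    #include <cstdint>
    #include <istream>
    #include <ostream>

    // Encode/decode the 32-bit length in a fixed little-endian byte order,
    // independent of host endianness, using shifts instead of reinterpret_cast.
    // std::uint32_t also removes the "int is 4 bytes" assertion.
    void write_u32(std::ostream& out, std::uint32_t value) {
        char bytes[4] = {
            static_cast<char>(value & 0xFF),
            static_cast<char>((value >> 8) & 0xFF),
            static_cast<char>((value >> 16) & 0xFF),
            static_cast<char>((value >> 24) & 0xFF),
        };
        out.write(bytes, 4);
    }

    bool read_u32(std::istream& in, std::uint32_t& value) {
        unsigned char bytes[4];
        if (!in.read(reinterpret_cast<char*>(bytes), 4)) return false;
        value = static_cast<std::uint32_t>(bytes[0])
              | (static_cast<std::uint32_t>(bytes[1]) << 8)
              | (static_cast<std::uint32_t>(bytes[2]) << 16)
              | (static_cast<std::uint32_t>(bytes[3]) << 24);
        return true;
    }

Protobuf itself also ships CodedOutputStream::WriteLittleEndian32 / CodedInputStream::ReadLittleEndian32, which do the same job if pulling in the io headers is acceptable.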
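On the second question, unless I am missing something, the lower-level google::protobuf::io::CodedInputStream can be told to stop after N bytes via PushLimit/PopLimit, which would remove the temporary buffer entirely. A rough sketch of a read loop under that assumption (example::MyMessage is again a placeholder):

    #include <cstdint>
    #include <fstream>
    #include <google/protobuf/io/coded_stream.h>
    #include <google/protobuf/io/zero_copy_stream_impl.h>
    #include "example.pb.h"  // placeholder for the generated protobuf header

    void read_all(std::ifstream& in) {
        namespace pbio = google::protobuf::io;
        pbio::IstreamInputStream raw(&in);  // one long-lived stream over the file

        while (true) {
            // A fresh CodedInputStream per message sidesteps its overall byte limit;
            // its destructor backs any unread buffered bytes up into `raw`.
            pbio::CodedInputStream coded(&raw);
            std::uint32_t size = 0;
            if (!coded.ReadLittleEndian32(&size)) break;   // clean EOF (or error)

            pbio::CodedInputStream::Limit limit =
                coded.PushLimit(static_cast<int>(size));   // "no more than N bytes"
            example::MyMessage msg;
            if (!msg.ParseFromCodedStream(&coded)) break;  // malformed blob
            coded.PopLimit(limit);
            // ... use msg ...
        }
    }

This assumes the prefix is written little-endian (e.g. with CodedOutputStream::WriteLittleEndian32 or the write_u32 sketch above), so that reader and writer agree on byte order.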
Edit:
- @codymanix: I switched to char since istream::read requires a char array, if I am not mistaken. I also do not use the extraction operator >>, since I have read that it is bad form for binary streams. Or is that last bit bogus advice?
- @Martin York: Dropped new/delete in favor of std::vector<char>; glob.cpp is now updated (rough sketch of the updated read path below). Thanks!
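For reference, the buffer-based read path after these edits looks roughly like this (example::MyMessage remains a placeholder; the real code is in glob.cpp):

    #include <cstdint>
    #include <fstream>
    #include <vector>
    #include "example.pb.h"  // placeholder for the generated protobuf header

    // Read one blob into a std::vector<char> (instead of new[]/delete[])
    // and hand it to ParseFromArray; istream::read wants a char*.
    bool read_blob(std::ifstream& in, example::MyMessage& msg) {
        std::int32_t size = 0;
        if (!in.read(reinterpret_cast<char*>(&size), sizeof(size))) return false;
        std::vector<char> buf(size);
        if (!in.read(buf.data(), buf.size())) return false;
        return msg.ParseFromArray(buf.data(), static_cast<int>(buf.size()));
    }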