C++ - structure alignment and STL vectors

I have a legacy data structure that is 672 bytes long. These structures are stored sequentially in a file, and I need to read them in.

Although I can read them one by one, it would be nice to do this:

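    // I know in advance how many structs to read in
    vector<MyStruct> bunchOfStructs;
    bunchOfStructs.resize(numberOfStructs);

    ifstream ifs;
    ifs.open("file.dat", ios::binary);  // raw bytes, no newline translation
    if (ifs) {
        // read() takes a char*, so the struct pointer has to be cast
        ifs.read(reinterpret_cast<char*>(&bunchOfStructs[0]),
                 sizeof(MyStruct) * numberOfStructs);
    }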

This works, but I think it only works because the size of the data structure happens to be evenly divisible by my compiler's structure alignment padding. I suspect it will break on another compiler or platform.

An alternative would be to use a for loop to read in each structure, one at a time.

Question: When do I need to worry about data alignment? Does the dynamically allocated memory in a vector use padding, or does the STL guarantee that the elements are contiguous?

+7
c++ stl
5 answers

The standard requires that you be able to create an array of a struct type. When you do so, the array is required to be contiguous. That means that whatever size is allocated for the struct, it has to be one that allows you to create an array of them. To ensure that, the compiler can allocate extra space inside the structure, but it cannot require any extra space between the structs.
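
The guarantee above can be checked directly. The following is only a sketch: `Record` and `is_contiguous` are hypothetical names made up for illustration, and the member sizes noted in the comments are typical rather than guaranteed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical record type; the mixed member sizes usually force padding.
struct Record {
    char tag;    // 1 byte, typically followed by padding
    int value;   // typically aligned to 4 bytes
    char flag;   // trailing padding usually follows so arrays stay aligned
};

// Returns true if every element sits exactly sizeof(Record) bytes
// after the previous one, i.e. there is no gap between elements.
bool is_contiguous(const std::vector<Record>& v) {
    const char* base = reinterpret_cast<const char*>(v.data());
    for (std::size_t i = 0; i < v.size(); ++i) {
        const char* elem = reinterpret_cast<const char*>(&v[i]);
        if (elem != base + i * sizeof(Record)) return false;
    }
    return true;
}
```

Any padding shows up inside `sizeof(Record)`, never between elements, which is why a single bulk read into the vector can work when the file was written with the same layout.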

The space for the data in a vector is (normally) allocated with ::operator new (via an Allocator class), and ::operator new is required to allocate space that is properly aligned to store any type.

You could supply your own Allocator and/or overload ::operator new, but if you do, your version is still required to meet the same requirements, so nothing changes in this respect.

In other words, exactly what you want is required to work, as long as the data in the file was created in essentially the same way as you are trying to read it back in. If it was created on another machine or with another compiler (or even the same compiler with different flags), you have a number of potential problems: you may get differences in endianness, padding within the structure, and so on.

Edit: Given that you do not know whether the structs were written in the format expected by the compiler, you not only need to read the structs one at a time, you really need to read the items in each struct one at a time, put them into a temporary struct, and finally add that filled-in struct to your collection.

Fortunately, you can overload operator>> to automate most of this. It does not improve speed (for example), but it can keep your code cleaner:

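    #include <algorithm>
    #include <cassert>
    #include <fstream>
    #include <iterator>
    #include <vector>

    struct whatever {
        int x, y, z;
        char stuff[672 - 3 * sizeof(int)];

        // Formatted extraction parses x, y and z as text; use is.read() on
        // each member instead if the file stores them as raw binary.
        friend std::istream &operator>>(std::istream &is, whatever &w) {
            is >> w.x >> w.y >> w.z;
            return is.read(w.stuff, sizeof(w.stuff));
        }
    };

    int main(int argc, char **argv) {
        std::vector<whatever> data;

        assert(argc > 1);
        std::ifstream infile(argv[1]);

        std::copy(std::istream_iterator<whatever>(infile),
                  std::istream_iterator<whatever>(),
                  std::back_inserter(data));
        return 0;
    }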
+4

For your existing file, your best bet is to work out its file format and read each field individually, reading and discarding any alignment bytes.

It is best not to make any assumptions about structure alignment.
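
As a rough sketch of reading field by field and discarding alignment bytes: the layout below (a 1-byte id, 3 padding bytes, then a 4-byte little-endian count) is purely an assumption for illustration, not the asker's real format, and `Header` and `read_header` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical in-memory form of the record; its layout does not need
// to match the file because each field is read separately.
struct Header {
    std::uint8_t id;
    std::uint32_t count;
};

// Reads one Header from the assumed on-disk layout:
//   offset 0: id (1 byte), offset 1: 3 padding bytes,
//   offset 4: count (4 bytes, least-significant byte first).
bool read_header(std::istream& in, Header& h) {
    char id;
    unsigned char raw[4];
    if (!in.get(id)) return false;
    in.ignore(3);  // discard the alignment bytes
    if (!in.read(reinterpret_cast<char*>(raw), sizeof raw)) return false;
    h.id = static_cast<std::uint8_t>(id);
    h.count = static_cast<std::uint32_t>(raw[0])
            | static_cast<std::uint32_t>(raw[1]) << 8
            | static_cast<std::uint32_t>(raw[2]) << 16
            | static_cast<std::uint32_t>(raw[3]) << 24;
    return true;
}
```

Because nothing here depends on `sizeof(Header)` or on the host byte order, the same code should read the file identically on any compiler, provided the assumed layout matches the file.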

To save new data to a file, you can use something like Boost.Serialization.

+2

In your case, you need to worry about alignment whenever it can change the layout of your struct. There are two ways to make your code more portable.

First, most compilers have extended attributes or preprocessor directives that let you pack a struct into the minimum space. This option can misalign some of the fields within the struct, which may reduce performance, but it guarantees that the struct is laid out the same way on any machine you build it for. Check your compiler's documentation for #pragma pack(). In GCC, you can use __attribute__((__packed__)).
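
A minimal sketch of the packing approach, assuming a compiler that accepts the `#pragma pack(push, 1)` / `#pragma pack(pop)` form (GCC, Clang and MSVC do). `PackedRecord` and its members are hypothetical and only illustrate the effect on layout.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical record, packed so no padding is inserted between members.
#pragma pack(push, 1)
struct PackedRecord {
    std::uint8_t  kind;
    std::uint32_t length;    // may be misaligned, so access can be slower
    std::uint16_t checksum;
};
#pragma pack(pop)

// With packing in effect, the size is just the sum of the member sizes.
static_assert(sizeof(PackedRecord) == 7, "unexpected padding in PackedRecord");
static_assert(offsetof(PackedRecord, length) == 1, "length should follow kind directly");
static_assert(offsetof(PackedRecord, checksum) == 5, "checksum should follow length directly");
```

The `static_assert` lines turn a layout mismatch into a compile error rather than silently misreading the file. Packing fixes padding only; byte order still has to match.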

Secondly, you can add explicit padding to your struct. This option lets you keep the performance properties of the original struct while making it unambiguous how the struct is laid out. For example:

    struct s {
        u_int8_t  field1;
        u_int8_t  pad0[3];
        u_int16_t field2;
        u_int8_t  pad1[2];
        u_int32_t field3;
    };
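
One way to confirm that the explicit padding really pins down the layout is to assert the offsets at compile time. This sketch restates the struct with the standard `<cstdint>` types in place of `u_int8_t` and friends (a substitution made here for portability), and the expected offsets follow from the declared field sizes.

```cpp
#include <cstddef>
#include <cstdint>

struct s {
    std::uint8_t  field1;
    std::uint8_t  pad0[3];   // explicit padding up to offset 4
    std::uint16_t field2;
    std::uint8_t  pad1[2];   // explicit padding up to offset 8
    std::uint32_t field3;
};

// If a compiler ever adds padding of its own, these fail at compile time.
static_assert(offsetof(s, field2) == 4, "field2 is not at offset 4");
static_assert(offsetof(s, field3) == 8, "field3 is not at offset 8");
static_assert(sizeof(s) == 12, "struct s has unexpected extra padding");
```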
+2

More than alignment, you need to worry about endianness. The STL guarantees that storage in a vector is the same as in an array, but the integer fields in the structure itself will be stored in different formats on, say, x86 and RISC machines.
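
A common way to sidestep the endianness problem is to assemble integers from individual bytes instead of copying them into an `int` directly. The helpers below are a sketch; `load_le32` and `load_be32` are hypothetical names, and which of the two applies depends on how the file was written.

```cpp
#include <cassert>
#include <cstdint>

// Decodes a 32-bit unsigned integer stored least-significant byte first.
std::uint32_t load_le32(const unsigned char* p) {
    return static_cast<std::uint32_t>(p[0])
         | static_cast<std::uint32_t>(p[1]) << 8
         | static_cast<std::uint32_t>(p[2]) << 16
         | static_cast<std::uint32_t>(p[3]) << 24;
}

// Decodes a 32-bit unsigned integer stored most-significant byte first.
std::uint32_t load_be32(const unsigned char* p) {
    return static_cast<std::uint32_t>(p[0]) << 24
         | static_cast<std::uint32_t>(p[1]) << 16
         | static_cast<std::uint32_t>(p[2]) << 8
         | static_cast<std::uint32_t>(p[3]);
}
```

Both functions give the same result regardless of the byte order of the machine running them, because shifts operate on values rather than on memory layout.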

As for alignment, google for #pragma pack(1).

+1

If you write OO code that requires knowledge of the inner workings of a class, you are doing it wrong. You should not take into account the inner workings of the class; you should only assume that the methods and properties work the same on any platform / compiler.

You should probably implement a class that emulates the functionality of a vector (perhaps by subclassing vector). Acting as an implementation of the "proxy pattern", it could load only those structures that the caller actually requests. This would also let you deal with any other issues you run into at the same time. This approach should work on any platform or compiler.

0