Reading a binary file into a structure (C++)

Question

I am having trouble reading a binary file into my structure correctly. The structure is as follows:

    struct Student
    {
        char name[25];
        int quiz1;
        int quiz2;
        int quiz3;
    };

This is 37 bytes (25 bytes from the char array and 4 bytes per integer). My .dat file is 185 bytes. It holds 5 students with 3 integer quiz scores each, so each student takes 37 bytes (37 * 5 = 185).

In text format, it looks something like this:

    Bart Simpson          75   65   70
    Ralph Wiggum          35   60   44
    Lisa Simpson         100   98   91
    Martin Prince         99   98   99
    Milhouse Van Houten   80   87   79

I can read each entry separately with this code:

    Student stud;

    fstream file;
    file.open("quizzes.dat", ios::in | ios::out | ios::binary);

    if (file.fail())
    {
        cout << "ERROR: Cannot open the file..." << endl;
        exit(0);
    }

    file.read(stud.name, sizeof(stud.name));
    file.read(reinterpret_cast<char *>(&stud.quiz1), sizeof(stud.quiz1));
    file.read(reinterpret_cast<char *>(&stud.quiz2), sizeof(stud.quiz2));
    file.read(reinterpret_cast<char *>(&stud.quiz3), sizeof(stud.quiz3));

    while(!file.eof())
    {
        cout << left
             << setw(25) << stud.name
             << setw(5) << stud.quiz1
             << setw(5) << stud.quiz2
             << setw(5) << stud.quiz3
             << endl;

        // Reading the next record
        file.read(stud.name, sizeof(stud.name));
        file.read(reinterpret_cast<char *>(&stud.quiz1), sizeof(stud.quiz1));
        file.read(reinterpret_cast<char *>(&stud.quiz2), sizeof(stud.quiz2));
        file.read(reinterpret_cast<char *>(&stud.quiz3), sizeof(stud.quiz3));
    }

And I get nice output, but I want to be able to read one whole structure at a time, not just the individual members of each structure. This code is what I believe is needed to accomplish that, but it does not work (I will show the output after it):

(Omitting the parts that stay the same, such as opening the file and the structure declaration.)

    file.read(reinterpret_cast<char *>(&stud), sizeof(stud));

    while(!file.eof())
    {
        cout << left
             << setw(25) << stud.name
             << setw(5) << stud.quiz1
             << setw(5) << stud.quiz2
             << setw(5) << stud.quiz3
             << endl;

        file.read(reinterpret_cast<char *>(&stud), sizeof(stud));
    }

OUTPUT:

    Bart Simpson             16640179201818317312
    ph Wiggum                288358417665884161394631027
    impson                   129184563217692391371917853806
    ince                     175193530917020655191851872800

The only part it did not mess up is the first name; after that everything goes downhill. I have tried everything and have no idea what is wrong. I even looked through the books I have and could not find anything. The examples there look like what I have, and they work, but for some odd reason mine doesn't. I did a file.get(ch) (ch being a char) at byte 25, and it returned K, which is ASCII for 75, the first quiz score, so everything is where it should be. It just isn't reading into my structure properly.

Any help would be greatly appreciated, I'm just stuck on this one.

EDIT: After receiving so much unexpected and excellent input from you all, I have decided to take your advice and stick with reading one member at a time. I made things cleaner and smaller by using functions. Thanks again for providing such quick and enlightening input. It is very much appreciated.

If you are interested in a workaround that most people do not recommend, scroll down to the 3rd answer by user 1654209. This workaround works flawlessly, but read through all the comments to see why it is not recommended.

+7

c++ file struct binary file-io

Bk Mar 21 '13 at 8:32

5 answers

Without seeing the code that writes the data, I assume that you write the data as you read it in the first example, each element one by one. Then each entry in the file will be 37 bytes.

However, since the compiler pads structures so that their members fall on convenient boundaries for optimization reasons, your structure is 40 bytes. Therefore, when you read the full structure in one call, you are actually reading 40 bytes at a time, which means your reads go out of sync with the actual records in the file.

You need to either rewrite the writing code so that it writes the complete structure at once, or use the first reading method, where you read one member at a time.
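The mismatch can be checked directly. The following is a minimal sketch, not code from the thread: the names kFileRecordBytes, paddingBytes and printLayout are made up for illustration. It compares the size of a record written member by member with what the compiler reports for the struct.

```cpp
#include <cassert>
#include <cstddef>   // offsetof, std::size_t
#include <ostream>

struct Student {
    char name[25];
    int quiz1;
    int quiz2;
    int quiz3;
};

// Bytes per record when the members are written one by one (no padding).
constexpr std::size_t kFileRecordBytes = sizeof(Student::name) + 3 * sizeof(int);

// Bytes the compiler added to the in-memory layout (0 if it added none).
std::size_t paddingBytes() {
    assert(sizeof(Student) >= kFileRecordBytes);  // a struct never shrinks below its members
    return sizeof(Student) - kFileRecordBytes;
}

// Prints the numbers that decide whether a whole-struct read can work.
void printLayout(std::ostream& os) {
    os << "record in file:  " << kFileRecordBytes << " bytes\n"
       << "sizeof(Student): " << sizeof(Student) << " bytes\n"
       << "offset of quiz1: " << offsetof(Student, quiz1) << '\n'
       << "padding:         " << paddingBytes() << " bytes\n";
}
```

Calling printLayout(std::cout) on a typical platform with 4-byte, 4-byte-aligned int should report a 37-byte record, a 40-byte struct and quiz1 at offset 28, but the exact numbers are up to the compiler.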

+4

Some programmer dude Mar 21 '13 at 8:46

A simple workaround is to pack your structure to 1-byte alignment.

Using GCC:

    struct __attribute__((packed)) Student
    {
        char name[25];
        int quiz1;
        int quiz2;
        int quiz3;
    };

Using MSVC:

    #pragma pack(push, 1) // set padding to 1 byte, saves previous value
    struct Student
    {
        char name[25];
        int quiz1;
        int quiz2;
        int quiz3;
    };
    #pragma pack(pop) // restore previous pack value

EDIT: as user ahans points out, #pragma pack has been supported by GCC since version 2.7.2.3 (released in 1997), so it seems safe to use #pragma pack as the only packing notation if you are targeting both MSVC and GCC.

+3

Oualid jabnoune Mar 21 '13 at 9:03

As you have already found out, the problem is padding. Also, as others have suggested, the proper way to solve this is to read each member individually, as you did in your example. I do not expect that to cost much more than reading the whole thing at once. However, if you still want to go ahead and read it in one go, you can tell the compiler to lay out the structure differently:

    #pragma pack(push, 1)
    struct Student
    {
        char name[25];
        int quiz1;
        int quiz2;
        int quiz3;
    };
    #pragma pack(pop)

With #pragma pack(push, 1) you tell the compiler to save the current packing value on an internal stack and to use a packing value of 1 from then on. That gives you 1-byte alignment, which in this case means no padding at all. With #pragma pack(pop) you tell the compiler to take the last value off the stack and use it from then on, restoring the behavior the compiler had before the definition of your struct.

While #pragma usually indicates non-portable, compiler-specific features, this one works with at least GCC and Microsoft Visual C++.
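If you go the packing route, it may be worth letting the compiler confirm that the padding is really gone. The sketch below is only an illustration under assumptions: PackedStudent, readRecord and makeRecord are hypothetical names, and makeRecord merely imitates a file written member by member so that the example needs no quizzes.dat.

```cpp
#include <cassert>
#include <cstring>
#include <initializer_list>
#include <istream>
#include <sstream>
#include <string>

#pragma pack(push, 1)
struct PackedStudent {
    char name[25];
    int quiz1;
    int quiz2;
    int quiz3;
};
#pragma pack(pop)

// Fails to compile if the pragma was ignored and padding crept back in.
static_assert(sizeof(PackedStudent) == 25 + 3 * sizeof(int),
              "PackedStudent must not contain padding");

// Reads one whole record; returns false at end of file or on a short read.
bool readRecord(std::istream& in, PackedStudent& s) {
    return static_cast<bool>(in.read(reinterpret_cast<char*>(&s), sizeof s));
}

// Hypothetical helper: builds one record the way a member-by-member writer would.
std::string makeRecord(const char* name, int q1, int q2, int q3) {
    char buf[25] = {};                          // zero-filled fixed-size name field
    std::strncpy(buf, name, sizeof buf - 1);
    std::string record(buf, sizeof buf);
    for (int q : {q1, q2, q3})
        record.append(reinterpret_cast<const char*>(&q), sizeof q);
    assert(record.size() == sizeof(PackedStudent));
    return record;
}
```

With a real file, the loop would be while (readRecord(file, stud)) { ... }. Keep in mind that the int members of a packed struct may sit at unaligned addresses, so it is safer to copy them into ordinary variables than to pass around pointers or references to them.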

+2

ahans Mar 21 '13 at 9:23

There are several ways to solve the problem of this thread. Here is a solution based on using a union of a structure and char buf:

    #include <cstdlib>   // exit
    #include <fstream>
    #include <iomanip>
    #include <iostream>  // std::cerr
    #include <sstream>
    #include <string>

    /*
    This is the main idea of the technique: Put the struct inside a union.
    And then put a char array that is the number of chars needed for the struct.
    union causes sStudent and buf to be at the exact same place in memory.
    They overlap each other!
    */
    union uStudent
    {
        struct sStudent
        {
            char name[25];
            int quiz1;
            int quiz2;
            int quiz3;
        } field;

        char buf[ sizeof(sStudent) ];  // sizeof calcs the number of chars needed
    };

    void create_data_file(std::fstream& file, uStudent* oStudent, int idx)
    {
        if (idx < 0)
        {
            // index passed beginning of oStudent array. Return to start processing.
            return;
        }

        // have not yet reached idx = -1. Recurse first, so records are written in order
        create_data_file(file, oStudent, idx - 1);

        // write a record
        file.write(oStudent[idx].buf, sizeof(uStudent));

        // return to write another record or to finish
        return;
    }

    std::string read_in_data_file(std::fstream& file, std::stringstream& strm_buf)
    {
        // allocate a buffer of the correct size
        uStudent temp_student;

        // read in to buffer
        file.read( temp_student.buf, sizeof(uStudent) );

        // at end of file?
        if (file.eof())
        {
            // finished
            return strm_buf.str();
        }

        // not at end of file. Stuff buf for display
        strm_buf << std::setw(25) << std::left << temp_student.field.name;
        strm_buf << std::setw(5) << std::right << temp_student.field.quiz1;
        strm_buf << std::setw(5) << std::right << temp_student.field.quiz2;
        strm_buf << std::setw(5) << std::right << temp_student.field.quiz3;
        strm_buf << std::endl;

        // tail recurse and see whether at end of file
        return read_in_data_file(file, strm_buf);
    }

    std::string quiz(void)
    {
        /*
        declare and initialize array of uStudent to facilitate writing out the
        data file and then demonstrating reading it back in.
        */
        uStudent oStudent[] =
        {
            {{"Bart Simpson", 75, 65, 70}},
            {{"Ralph Wiggum", 35, 60, 44}},
            {{"Lisa Simpson", 100, 98, 91}},
            {{"Martin Prince", 99, 98, 99}},
            {{"Milhouse Van Houten", 80, 87, 79}}
        };

        std::fstream file;

        // ios::trunc causes the file to be created if it does not already exist.
        // ios::trunc also causes the file to be empty if it does already exist.
        file.open("quizzes.dat", std::ios::in | std::ios::out | std::ios::binary | std::ios::trunc);

        if ( ! file.is_open() )
        {
            std::cerr << "File did not open" << std::endl;
            exit(1);
        }

        // create the data file
        int num_elements = sizeof(oStudent) / sizeof(uStudent);
        create_data_file(file, oStudent, num_elements - 1);

        // Don't forget
        file.flush();

        /*
        We wrote actual integers. So, you cannot check the file so easily by
        just using a common text editor such as Windows Notepad. You would need
        an editor that shows hex values or something similar. An integrated
        development environment (IDE) is likely to have such an editor.
        Of course, not always so.
        */

        /*
        Now, read the file back in for display. Reading into a string buffer
        for display all at once. Can modify code to display the string buffer
        wherever you want.
        */

        // make sure at beginning of file
        file.seekg(0, std::ios::beg);

        std::stringstream strm_buf;
        strm_buf.str( read_in_data_file(file, strm_buf) );

        file.close();

        return strm_buf.str();
    }

Call quiz() and get back a string formatted for display with std::cout, for writing to a file, or whatever else.

The basic idea is that all members of a union begin at the same address in memory. That way you can have a char or wchar_t buf that is the same size as the structure you want to write to or read from a file. And note that you need zero casts. There is not a single cast in the code.

I also did not have to worry about padding, since the file is written and read back with the same in-memory layout.

For those who don't like recursion, sorry. Working with recursion is easier and less error prone. Maybe not easier for others? Recursions can be converted to loops. And they will need to be converted into loops for very large files.
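For readers who prefer loops, here is a rough loop-based sketch of the same "write whole records, read whole records" idea. It is not the code above: it swaps the union for a reinterpret_cast on a plain struct, and writeAll, readAll and roundTripSelfCheck are names invented for the example. An in-memory stream stands in for the file.

```cpp
#include <cassert>
#include <cstring>
#include <istream>
#include <ostream>
#include <sstream>
#include <vector>

struct Student {
    char name[25];
    int quiz1;
    int quiz2;
    int quiz3;
};

// Writes every record as a whole struct, padding bytes included.
void writeAll(std::ostream& out, const std::vector<Student>& students) {
    for (const Student& s : students)
        out.write(reinterpret_cast<const char*>(&s), sizeof s);
}

// Reads whole structs until a read fails (end of file or a short record).
std::vector<Student> readAll(std::istream& in) {
    std::vector<Student> students;
    Student s;
    while (in.read(reinterpret_cast<char*>(&s), sizeof s))
        students.push_back(s);
    return students;
}

// Usage sketch: round trip through an in-memory stream.
void roundTripSelfCheck() {
    std::vector<Student> original = {
        {"Bart Simpson", 75, 65, 70},
        {"Milhouse Van Houten", 80, 87, 79},
    };
    std::stringstream file;
    writeAll(file, original);
    // Each record occupies sizeof(Student) bytes, not the sum of the member sizes.
    assert(file.str().size() == original.size() * sizeof(Student));
    std::vector<Student> copy = readAll(file);
    assert(copy.size() == 2);
    assert(std::strcmp(copy[1].name, "Milhouse Van Houten") == 0);
    assert(copy[1].quiz2 == 87);
}
```

Note the consequence for the original question: a file produced this way has sizeof(Student) bytes per record (typically 40), so whole-struct reading only lines up with files that were also written as whole structs, not with the 185-byte file from the question.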

For those who love recursion, this is another example of using recursion.

I am not saying whether using a union is the best solution or not. It seems to be a solution. Maybe you like it?

0

Indinfer Sep 29 '16 at 2:41

Jasond · Accepted Answer · 2013-03-21T08:37:33+0000

Your structure has almost certainly been padded to preserve the alignment of its members. This means that it will not be 37 bytes, and that mismatch causes the reads to go out of sync. Judging by how each name loses 3 more characters than the one before, it seems it has been padded to 40 bytes.

Since the padding is likely to be between the string and the integers, even the first record is not read correctly.
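The numbers in the first output line can be reproduced by hand. The sketch below assumes 4-byte little-endian integers, zero-filled name fields, and a struct layout with quiz1, quiz2 and quiz3 at offsets 28, 32 and 36; first40Bytes and intAt are hypothetical helpers, not part of the question's code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <initializer_list>
#include <string>

// The first 40 bytes of the file: Bart's 37-byte record plus "Ral" from the next name.
std::string first40Bytes() {
    std::string bytes;
    char name[25] = "Bart Simpson";              // rest of the field is zero-filled
    bytes.append(name, sizeof name);
    for (std::int32_t q : {75, 65, 70})
        bytes.append(reinterpret_cast<const char*>(&q), sizeof q);
    bytes.append("Ral", 3);                      // start of "Ralph Wiggum"
    assert(bytes.size() == 40);
    return bytes;
}

// Interprets the four bytes at `offset` as an int, as a whole-struct read would.
std::int32_t intAt(const std::string& bytes, std::size_t offset) {
    std::int32_t value;
    std::memcpy(&value, bytes.data() + offset, sizeof value);
    return value;
}
```

Under those assumptions, the scores really live at offsets 25, 29 and 33, but the padded struct looks at 28, 32 and 36. There it finds 16640 (65 shifted up by one byte), 17920 (70 shifted up by one byte) and 1818317312 (a zero byte followed by the letters "Ral"), which concatenate to the 16640179201818317312 shown after "Bart Simpson" in the question.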

In this case, I would recommend that you do not try to read your data as a binary blob and instead stick to reading the individual fields. That is far more robust, especially if you ever want to change your structure.
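One possible way to tidy up the field-by-field approach is to wrap it in a function and loop on the success of the read rather than on eof(). This is a sketch only: readStudent, loadStudents and selfCheck are invented names, and an in-memory stream stands in for quizzes.dat.

```cpp
#include <cassert>
#include <cstring>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

struct Student {
    char name[25];
    int quiz1;
    int quiz2;
    int quiz3;
};

// Reads one record member by member, so struct padding never matters.
// Returns false as soon as any read fails (normally: end of file).
bool readStudent(std::istream& in, Student& s) {
    in.read(s.name, sizeof s.name);
    in.read(reinterpret_cast<char*>(&s.quiz1), sizeof s.quiz1);
    in.read(reinterpret_cast<char*>(&s.quiz2), sizeof s.quiz2);
    in.read(reinterpret_cast<char*>(&s.quiz3), sizeof s.quiz3);
    return static_cast<bool>(in);
}

// Collects every complete record in the stream.
std::vector<Student> loadStudents(std::istream& in) {
    std::vector<Student> students;
    Student s;
    while (readStudent(in, s))      // no priming read, no eof() test
        students.push_back(s);
    return students;
}

// Usage sketch with two identical records built the way the file stores them.
void selfCheck() {
    char name[25] = "Lisa Simpson";
    int quizzes[3] = {100, 98, 91};
    std::string record(name, sizeof name);
    record.append(reinterpret_cast<const char*>(quizzes), sizeof quizzes);

    std::istringstream in(record + record);
    std::vector<Student> students = loadStudents(in);
    assert(students.size() == 2);
    assert(std::strcmp(students[1].name, "Lisa Simpson") == 0);
    assert(students[1].quiz3 == 91);
}
```

Because the loop condition is the read itself, a truncated final record is simply dropped instead of being printed with stale data, and the duplicated block of read calls from the question disappears.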
