Writing a class / structure that changes frequently

Summary:
I have a structure that is read / written to a file.
This structure often changes, and this causes my read() function to become complicated.

I need to find a good way to handle these changes while minimizing the number of errors. Ideally, the code should also make it easy to see what changed between versions.

I have thought of a couple of patterns, but I don’t know whether I have considered all the possible options.

As you will see, the code has so far been mostly C-like, but I am converting it to C++.


More details
As I said, my structure often changes (in almost every version of the program).

  • Some members are deleted, some are added, and some become more complex. This is not the simple case where a new member is merely appended to the structure.

So far, changes in the structure have been processed as follows:

  • in version_1, I used a color map table:

    struct Obj
    {
        int color_index;
    };

    void Read_Obj( File *f, Obj *o )
    {
        f->read( f, &o->color_index );
    }

    void Write_Obj( File *f, Obj *o )
    {
        f->write( f, o->color_index );
    }

  • in the next version, I changed it to the form [r, g, b]:

    struct Obj
    {
        int color_r;
        int color_g;
        int color_b;
    };

    void Read_Obj( File *f, Obj *o )
    {
        if( f->version() == File::Version1 )
        {
            int color_index;
            f->read( f, &color_index );
            ColorIndex_to_RGB( o, color_index ); // we used color maps back then
        }
        else
        {
            f->read( f, &o->color_r );
            f->read( f, &o->color_g );
            f->read( f, &o->color_b );
        }
    }

    void Write_Obj( File *f, Obj *o )
    {
        f->write( f, o->color_r );
        f->write( f, o->color_g );
        f->write( f, o->color_b );
    }

[short note]

Note that I know I could use

    void Read_Obj( File *f, Obj *o )
    {
        if( f->version() == File::Version1 )
        {
            Read_Obj_V1( f, o );
        }
        else
        {
            Read_Obj_V2( f, o );
        }
    }

but this leads to code duplication between the sub-functions, since in real life only 1-2 of ~20 structure members change in each version. The other 18 lines would be copied unchanged.

Of course, I can change this policy if there is a good reason to.

[end of short note]


Now these structures have become complex, and I need to convert them to a class and work in a more object-oriented way.

I have seen a pattern in which you use one class per old version to do the reading, and then convert the data to the new class.

    class Obj;  // forward declaration, needed by Obj_v1::convert_to()

    class Obj_v1
    {
        int m_color_index;
    public:
        void read( File *f ) { f->read( f, &m_color_index ); }
        void convert_to( Obj * ) { /* code to convert the older object */ }
    };

    class Obj
    {
        int m_r;
        int m_g;
        int m_b;
    public:
        void read( File *f )
        {
            f->read( f, &m_r );
            f->read( f, &m_g );
            f->read( f, &m_b );
        }
        void write( File *f )
        {
            f->write( f, m_r );
            f->write( f, m_g );
            f->write( f, m_b );
        }
    };

    void Read_Obj( File *f, Obj *o )
    {
        if( f->version() == File::Version1 )
        {
            Obj_v1 old;  // not "Obj_v1 old();", which would declare a function
            old.read( f );
            old.convert_to( o );
        }
        else
        {
            o->read( f );
        }
    }

    void Write_Obj( File *f, Obj *o )
    {
        o->write( f );
    }

However, there are two strategies for applying it:

Strategy 1 : direct conversions

    void Read_Obj( File *f, Obj *o )
    {
        if( f->version() == File::Version1 )
        {
            Obj_v1 old;
            old.read( f );
            old.convert_to( o );
        }
        else if( f->version() == File::Version2 )
        {
            Obj_v2 old;
            old.read( f );
            old.convert_to( o );
        }
        else
        {
            o->read( f );
        }
    }

Disadvantages:

  • You must update convert_to() in every Obj_vX class each time you change the Obj class. That is a lot of opportunities for errors with each change.

Advantages:

  • You can always map the old concept (structure) directly onto the new one. Compare with the cascading strategy (below), where some information may be lost along the way and therefore can no longer be used.

Strategy 2 : cascading transformations

    void Read_Obj( File *f, Obj *o )
    {
        Obj_v1 o1;
        Obj_v2 o2;

        if( f->version() == File::Version1 )
        {
            o1.read( f );
            o1.convert_to( &o2 );
            o2.convert_to( o );
        }
        else if( f->version() == File::Version2 )
        {
            o2.read( f );
            o2.convert_to( o );
        }
        else
        {
            o->read( f );
        }
    }

Disadvantages:

  • v1 may contain information that was useless in v3 but that v5 could use; by then, the cascading conversions have already discarded it.

  • Objects from older versions take longer to create.

Advantages:

  • You only need to write one convert_to() each time you change the Obj class. A single error in one converter in the chain can have more serious consequences, though, and can break the consistency of the database. On the other hand, you are more likely to notice such an error.

Concern:

  • Could conversion after conversion accumulate so much noise in objects from older versions that they end up wrong?

Questions:

  • Are there any other patterns that do this better?

  • For those of you who have experience with the approaches I propose: what do you think of my concerns about the implementations above?

  • What are the preferred solutions?

Thank you very much

2 answers

    void Read_Obj( File *f, Obj *o ) {
        if( f->version() == File::Version1 ) {

That if is a hidden switch/case, so to speak. And a switch/case in C++ is generally interchangeable with polymorphism. Example:

    struct Reader
    {
        virtual ~Reader() {}
        virtual void Read_Obj( File *f, Obj *o ) = 0;
        /* methods to read further objects */
    };

    struct ReaderV1 : public Reader
    {
        void Read_Obj( File *f, Obj *o ) { /* ... */ }
        /* methods to read further objects */
    };

    struct ReaderV2 : public Reader
    {
        void Read_Obj( File *f, Obj *o ) { /* ... */ }
        /* methods to read further objects */
    };

Then create an instance of the appropriate Reader subclass after opening the file and determining its version number. This way you have only one file-version check, in the top-level code, instead of polluting all the low-level code with checks.

If some code is shared between file versions, you can conveniently put it in the Reader base class.
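As a rough sketch of the above (not the asker's real code): the version is checked once, in a factory, and the shared parsing helper lives in the base class. It assumes C++14, a whitespace-separated text format read through std::istream instead of the question's File type, and a made-up two-entry color map; make_reader, read_int and color_index_to_rgb are hypothetical names.

```cpp
#include <cassert>
#include <istream>
#include <memory>
#include <sstream>
#include <stdexcept>

// Current in-memory representation (always the latest layout).
struct Obj {
    int r = 0, g = 0, b = 0;
};

// Hypothetical stand-in for ColorIndex_to_RGB(): a two-entry color map.
inline void color_index_to_rgb(int index, Obj& o) {
    assert(index == 0 || index == 1);
    o.r = o.g = o.b = (index == 0) ? 0 : 255;
}

class Reader {
public:
    virtual ~Reader() = default;
    virtual void read_obj(std::istream& in, Obj& o) = 0;

protected:
    // Code shared by all file versions lives in the base class.
    static int read_int(std::istream& in) {
        int v = 0;
        if (!(in >> v)) throw std::runtime_error("truncated input");
        return v;
    }
};

class ReaderV1 : public Reader {
public:
    void read_obj(std::istream& in, Obj& o) override {
        // Version 1 stored a single color-map index.
        color_index_to_rgb(read_int(in), o);
    }
};

class ReaderV2 : public Reader {
public:
    void read_obj(std::istream& in, Obj& o) override {
        // Version 2 stores the three channels directly.
        o.r = read_int(in);
        o.g = read_int(in);
        o.b = read_int(in);
    }
};

// The only place that inspects the version number.
inline std::unique_ptr<Reader> make_reader(int version) {
    switch (version) {
        case 1: return std::make_unique<ReaderV1>();
        case 2: return std::make_unique<ReaderV2>();
        default: throw std::runtime_error("unsupported file version");
    }
}
```

Each reader fills the current Obj directly, so no Obj_v1 or convert_to() is needed; supporting a new file version means adding one subclass and one case in the factory.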

I strongly recommend against the variant with class Obj_v1 and class Obj, where the read() method belongs to Obj itself. That way you easily end up with circular dependencies, and it is also a bad idea to make an object aware of its persistent representation. IME (in my experience), it is best to have a separate, third-party class hierarchy responsible for this. (As with std::iostream vs. std::string vs. operator<<: the stream does not know about the string, the string does not know about the stream, only operator<< knows both.)

Otherwise, I personally do not see much difference between your "Strategy 1" and "Strategy 2". Both use convert_to(), which I personally consider superfluous. IME, the solution with polymorphism should be used instead: it converts everything directly into the current version of class Obj, without the intermediate class Obj_v1 and class Obj_v2. Since polymorphism gives you a dedicated reading function for each version, it is easy to ensure that objects are correctly reconstructed from the information read.

Are there any other patterns that do this better? For those of you who have experience with the approaches I propose: what do you think of my concerns about the implementations above? What are the preferred solutions?

This is exactly the kind of problem polymorphism was meant to solve, and it is how I usually handle such tasks myself.

This is related to object serialization, but I have not yet seen a serialization framework (my information is probably outdated) that is capable of supporting multiple versions of the same class.

I have personally ended up several times with the following hierarchy of serialization / deserialization classes:

  • an abstract reader interface (very thin by definition)
  • utility classes that implement reading and writing of the actual objects from / to streams (fat, very reusable code, also used for network transfers)
  • versioned implementations of the reader interface (relatively thin, reusing the utility classes)
  • a writer interface / class (I always wrote the latest version of the file; versions only mattered while reading)
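A minimal sketch of how these four layers might fit together, under stated assumptions: C++14, a text stream instead of a binary file, and a made-up v1 layout that stores a single gray level. StreamIo, save, load and kCurrentVersion are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <istream>
#include <memory>
#include <ostream>
#include <sstream>
#include <stdexcept>

struct Obj {
    int r = 0, g = 0, b = 0;
};

// Utility layer: moves primitives to/from a stream, knows nothing about versions.
struct StreamIo {
    static void put(std::ostream& out, int v) { out << v << ' '; }
    static int get(std::istream& in) {
        int v = 0;
        if (!(in >> v)) throw std::runtime_error("truncated input");
        return v;
    }
};

// Thin abstract reader interface.
struct Reader {
    virtual ~Reader() = default;
    virtual Obj read_obj(std::istream& in) = 0;
};

// Versioned readers: thin, built on the utility layer.
struct ReaderV1 : Reader {
    Obj read_obj(std::istream& in) override {
        const int gray = StreamIo::get(in);  // assumed v1 layout: one gray level
        Obj o;
        o.r = o.g = o.b = gray;
        return o;
    }
};

struct ReaderV2 : Reader {
    Obj read_obj(std::istream& in) override {
        Obj o;
        o.r = StreamIo::get(in);
        o.g = StreamIo::get(in);
        o.b = StreamIo::get(in);
        return o;
    }
};

constexpr int kCurrentVersion = 2;

// Single writer: always emits the newest format, prefixed by its version number.
inline void save(std::ostream& out, const Obj& o) {
    StreamIo::put(out, kCurrentVersion);
    StreamIo::put(out, o.r);
    StreamIo::put(out, o.g);
    StreamIo::put(out, o.b);
}

// Loader: reads the version header once, then delegates to the matching reader.
inline Obj load(std::istream& in) {
    std::unique_ptr<Reader> reader;
    switch (StreamIo::get(in)) {
        case 1: reader = std::make_unique<ReaderV1>(); break;
        case 2: reader = std::make_unique<ReaderV2>(); break;
        default: throw std::runtime_error("unsupported file version");
    }
    return reader->read_obj(in);
}
```

Because only the newest format is ever written, old layouts need read support only, and a file is upgraded simply by loading and saving it again.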

Hope this helps.


You might be able to use Google Protocol Buffers.

The main idea behind protobuf is to decouple the actual serialization from the class information, because you create a class dedicated to serialization... but the real benefit lies elsewhere.

Information encoded by protobuf is naturally both backward and forward compatible: if you add a field and then decode an old file, the new information is simply absent. Conversely, if you remove a field, it is skipped during decoding.

This means you leave the version handling to protobuf (without any actual version number, in fact), and then when changing the class:

  • you stop reading the information you no longer need
  • you add new fields for the new data you have.

It can also help you think more carefully about what to save and in which format: you are free to convert data before saving (encoding) and to transform it when reading (decoding), so the stored format itself should change less often (you add fields, but you rarely need to restructure already encoded data).
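To illustrate why tagged fields tolerate added and removed members, here is a toy sketch. It is not the protobuf API or wire format, only the underlying idea: every value is stored next to a numeric tag, the decoder picks the tags it knows, ignores the rest, and falls back to a default for missing ones. The tag numbers, the text encoding, and the names read_fields, write_fields and decode are all assumptions made for this example.

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <ostream>
#include <sstream>

// Hypothetical field tags; a tag is never reused once it has been assigned.
enum Tag { kColorIndex = 1, kRed = 2, kGreen = 3, kBlue = 4, kAlpha = 5 };

using Fields = std::map<int, int>;  // tag -> value

// Encode every field as a "tag value" pair.
inline void write_fields(std::ostream& out, const Fields& fields) {
    for (const auto& kv : fields) out << kv.first << ' ' << kv.second << ' ';
}

// Read all pairs, including tags this build has never heard of.
inline Fields read_fields(std::istream& in) {
    Fields fields;
    int tag = 0, value = 0;
    while (in >> tag >> value) fields[tag] = value;
    return fields;
}

struct Obj {
    int r = 0, g = 0, b = 0;
};

// Decoder for the current Obj: unknown tags are ignored, missing tags get a default.
inline Obj decode(const Fields& f) {
    auto get = [&f](int tag, int fallback) {
        auto it = f.find(tag);
        return it == f.end() ? fallback : it->second;
    };
    Obj o;
    o.r = get(kRed, 0);
    o.g = get(kGreen, 0);
    o.b = get(kBlue, 0);
    return o;
}
```

A file written by a newer program that also stores kAlpha still decodes here, and so does an older file that lacks kGreen and kBlue; neither case needs a version number.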
