How to check whether a class file has been modified before serializing it?

We have our own serialization process for a large number of C# types. However, regenerating the serialization information for every class / type takes a lot of time, so we plan to optimize the process by calculating a hash of each file: if the hash differs from the previous one, we generate the serialized output, otherwise we skip it. EDIT: We can store the hashes in a dictionary that is written to a file and read back during processing. This is a workable idea.

Our current serialization processor works as follows. We add the types that need to be serialized to a repo:

SerializerRepo.Add(typeof(MyType)); //Add type to be serialized to a repo 

And then (maybe elsewhere in the code) the serializer processes the repo and outputs custom XML, etc.:

 Serializer.WriteXML(SerializerRepo.GetTypes()); 

WriteXML goes through each type and outputs an XML file for each type in a specific location. I need to optimize the WriteXML method so that it only serializes a class / type if it has changed, and otherwise leaves the existing output alone.

This may not be the best way to do it, and I am open to refactoring. However, the immediate problem is how to determine whether the definition of the class (or the file that the class / type is in) has changed, in order to decide whether the XML should be regenerated.

There is no inherent relationship between a type and its source file: because a class can be partial, .NET has no mapping from types to class files or vice versa. We do not use partial classes, but in our case we still seem to need two (otherwise unrelated) pieces of information: the file containing the type / class and the type itself.

Two (possibly suboptimal) ideas so far:

  • Either we specify the file name along with the type. But this will not survive any refactoring in which the file is renamed.

  • Or we manually read each .cs file, parse it for public class <classname>, and map it to each type. This seems like a huge overhead, and I am not sure it is a reliable way to do this.

These are the only two ideas I have, and neither is concrete. Suggestions?

1 answer

Separate the generation of XML from saving it to disk.

Keep a dictionary that maps fully qualified class names to hashes. On the first run the dictionary starts out empty.

When it is time to make sure that the XML for a class is up to date on disk, generate its XML in memory, hash it, and check the hash against the dictionary. If the class name is not in the dictionary, or if its hash does not match the hash in the dictionary, save the generated XML and update the dictionary with the new hash.
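As a rough sketch of that check, assuming the XML for a type is already available as a string: the class and method names (XmlChangeTracker, Hash, SaveIfChanged) are made up for illustration and are not part of your serializer, and SHA-256 is just one reasonable choice of hash.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper; the names are illustrative only.
static class XmlChangeTracker
{
    // Hex-encoded SHA-256 of the generated XML text.
    public static string Hash(string xml)
    {
        using (var sha = SHA256.Create())
        {
            byte[] digest = sha.ComputeHash(Encoding.UTF8.GetBytes(xml));
            return BitConverter.ToString(digest).Replace("-", "");
        }
    }

    // Writes the XML only when the type is new or its hash has changed.
    // Returns true if the file was written.
    public static bool SaveIfChanged(Type type, string xml, string path,
                                     IDictionary<string, string> hashes)
    {
        string key = type.FullName; // fully qualified class name
        string newHash = Hash(xml);

        string oldHash;
        if (hashes.TryGetValue(key, out oldHash) && oldHash == newHash)
            return false; // same hash as last time, skip the write

        File.WriteAllText(path, xml);
        hashes[key] = newHash;
        return true;
    }
}
```

WriteXML would then call something like SaveIfChanged once per type instead of writing the file unconditionally.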

After you have gone through this process for all your types, you will have a complete dictionary of hashes. Save it to disk and load it the next time you run the program.
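Persisting the dictionary between runs could look something like this. It is only a sketch: HashStore and the one-entry-per-line, tab-separated file format are assumptions made for the example, and any format you can read back reliably would do.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper; the name and the file format are illustrative only.
static class HashStore
{
    // Loads "TypeName<TAB>hash" lines; returns an empty dictionary on the first run.
    public static Dictionary<string, string> Load(string path)
    {
        var hashes = new Dictionary<string, string>();
        if (!File.Exists(path))
            return hashes; // nothing saved yet

        foreach (string line in File.ReadAllLines(path))
        {
            string[] parts = line.Split('\t');
            if (parts.Length == 2)
                hashes[parts[0]] = parts[1];
        }
        return hashes;
    }

    // Writes one "TypeName<TAB>hash" line per entry, replacing any existing file.
    public static void Save(string path, IDictionary<string, string> hashes)
    {
        File.WriteAllLines(path, hashes.Select(p => p.Key + "\t" + p.Value));
    }
}
```

Call Load before processing the types and Save once all of them have been handled.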
