Generate an object hash consistently

I am trying to get the hash (md5 or sha) of an object.

I implemented this: http://alexmg.com/post/2009/04/16/Compute-any-hash-for-any-object-in-C.aspx

I am using NHibernate to retrieve my POCOs from the database.
When I call GetHash, the result is different every time the object is selected and hydrated from the database. I assume this is expected, since the underlying proxies will change.

Anyway,

Is there a way to get a hash of all the properties of an object that is consistent each time?

I toyed with the idea of using a StringBuilder over this.GetType().GetProperties()... and creating a hash from that, but that seems inefficient?

As an additional note, this is for tracking changes to these objects from a database (RDBMS) to a NoSQL store (comparing hashes to see whether objects have changed between the RDBMS and NoSQL).

+7
3 answers

If you do not override GetHashCode, you simply inherit Object.GetHashCode. For a reference type, Object.GetHashCode returns a value tied to the identity of the instance, not to its contents. Every time an object is loaded you get a new instance and, therefore, most likely a different hash code.

It is debatable whether that was the right thing to do, but that is what was implemented back in the day, so it cannot change now.

If you want something consistent, you need to override GetHashCode and compute the code from the "value" of the object (that is, its properties and/or fields). It can be as simple as a well-distributed combination of the hash codes of all properties/fields, or as complex as you need. If all you are looking for is something to distinguish between two different objects, then using a unique key on the object may work for you. If you are looking for change tracking, a unique key probably won't work.

I just use the hash codes of all the fields to create a reasonably well-distributed hash code for the parent. For example:

    public override int GetHashCode()
    {
        unchecked
        {
            int result = (Name != null ? Name.GetHashCode() : 0);
            result = (result * 397) ^ (Street != null ? Street.GetHashCode() : 0);
            result = (result * 397) ^ Age;
            return result;
        }
    }

Multiplying by the prime number 397 spreads the intermediate values so that the resulting hash codes are better distributed. See http://computinglife.wordpress.com/2008/11/20/why-do-hash-functions-use-prime-numbers/ for more information on using prime numbers in hash code calculations.

You could, of course, use reflection to get all the properties and do this, but that would be slower. Alternatively, you could use CodeDOM to dynamically generate the hashing code from the reflected properties and cache that code (i.e., generate it once and reload it next time). But this, of course, is quite complex and may not be worth the effort.
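A lighter-weight way to get the "reflect once, then reuse" effect is to compile one getter per property with expression trees. This swaps in System.Linq.Expressions for the CodeDOM idea mentioned above; it is only a minimal sketch, and the `CachedPropertyHash` name is made up for illustration. It still relies on each property's own GetHashCode, so it is suitable for in-process comparison only.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical helper (not from the question or any library).
// The getters are built once per closed generic type and then reused.
public static class CachedPropertyHash<T>
{
    private static readonly Func<T, object>[] Getters = typeof(T)
        .GetProperties(BindingFlags.Public | BindingFlags.Instance)
        .Where(p => p.CanRead && p.GetIndexParameters().Length == 0)
        .OrderBy(p => p.Name, StringComparer.Ordinal) // fixed, name-based order
        .Select(p => BuildGetter(p))
        .ToArray();

    // Compiles "entity => (object)entity.Property" into a delegate.
    private static Func<T, object> BuildGetter(PropertyInfo property)
    {
        var entity = Expression.Parameter(typeof(T), "entity");
        var body = Expression.Convert(Expression.Property(entity, property), typeof(object));
        return Expression.Lambda<Func<T, object>>(body, entity).Compile();
    }

    // Combines the property hash codes with the same multiply-and-xor scheme as above.
    public static int Compute(T entity)
    {
        unchecked
        {
            int result = 17;
            foreach (var getter in Getters)
            {
                object value = getter(entity);
                result = (result * 397) ^ (value != null ? value.GetHashCode() : 0);
            }
            return result;
        }
    }
}
```

Because the getters are keyed on the static type T rather than on entity.GetType(), a proxy subclass should hash the same way as the plain class, provided the proxy returns the same property values.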

An MD5 hash or SHA or CRC is usually based on a data block. If you want this, then using the hash code of each property does not make sense. Perhaps serializing data into memory and computing a hash in this way will be more applicable, as Henk describes.
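As a rough illustration of the "serialize, then hash the bytes" approach, the sketch below writes each public property as a name=value line in a fixed order and runs SHA-256 over the UTF-8 bytes. The `ContentHash` class and `Sha256Of` method are hypothetical names, and the text format is an assumption rather than any standard: null and empty string look the same, and nested objects are only captured through their ToString().

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Reflection;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper; not part of NHibernate or the linked article.
public static class ContentHash
{
    public static string Sha256Of<T>(T entity)
    {
        // Use the static type T so a proxy subclass does not add extra properties.
        var properties = typeof(T)
            .GetProperties(BindingFlags.Public | BindingFlags.Instance)
            .Where(p => p.CanRead && p.GetIndexParameters().Length == 0)
            .OrderBy(p => p.Name, StringComparer.Ordinal); // deterministic order

        var text = new StringBuilder();
        foreach (var property in properties)
        {
            object value = property.GetValue(entity, null);
            // Invariant culture keeps numbers and dates identical across machines.
            text.Append(property.Name)
                .Append('=')
                .Append(Convert.ToString(value, CultureInfo.InvariantCulture))
                .Append('\n');
        }

        using (var sha = SHA256.Create())
        {
            byte[] digest = sha.ComputeHash(Encoding.UTF8.GetBytes(text.ToString()));
            return BitConverter.ToString(digest).Replace("-", ""); // hex string
        }
    }
}
```

Unlike GetHashCode, the digest depends only on the bytes fed in, so it can be stored next to the NoSQL copy and compared later from another process.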

+13

If this "hash" is used solely to determine whether entities have changed, the following algorithm may help (NB: it is untested and assumes that the same runtime will be used when generating the hashes; otherwise, relying on GetHashCode for the "simple" types is incorrect):

    public static byte[] Hash<T>(T entity)
    {
        var seen = new HashSet<object>();
        var properties = GetAllSimpleProperties(entity, seen);
        // Concatenate the 4-byte hash code of every simple property value.
        return properties
            .Select(p => BitConverter.GetBytes(p.GetHashCode()).AsEnumerable())
            .Aggregate((ag, next) => ag.Concat(next))
            .ToArray();
    }

    private static IEnumerable<object> GetAllSimpleProperties<T>(T entity, HashSet<object> seen)
    {
        foreach (var property in PropertiesOf<T>.All(entity))
        {
            if (property is int || property is long || property is string ...)
                yield return property;
            else if (seen.Add(property)) // Handle cyclic references
            {
                foreach (var simple in GetAllSimpleProperties(property, seen))
                    yield return simple;
            }
        }
    }

    // Caches one getter delegate per property of T.
    private static class PropertiesOf<T>
    {
        private static readonly List<Func<T, dynamic>> Properties = new List<Func<T, dynamic>>();

        static PropertiesOf()
        {
            foreach (var property in typeof(T).GetProperties())
            {
                var getMethod = property.GetGetMethod();
                // Caveat: binding a getter that returns a value type (e.g. int)
                // to Func<T, dynamic> may fail, because no boxing is inserted.
                var function = (Func<T, dynamic>)Delegate.CreateDelegate(typeof(Func<T, dynamic>), getMethod);
                Properties.Add(function);
            }
        }

        public static IEnumerable<dynamic> All(T entity)
        {
            // Null values are skipped, so they do not contribute to the hash.
            return Properties.Select(p => p(entity)).Where(v => v != null);
        }
    }

It could then be used as follows:

    var entity1 = LoadEntityFromRdbms();
    var entity2 = LoadEntityFromNoSql();
    var hash1 = Hash(entity1);
    var hash2 = Hash(entity2);
    Assert.IsTrue(hash1.SequenceEqual(hash2));
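
The "same runtime" caveat matters most for strings: string.GetHashCode is not guaranteed to return the same value across processes or .NET versions. If the hashes have to be stored or compared across processes, one option is to replace GetHashCode for strings with a hash you control. The sketch below uses 32-bit FNV-1a over the UTF-8 bytes; this is a swapped-in technique rather than part of the answer above, and `StableHash` is a hypothetical name.

```csharp
using System.Text;

// Hypothetical helper: a process-independent 32-bit hash for strings.
public static class StableHash
{
    public static uint Fnv1a32(string text)
    {
        const uint offsetBasis = 2166136261; // standard FNV-1a 32-bit offset basis
        const uint prime = 16777619;         // standard FNV 32-bit prime

        uint hash = offsetBasis;
        foreach (byte b in Encoding.UTF8.GetBytes(text))
        {
            unchecked
            {
                hash ^= b;     // xor the byte in first (the "1a" variant) ...
                hash *= prime; // ... then multiply, wrapping on overflow
            }
        }
        return hash;
    }
}
```

FNV-1a is not cryptographic, so it only suits change detection where accidental collisions are acceptable.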
+6

GetHashCode() returns an Int32 (not an MD5 hash).

If you create two objects with all the same property values, they will not have the same hash if you use the base (System.Object) GetHashCode().

String is an object too, but it is an exception:

    string s1 = "john";
    string s2 = "john";
    bool same = (s1 == s2); // true, and s1 and s2 return the same GetHashCode()

If you want to control how two objects are compared for equality, you must override both GetHashCode and Equals.

If two objects are equal, then they must also have the same GetHashCode(). But two objects with the same GetHashCode() are not necessarily equal. A comparison will check GetHashCode() first and, if that matches, it will then check Equals. There are a few comparisons that go straight to Equals, but you should still override both and make sure that two equal objects produce the same GetHashCode().
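
To make the "same hash, but not equal" case concrete, here is a small sketch with a deliberately coarse hash code. `GridPoint` is a made-up type for illustration only.

```csharp
using System.Collections.Generic;

// Hypothetical type: equality uses X and Y, but the hash code uses only X.
public sealed class GridPoint
{
    public int X { get; private set; }
    public int Y { get; private set; }

    public GridPoint(int x, int y) { X = x; Y = y; }

    public override bool Equals(object obj)
    {
        var other = obj as GridPoint;
        return other != null && X == other.X && Y == other.Y;
    }

    // Legal but coarse: (1,2) and (1,3) share a hash code without being equal.
    public override int GetHashCode() { return X; }
}
```

Adding new GridPoint(1, 2), new GridPoint(1, 3) and a second new GridPoint(1, 2) to a HashSet<GridPoint> leaves two items: the colliding hash codes only select the bucket, and Equals makes the final decision.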

I use this to synchronize a client with a server. You can use all the properties, or you can have any property change also change a VerID property. The advantage of the latter is a simpler, faster GetHashCode(). In my case, I was already updating VerID on any property change.

    public override bool Equals(Object obj)
    {
        // Check for null and compare run-time types.
        if (obj == null || !(obj is FTSdocWord)) return false;
        FTSdocWord item = (FTSdocWord)obj;
        return (ObjID == item.ObjID && VerID == item.VerID);
    }

    public override int GetHashCode()
    {
        return ObjID ^ VerID;
    }

I ended up using only ObjID for equality so that I can do the following:

    // Assumes == is overloaded to compare by ObjID; otherwise use Equals.
    if (myClientObj == myServerObj && myClientObj.VerID != myServerObj.VerID)
    {
        // need to synch
    }

Object.GetHashCode Method

Two objects with the same property values: are they equal? Do they produce the same GetHashCode()?

    personDefault pd1 = new personDefault("John");
    personDefault pd2 = new personDefault("John");
    System.Diagnostics.Debug.WriteLine(pd1.GetHashCode().ToString());
    System.Diagnostics.Debug.WriteLine(pd2.GetHashCode().ToString()); // different GetHashCode
    if (pd1.Equals(pd2)) // returns false
    {
        System.Diagnostics.Debug.WriteLine("pd1 == pd2");
    }
    List<personDefault> personsDefault = new List<personDefault>();
    personsDefault.Add(pd1);
    if (personsDefault.Contains(pd2)) // returns false
    {
        System.Diagnostics.Debug.WriteLine("Contains(pd2)");
    }

    personOverRide po1 = new personOverRide("John");
    personOverRide po2 = new personOverRide("John");
    System.Diagnostics.Debug.WriteLine(po1.GetHashCode().ToString());
    System.Diagnostics.Debug.WriteLine(po2.GetHashCode().ToString()); // same hash
    if (po1.Equals(po2)) // returns true
    {
        System.Diagnostics.Debug.WriteLine("po1 == po2");
    }
    List<personOverRide> personsOverRide = new List<personOverRide>();
    personsOverRide.Add(po1);
    if (personsOverRide.Contains(po2)) // returns true
    {
        System.Diagnostics.Debug.WriteLine("Contains(po2)");
    }

    // Inherits reference-based Equals and GetHashCode from Object.
    public class personDefault
    {
        public string Name { get; private set; }
        public personDefault(string name) { Name = name; }
    }

    // Overrides both members so that equality is based on Name.
    public class personOverRide : Object
    {
        public string Name { get; private set; }
        public personOverRide(string name) { Name = name; }

        public override bool Equals(Object obj)
        {
            // Check for null and compare run-time types.
            if (obj == null || !(obj is personOverRide)) return false;
            personOverRide item = (personOverRide)obj;
            return (Name == item.Name);
        }

        public override int GetHashCode()
        {
            return Name.GetHashCode();
        }
    }
-1