Checking equivalence of two massive Python dictionaries

I have a massive Python dictionary with over 90,000 entries. For reasons that I won't go into, I need to store this dictionary in my database and then rebuild it from the database records later on.

I am trying to set up a procedure to verify that my storage and reconstruction are correct, and that the new dictionary is equivalent to the old one. What is the best way to test this?

There are small differences, and I want to find out what they are.

+7
3 answers

The most obvious approach, of course:

    if oldDict != newDict:
        print("**Failure to rebuild, new dictionary is different from the old")

This should be about as fast as it gets, since the comparison is done by Python's built-in dictionary comparison.
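To illustrate what the built-in comparison actually checks, here is a minimal sketch (Python 3 syntax; the variable names and sample data are made up for illustration). Two dictionaries compare equal when they have the same keys and every key maps to an equal value, regardless of insertion order. The second half shows one way a database round trip could plausibly break equality, assuming integer keys come back as strings.

```python
# Same contents, different insertion order, as might happen after a rebuild.
old_dict = {"a": 1, "b": [2, 3], "c": {"nested": True}}
new_dict = {"c": {"nested": True}, "b": [2, 3], "a": 1}

# Order does not matter; nested values are compared with == as well.
same = old_dict == new_dict

# Assumed failure mode: integer keys turned into strings by the round trip.
int_keys = {1: "x", 2: "y"}
str_keys = {"1": "x", "2": "y"}
keys_match = int_keys == str_keys

print(same, keys_match)
```

If the plain comparison fails, a type change like this is worth ruling out before looking at individual values.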

UPDATE: It seems you are not after "equal", but something weaker. I think you need to edit your question to make it clear what you mean by "equivalent".

+11

You can start with something like this and customize it to suit your needs.

    >>> import random
    >>> bigd = dict([(x, random.randint(0, 1024)) for x in xrange(90000)])
    >>> bigd2 = dict([(x, random.randint(0, 1024)) for x in xrange(90000)])
    >>> dif = set(bigd.items()) - set(bigd2.items())
+2
    >>> d1 = {'a':1,'b':2,'c':3}
    >>> d2 = {'b':2,'x':2,'a':5}
    >>> set(d1.iteritems()) - set(d2.iteritems())  # items in d1 not in d2
    set([('a', 1), ('c', 3)])
    >>> set(d2.iteritems()) - set(d1.iteritems())  # items in d2 not in d1
    set([('x', 2), ('a', 5)])
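The set difference on items needs every value to be hashable, and it reports a changed value as two unrelated tuples. As a possible alternative, here is a rough sketch that compares by key instead and sorts the differences into removed, added, and changed entries. It is written in Python 3, and `diff_dicts` is a hypothetical helper name, not something from the answers here or from the standard library.

```python
def diff_dicts(old, new):
    """Return (removed, added, changed) between two dictionaries.

    Hypothetical helper for illustration only.
    """
    old_keys = set(old)
    new_keys = set(new)
    # Keys that exist only in the old dictionary.
    removed = {k: old[k] for k in old_keys - new_keys}
    # Keys that exist only in the new dictionary.
    added = {k: new[k] for k in new_keys - old_keys}
    # Shared keys whose values differ, stored as (old value, new value).
    changed = {k: (old[k], new[k])
               for k in old_keys & new_keys
               if old[k] != new[k]}
    return removed, added, changed


d1 = {'a': 1, 'b': 2, 'c': 3}
d2 = {'b': 2, 'x': 2, 'a': 5}
removed, added, changed = diff_dicts(d1, d2)
# removed == {'c': 3}
# added   == {'x': 2}
# changed == {'a': (1, 5)}
```

Because only the keys go into sets, this also works when the values are lists or nested dictionaries.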

Edit: Don't upvote this answer. Go to "Quick Compare Two Python Dictionaries" and upvote there instead. That one is a very complete solution.

+1