eg:
>>> a = {'req_params': {'app': '12345', 'format': 'json'}, 'url_params': {'namespace': 'foo', 'id': 'baar'}, 'url_id': 'rest'} >>> b = {'req_params': { 'format': 'json','app': '12345' }, 'url_params': { 'id': 'baar' ,'namespace':'foo' }, 'url_id': 'REST'.lower() } >>> a == b True
What is a good hash function to create equal hashes for both dicts? Dictionaries will have basic data types such as int, list, dict and string, no other objects.
It would be great if the hash was optimized in space, the target set is about 5 million objects, so the likelihood of a collision is pretty low.
I'm not sure if json.dumps or other serializers respect equality instead of the structure of members in a dictionary. eg. Basic hashing using str dict does not work:
>>> a = {'name':'Jon','class':'nine'} >>> b = {'class':'NINE'.lower(),'name':'Jon'} >>> str(a) "{'name': 'Jon', 'class': 'nine'}" >>> str(b) "{'class': 'nine', 'name': 'Jon'}"
json.dumps doesn't work either:
>>> import json,hashlib >>> a = {'name':'Jon','class':'nine'} >>> b = {'class':'NINE'.lower(),'name':'Jon'} >>> a == b True >>> ha = hashlib.sha256(json.dumps(a)).hexdigest() >>> hb = hashlib.sha256(json.dumps(b)).hexdigest() >>> ha '545af862cc4d2dd1926fe0aa1e34ad5c3e8a319461941b33a47a4de9dbd7b5e3' >>> hb '4c7d8dbbe1f180c7367426d631410a175d47fff329d2494d80a650dde7bed5cb'
python
Dhruvpathak
source share