How to implement an efficient bidirectional hash table?

Python dict is a very useful data structure:

 d = {'a': 1, 'b': 2} d['a'] # get 1 

Sometimes you also need to index by value.

 d[1] # get 'a' 

What is the most efficient way to implement this data structure? Any official recommend way to do this?

+63
python hashtable bidirectional
Jul 23 '10 at 13:38
source share
6 answers

Here's a dict bidirectional class, inspired by finding a key by value in a Python dictionary and modified to allow the following 2) and 3).

Note:

  • 1) The bd.inverse back directory bd.inverse automatically updated when the standard dict bd changes.
  • 2) The reverse directory bd.inverse[value] always a list of key , such that bd[key] == value .
  • 3) Unlike the bidict module from https://pypi.python.org/pypi/bidict , here we can have 2 keys of the same value, this is very important.

The code:

 class bidict(dict): def __init__(self, *args, **kwargs): super(bidict, self).__init__(*args, **kwargs) self.inverse = {} for key, value in self.items(): self.inverse.setdefault(value,[]).append(key) def __setitem__(self, key, value): if key in self: self.inverse[self[key]].remove(key) super(bidict, self).__setitem__(key, value) self.inverse.setdefault(value,[]).append(key) def __delitem__(self, key): self.inverse.setdefault(self[key],[]).remove(key) if self[key] in self.inverse and not self.inverse[self[key]]: del self.inverse[self[key]] super(bidict, self).__delitem__(key) 

Usage example:

 bd = bidict({'a': 1, 'b': 2}) print(bd) # {'a': 1, 'b': 2} print(bd.inverse) # {1: ['a'], 2: ['b']} bd['c'] = 1 # Now two keys have the same value (= 1) print(bd) # {'a': 1, 'c': 1, 'b': 2} print(bd.inverse) # {1: ['a', 'c'], 2: ['b']} del bd['c'] print(bd) # {'a': 1, 'b': 2} print(bd.inverse) # {1: ['a'], 2: ['b']} del bd['a'] print(bd) # {'b': 2} print(bd.inverse) # {2: ['b']} bd['b'] = 3 print(bd) # {'b': 3} print(bd.inverse) # {2: [], 3: ['b']} 
+49
Feb 19 '14 at 22:34
source share

You can use the same dict by adding a key pair, the value in the reverse order.

 d = {'a': 1, 'b': 2}
 revd = dict ([reversed (i) for i in d.items ()])
 d.update (revd)
+36
Jul 23 '10 at 13:59 on
source share

A poor person's bidirectional hash table would have to use only two dictionaries (these are already configured data structures).

The index also has a bidict package:

The source for bidict can be found on github:

+32
Jul 23 '10 at 13:41
source share

Below a code snippet implements a reversible (bijective) mapping:

 class BijectionError(Exception): """Must set a unique value in a BijectiveMap.""" def __init__(self, value): self.value = value msg = 'The value "{}" is already in the mapping.' super().__init__(msg.format(value)) class BijectiveMap(dict): """Invertible map.""" def __init__(self, inverse=None): if inverse is None: inverse = self.__class__(inverse=self) self.inverse = inverse def __setitem__(self, key, value): if value in self.inverse: raise BijectionError(value) self.inverse._set_item(value, key) self._set_item(key, value) def __delitem__(self, key): self.inverse._del_item(self[key]) self._del_item(key) def _del_item(self, key): super().__delitem__(key) def _set_item(self, key, value): super().__setitem__(key, value) 

The advantage of this implementation is that the inverse attribute for BijectiveMap again BijectiveMap . Therefore, you can do things like:

 >>> foo = BijectiveMap() >>> foo['steve'] = 42 >>> foo.inverse {42: 'steve'} >>> foo.inverse.inverse {'steve': 42} >>> foo.inverse.inverse is foo True 
+2
Dec 25 '15 at 5:12
source share

Something like this might be:

 import itertools class BidirDict(dict): def __init__(self, iterable=(), **kwargs): self.update(iterable, **kwargs) def update(self, iterable=(), **kwargs): if hasattr(iterable, 'iteritems'): iterable = iterable.iteritems() for (key, value) in itertools.chain(iterable, kwargs.iteritems()): self[key] = value def __setitem__(self, key, value): if key in self: del self[key] if value in self: del self[value] dict.__setitem__(self, key, value) dict.__setitem__(self, value, key) def __delitem__(self, key): value = self[key] dict.__delitem__(self, key) dict.__delitem__(self, value) def __repr__(self): return '%s(%s)' % (type(self).__name__, dict.__repr__(self)) 

You need to decide what you want if more than one key has a given value; the bi-directionality of a given pair can be easily tailored by any later pair that you inserted. I implemented one of the possible options.




Example:

 bd = BidirDict({'a': 'myvalue1', 'b': 'myvalue2', 'c': 'myvalue2'}) print bd['myvalue1'] # a print bd['myvalue2'] # b 
+1
Jul 23 '10 at 13:58
source share

First, you need to make sure that the key to match the values ​​is one to one, otherwise it is impossible to build a bidirectional map.

Secondly, how large is the data set? If there is not much data, just use 2 separate cards and update them both when updating. Or better, use an existing solution, for example Bidict , which is just a shell of 2 dictations, with built-in update / delete.

But if the dataset is large and 2 dicts support is undesirable:

  • If both the key and value are numeric, consider using Interpolation to approximate the mapping. If the vast majority of key-value pairs can be covered by the display function (and its inverse function), then you only need to record the outliers on the cards.

  • If most of the access is unidirectional (key-> value), then it is completely ok to build a reverse map gradually to trade time for
    space.

the code:

 d = {1: "one", 2: "two" } reverse = {} def get_key_by_value(v): if v not in reverse: for _k, _v in d.items(): if _v == v: reverse[_v] = _k break return reverse[v] 
0
Dec 25 '15 at 7:42
source share



All Articles