How can I implement a dict with abstract base classes in Python?

I tried to implement the mapping in Python using the abstract base class MutableMapping, but I had an error creating the instance. How can I make a working version of this dictionary that will emulate the built-in dict class in as many ways as possible, using Abstract base classes ?

 >>> class D(collections.MutableMapping): ... pass ... >>> d = D() Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: Can't instantiate abstract class D with abstract methods __delitem__, __getitem__, __iter__, __len__, __setitem__ 

A good answer will demonstrate how to make this work, in particular, without a subclass of dict (a concept that I am familiar with ).

+6
source share
5 answers

How do I implement a dict with abstract base classes?

A good answer will demonstrate how to make this work, particularly without a dict subclass.

The error message appears here: TypeError: Can't instantiate abstract class D with abstract methods __delitem__, __getitem__, __iter__, __len__, __setitem__

It turns out that you need to use them to use the base base class (ABC), MutableMapping .

Implementation

So, I implement a mapping that works like a dict in most cases, which uses the attribute attribute of the object dict to match. (Delegation is not the same as inheritance, so we just delegate the __dict__ instance, we can use any other ad-hoc mapping, but you don't seem to need this part of the implementation. For this in Python 2, because MutableMapping does not have __slots__ in Python 2, so you create __dict__ anyway. In Python 3, you can generally avoid dicts by setting __slots__ .)

 import collections class D(collections.MutableMapping): ''' Mapping that works like both a dict and a mutable object, ie d = D(foo='bar') and d.foo returns 'bar' ''' # ``__init__`` method required to create instance from class. def __init__(self, *args, **kwargs): '''Use the object dict''' self.__dict__.update(*args, **kwargs) # The next five methods are requirements of the ABC. def __setitem__(self, key, value): self.__dict__[key] = value def __getitem__(self, key): return self.__dict__[key] def __delitem__(self, key): del self.__dict__[key] def __iter__(self): return iter(self.__dict__) def __len__(self): return len(self.__dict__) # The final two methods aren't required, but nice for demo purposes: def __str__(self): '''returns simple dict representation of the mapping''' return str(self.__dict__) def __repr__(self): '''echoes class, id, & reproducible representation in the REPL''' return '{}, D({})'.format(super(D, self).__repr__(), self.__dict__) 

Demonstration

And to demonstrate usage:

 >>> d = D((e, i) for i, e in enumerate('abc')) >>> d <__main__.D object at 0x7f75eb242e50>, D({'b': 1, 'c': 2, 'a': 0}) >>> da 0 >>> d.get('b') 1 >>> d.setdefault('d', []).append(3) >>> d.foo = 'bar' >>> print(d) {'b': 1, 'c': 2, 'a': 0, 'foo': 'bar', 'd': [3]} 

And to provide the dict API, the lesson learned is that you can always check collections.MutableMapping :

 >>> isinstance(d, collections.MutableMapping) True >>> isinstance(dict(), collections.MutableMapping) True 

Although a dict will always be an instance of MutableMapping due to registration when importing collections, the opposite is not always true:

 >>> isinstance(d, dict) False >>> isinstance(d, (dict, collections.MutableMapping)) True 

After completing this exercise, it is clear to me that using abstract base classes provides only a guarantee of a standard API for users of this class. In this case, users accepting the MutableMapping object will be guaranteed a standard Python API.

Warning:

The constructor method of the fromkeys class fromkeys not implemented.

 >>> dict.fromkeys('abc') {'b': None, 'c': None, 'a': None} >>> D.fromkeys('abc') Traceback (most recent call last): File "<stdin>", line 1, in <module> AttributeError: type object 'D' has no attribute 'fromkeys' 

You can mask inline dict methods such as get or setdefault

 >>> d['get'] = 'baz' >>> d.get('get') Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: 'str' object is not callable 

It's pretty simple to expose again:

 >>> del d['get'] >>> d.get('get', 'Not there, but working') 'Not there, but working' 

But I would not use this code in production.


Demo without dictation, Python 3:

 >>> class MM(MutableMapping): ... __delitem__, __getitem__, __iter__, __len__, __setitem__ = (None,) *5 ... __slots__ = () ... >>> MM().__dict__ Traceback (most recent call last): File "<stdin>", line 1, in <module> AttributeError: 'MM' object has no attribute '__dict__' 
+13
source

The fewest

You need to implement in your subclass all the abstract methods that you inherit from MutableMapping

 class D(MutableMapping): def __delitem__(self): ''' Your Implementation for deleting the Item goes here ''' raise NotImplementedError("del needs to be implemented") def __getitem__(self): ''' Your Implementation for subscripting the Item goes here ''' raise NotImplementedError("obj[index] needs to be implemented") def __iter__(self): ''' Your Implementation for iterating the dictionary goes here ''' raise NotImplementedError("Iterating the collection needs to be implemented") def __len__(self): ''' Your Implementation for determing the size goes here ''' raise NotImplementedError("len(obj) determination needs to be implemented") def __setitem__(self): ''' Your Implementation for determing the size goes here ''' raise NotImplementedError("obj[index] = item, needs to be implemented") >>> D() <__main__.D object at 0x0258CD50> 

Besides

You need to provide a data structure for storing your display (hash, AVL, Red Black) and a way to create a dictionary

+1
source

The best way to demonstrate this without actually using a dict anywhere is to probably implement something dead simple, very different from a dict , and not completely useless. Like a fixed size fixed size bytes to the same fixed size bytes . (You can use this for, for example, a routing table - it will be much more compact than dict matching unpacked keys with unpacked values, although, obviously, due to speed and flexibility.)

A hash table is just an array of tuples (hash, key, value) . Since the whole point of this is wrapping the data in, we squeeze it into a struct , that is, we can just use a large bytearray for storage. To mark the slot empty, we set its hash value to 0 , which means that we need to "escape" from any real 0 , turning it into 1 , which is stupid, but easier for the code. For simplicity, we will also use the dumbest possible probe algorithm.

 class FixedHashTable(object): hashsize = 8 def __init__(self, elementsize, size): self.elementsize = elementsize self.size = size self.entrysize = self.hashsize + self.elementsize * 2 self.format = 'q{}s{}s'.format(self.elementsize, self.elementsize) assert struct.calcsize(self.format) == self.entrysize self.zero = b'\0' * self.elementsize self.store = bytearray(struct.pack(self.format, 0, self.zero, self.zero) ) * self.size def hash(self, k): return hash(k) or 1 def stash(self, i, h, k, v): entry = struct.pack(self.format, h, k, v) self.store[i*self.entrysize:(i+1)*self.entrysize] = entry def fetch(self, i): entry = self.store[i*self.entrysize:(i+1)*self.entrysize] return struct.unpack(self.format, entry) def probe(self, keyhash): i = keyhash % self.size while True: h, k, v = self.fetch(i) yield i, h, k, v i = (i + 1) % self.size if i == keyhash % self.size: break 

As the error message says, you need to provide implementations of the abstract methods __delitem__ , __getitem__ , __iter__ , __len__ and __setitem__ . However, it is better to find documents that will tell you that if you implement these five methods (plus any other methods required by the base classes, but as you can see from the table, they are not), you will get all other methods for free. You cannot get the most effective possible implementations of all of them, but we will return to this.

First, let's talk about __len__ . Usually people expect this to be O (1), which means we need to track it independently, updating it as needed. So:

 class FixedDict(collections.abc.MutableMapping): def __init__(self, elementsize, size): self.hashtable = FixedHashTable(elementsize, size) self.len = 0 

Now __getitem__ just checks until it finds the right key or reaches the end:

  def __getitem__(self, key): keyhash = self.hashtable.hash(key) for i, h, k, v in self.hashtable.probe(keyhash): if h and k == key: return v 

And __delitem__ does the same, except that it frees the slot if it is found and updates len .

  def __delitem__(self, key): keyhash = self.hashtable.hash(key) for i, h, k, v in self.hashtable.probe(keyhash): if h and k == key: self.hashtable.stash(i, 0, self.hashtable.zero, self.hashtable.zero) self.len -= 1 return raise KeyError(key) 

__setitem__ little more complicated - if it is found, we must replace the value in the slot; if not, we must fill in the empty slot. And here we have to deal with the fact that the hash table can be complete. And, of course, we have to take care of len :

  def __setitem__(self, key, value): keyhash = self.hashtable.hash(key) for i, h, k, v in self.hashtable.probe(keyhash): if not h or k == key: if not h: self.len += 1 self.hashtable.stash(i, keyhash, key, value) return raise ValueError('hash table full') 

And that leaves __iter__ . As in the case of dict , we do not have a specific order, so we can just iterate over the hash table slots and display all non-empty ones:

 def __iter__(self): return (k for (h, k, v) in self.hashtable.fetch(i) for i in range(self.hashtable.size) if h) 

While we are doing this, we could write __repr__ . Note that we can use the fact that we get items for free:

 def __repr__(self): return '{}({})'.format(type(self).__name__, dict(self.items())) 

However, note that by default, items simply creates an ItemsView(self) , and if you track this through the source, you will have it iterate through self and look at each value. You obviously can do better if performance matters:

 def items(self): pairs = ((k, v) for (h, k, v) in self.hashtable.fetch(i) for i in range(self.hashtable.size) if h) return collections.abc.ItemsView._from_iterable(pairs) 

And similarly for values and possibly other methods.

+1
source

The whole idea of ​​the abstract base class is that it has some members (pure virtual members in C ++ terms) that your code should provide - in C ++ these are Pure Virtual members and other virtual members that you can .

Python differs from C ++ in that all members of all classes are virtual and can be overridden (and you can add members to all classes and instances), but there are some required missing classes in base classes of Basic, this is the equivalent of C ++ pure virtual .

Once you get this, you just have to give absent members the opportunity to instantiate your derived class.

For an example of what you are trying to do, see the accepted answer here , but instead of using the dict in the class, you will have to provide the methods that it provides to itself.

-1
source

With MutableMapping as the base class, you must create these methods in your class __delitem__, __getitem__, __iter__, __len__, __setitem__ : __delitem__, __getitem__, __iter__, __len__, __setitem__ .

To create your own dict class, you can get it from dict:

 >>> class D(dict): ... pass ... >>> d = D() >>> d {} 
-2
source

All Articles