Why does overriding __contains__ break OrderedDict.keys?

I will subclass OrderedDict (Cpython, 2.7.3) to represent a data file. __getitem__ pulls a field from the data file and installs it in the current instance, similar to the code below. now I would like to override __contains__ to return True if this field is in a dictionary or in a file on disk, as it can be read anyway. However, this seems to violate OrderedDict ability OrderedDict validate keys.

 from collections import OrderedDict dictclass = OrderedDict class Foo(dictclass): def __getitem__(self,key): try: return dictclass.__getitem__(self,key) except KeyError: pass data = key*2 self[key] = data return data def __contains__(self,whatever): return dictclass.__contains__(self,whatever) or 'bar' in whatever a = Foo() print a['bar'] print a.keys() 

If you run the code above, you will get this output:

 barbar [] 

Note that if you change dictclass = dict in the above code, it still works (giving the following output).

 barbar ['bar'] 

Am I doing something terribly wrong?

+6
source share
2 answers

If Foo.__contains__ is undefined :

 a['bar'] 

calls Foo.__getitem__ , which executes

  self[key] = data 

This calls OrderedDict.__setitem__ , which is defined as follows:

 def __setitem__(self, key, value, PREV=0, NEXT=1, dict_setitem=dict.__setitem__): 'od.__setitem__(i, y) <==> od[i]=y' # Setting a new item creates a new link at the end of the linked list, # and the inherited dictionary is updated with the new key/value pair. if key not in self: root = self.__root last = root[PREV] last[NEXT] = root[PREV] = self.__map[key] = [last, root, key] dict_setitem(self, key, value) 

Since Foo.__contains__ is undefined,

  if key not in self: 

- it's true. Thus, the key is correctly added to the self.__root and self.__map .

When Foo.__contains__ is determined ,

  if key not in self: 

if false. Thus, the key is not appended properly to self.__root and self.__map . Foo.__contains__ Effective fools OrderedDict.__setitem__ in the thought that the key 'bar' has already been added.


I found it useful to play with the following code (adding print statements to __setitem__ and __iter__ ):

 from collections import OrderedDict dictclass = OrderedDict class Foo(dictclass): def __getitem__(self,key): try: return dictclass.__getitem__(self,key) except KeyError: pass data = key*2 self[key] = data return data def __contains__(self,whatever): print('contains: {}'.format(whatever)) return dictclass.__contains__(self,whatever) or 'bar' in whatever def __setitem__(self, key, value, PREV=0, NEXT=1, dict_setitem=dict.__setitem__): 'od.__setitem__(i, y) <==> od[i]=y' # Setting a new item creates a new link at the end of the linked list, # and the inherited dictionary is updated with the new key/value pair. print('key not in self: {}'.format(key not in self)) if key not in self: root = self._OrderedDict__root last = root[PREV] last[NEXT] = root[PREV] = self._OrderedDict__map[key] = [last, root, key] dict_setitem(self, key, value) def __iter__(self): 'od.__iter__() <==> iter(od)' # Traverse the linked list in order. NEXT, KEY = 1, 2 root = self._OrderedDict__root curr = root[NEXT] print('curr: {}'.format(curr)) print('root: {}'.format(root)) print('curr is not root: {}'.format(curr is not root)) while curr is not root: yield curr[KEY] curr = curr[NEXT] a = Foo() print a['bar'] # barbar print a.keys() # ['bar'] 

Note that you can avoid this problem by making Foo subclass of collections.MutableMapping and delegating most of your behavior to the OrderedDict attribute:

 import collections dictclass = collections.OrderedDict class Foo(collections.MutableMapping): def __init__(self, *args, **kwargs): self._data = dictclass(*args, **kwargs) def __setitem__(self, key, value): self._data[key] = value def __delitem__(self, key): del self._data[key] def __iter__(self): return iter(self._data) def __len__(self): return len(self._data) def __getitem__(self,key): try: return self._data[key] except KeyError: pass data = key*2 self[key] = data return data def __contains__(self,whatever): return dictclass.__contains__(self,whatever) or 'bar' in whatever 

what gives

 a = Foo() print a['bar'] # barbar print a.keys() # ['bar'] 

even with __contains__ .

+6
source

What violates your code is or 'bar' in whatever . If you delete it, it will work the same as changing dictclass = dict .

The implementation of __setitem__ OrderedDict is as follows:

 def __setitem__(self, key, value, dict_setitem=dict.__setitem__): 'od.__setitem__(i, y) <==> od[i]=y' # Setting a new item creates a new link at the end of the linked list, # and the inherited dictionary is updated with the new key/value pair. if key not in self: root = self.__root last = root[0] last[1] = root[0] = self.__map[key] = [last, root, key] return dict_setitem(self, key, value) 

So, with self["bar"] = "barbar" condition should be False, but it is True even before the insertion of any element. Thus, the key is not added to self.__root , which is used in OrderedDict.__iter__ :

 def __iter__(self): 'od.__iter__() <==> iter(od)' # Traverse the linked list in order. root = self.__root curr = root[1] # start at the first node while curr is not root: yield curr[2] # yield the curr[KEY] curr = curr[1] # move to next node 

Since the code uses this iterator to extract values, and self.__root does not contain "bar" , this particular key cannot be returned to values.

+2
source

All Articles