Ordered dictionary of ordered dictionaries in Python

I need a dictionary data structure in which dictionaries are stored, as shown below:

custom = {1: {'a': np.zeros(10), 'b': np.zeros(100)}, 2: {'c': np.zeros(20), 'd': np.zeros(200)}} 

But the problem is that I iterate over this data structure repeatedly in my code. Every time I iterate over it, I need the iteration order to be respected, because all the elements in this complex data structure are mapped to a 1D array (serialized, if you will), and therefore the order is important. I was thinking of writing an ordered dict of ordered dicts, but I'm not sure if this is the right solution, as it seems that I may be choosing the wrong data structure. What would be the most suitable solution for my case?

UPDATE

So this is what I came up with:

    class Test(list):
        def __init__(self, *args, **kwargs):
            super(Test, self).__init__(*args, **kwargs)
            for k, v in args[0].items():
                self[k] = OrderedDict(v)
            self.d = -1
            self.iterator = iter(self[-1].keys())
            self.etype = next(self.iterator)
            self.idx = 0

        def __iter__(self):
            return self

        def __next__(self):
            try:
                self.idx += 1
                return self[self.d][self.etype][self.idx-1]
            except IndexError:
                self.etype = next(self.iterator)
                self.idx = 0
                return self[self.d][self.etype][self.idx-1]

        def __call__(self, d):
            self.d = -1 - d
            self.iterator = iter(self[self.d].keys())
            self.etype = next(self.iterator)
            self.idx = 0
            return self

    def main(argv=()):
        tst = Test(elements)
        for el in tst:
            print(el)
        # loop over a lower dimension
        for el in tst(-2):
            print(el)
        print(tst)
        return 0

    if __name__ == "__main__":
        sys.exit(main())

I can iterate over this ordered structure as many times as I want, and I implemented __call__ so that I can iterate over the lower dimensions. I don't like the fact that if a lower dimension is not present in the list, I don't get any error. I also have the feeling that every call to return self[self.d][self.etype][self.idx-1] is less efficient than the original iteration over the dictionary. Is this true? How can I improve this?

+6
4 answers

I think using OrderedDicts is the best way. They are part of the standard library (in the collections module) and relatively fast:

    from collections import OrderedDict

    import numpy as np

    custom = OrderedDict([
        (1, OrderedDict([('a', np.zeros(10)), ('b', np.zeros(100))])),
        (2, OrderedDict([('c', np.zeros(20)), ('d', np.zeros(200))])),
    ])
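
As a side note to the construction above: on Python 3.7 and later, a plain dict also preserves insertion order, so the same nesting can be written without OrderedDict. The sketch below is only an illustration; plain lists stand in for the NumPy arrays (an assumption made to keep it dependency-free), and the variable names are hypothetical.

```python
from collections import OrderedDict

# Plain lists stand in for the NumPy arrays of the question (assumption made
# only to keep this sketch dependency-free).
plain = {1: {'a': [0] * 2, 'b': [0] * 3}, 2: {'c': [0] * 4}}
ordered = OrderedDict(
    (key, OrderedDict(inner)) for key, inner in plain.items()
)

# Both variants iterate in insertion order (guaranteed for dict since 3.7).
plain_order = [(k1, k2) for k1, inner in plain.items() for k2 in inner]
ordered_order = [(k1, k2) for k1, inner in ordered.items() for k2 in inner]

# One difference: comparing two OrderedDicts is order-sensitive,
# while comparing two plain dicts is not.
same_as_dict = {'a': 1, 'b': 2} == {'b': 2, 'a': 1}
same_as_ordered = OrderedDict(a=1, b=2) == OrderedDict(b=2, a=1)
```

If the code has to run on interpreters older than 3.7, OrderedDict remains the safe choice.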

If you want to simplify iterating over the contents of your data structure, you can always provide a utility function for it:

    def iter_over_contents(data_structure):
        for delem in data_structure.values():
            for v in delem.values():
                for row in v:
                    yield row

Note that in Python 3.3+, which supports yield from <expression>, you can drop the innermost for loop:

    def iter_over_contents(data_structure):
        for delem in data_structure.values():
            for v in delem.values():
                yield from v

With either version, you can write something like:

    for elem in iter_over_contents(custom):
        print(elem)

and hide the complexity.
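
Since the question maps all elements onto a single 1D array, the generator can be paired with a table of offsets so that each inner entry can be found again in the flat result. The following is a minimal sketch under stated assumptions: build_offsets is a hypothetical helper that does not appear in the answer, and plain lists stand in for the NumPy arrays.

```python
from collections import OrderedDict

def iter_over_contents(data_structure):
    """Yield every element of every inner value, in insertion order."""
    for inner in data_structure.values():
        for values in inner.values():
            yield from values

def build_offsets(data_structure):
    """Map each (outer_key, inner_key) pair to its slice in the flat sequence.

    Hypothetical helper, added here for illustration only.
    """
    offsets = OrderedDict()
    start = 0
    for outer_key, inner in data_structure.items():
        for inner_key, values in inner.items():
            offsets[(outer_key, inner_key)] = slice(start, start + len(values))
            start += len(values)
    return offsets

# Plain lists stand in for the NumPy arrays of the question.
custom = OrderedDict([
    (1, OrderedDict([('a', [10, 11]), ('b', [20, 21, 22])])),
    (2, OrderedDict([('c', [30])])),
])

flat = list(iter_over_contents(custom))
offsets = build_offsets(custom)
```

Here flat[offsets[(1, 'b')]] recovers the elements that came from custom[1]['b'], because both functions walk the structure in the same order.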

While you could define your own class in an attempt to encapsulate this data structure and use something like the iter_over_contents() generator function as the __iter__() method, this approach will probably be slower and won't allow expressions using two indexing levels such as:

 custom[1]['b'] 

which nested dictionaries (or an OrderedDefaultdict, as shown in my other answer) do support.
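
One possible middle ground, sketched here as an assumption rather than something this answer proposes, is to subclass OrderedDict and add a separate method instead of overriding __iter__(), so that two-level indexing keeps working. The class name NestedOrdered and the method flat() are hypothetical, and plain lists stand in for the NumPy arrays.

```python
from collections import OrderedDict

class NestedOrdered(OrderedDict):
    """Hypothetical wrapper: an OrderedDict of OrderedDicts plus a flat view."""

    def __init__(self, mapping=()):
        super().__init__()
        # Convert every inner mapping so both levels are OrderedDicts.
        for key, inner in dict(mapping).items():
            self[key] = OrderedDict(inner)

    def flat(self):
        """Yield the elements of every inner value in insertion order."""
        for inner in self.values():
            for values in inner.values():
                yield from values

custom = NestedOrdered({1: {'a': [1, 2], 'b': [3]}, 2: {'c': [4, 5]}})
elems = list(custom.flat())   # flat traversal of all elements
indexed = custom[1]['b']      # two-level indexing still works
```

Because __iter__() is left untouched, iterating over custom still yields the outer keys, as with any dictionary.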

+2

Here's another alternative that uses an OrderedDefaultdict to define the tree-like data structure you need. I'm reusing its definition from another answer of mine.

To use it, you must make sure that the entries are defined in the order in which you want to access them later.

    from collections import OrderedDict

    import numpy as np

    class OrderedDefaultdict(OrderedDict):
        def __init__(self, *args, **kwargs):
            if not args:
                self.default_factory = None
            else:
                if not (args[0] is None or callable(args[0])):
                    raise TypeError('first argument must be callable or None')
                self.default_factory = args[0]
                args = args[1:]
            super(OrderedDefaultdict, self).__init__(*args, **kwargs)

        def __missing__(self, key):
            if self.default_factory is None:
                raise KeyError(key)
            self[key] = default = self.default_factory()
            return default

        def __reduce__(self):  # optional, for pickle support
            args = (self.default_factory,) if self.default_factory else ()
            # iter(self.items()) works on both Python 2 and Python 3
            return self.__class__, args, None, None, iter(self.items())

    Tree = lambda: OrderedDefaultdict(Tree)

    custom = Tree()
    custom[1]['a'] = np.zeros(10)
    custom[1]['b'] = np.zeros(100)
    custom[2]['c'] = np.zeros(20)
    custom[2]['d'] = np.zeros(200)

I'm not sure I understand your follow-up question. If the data structure is limited to two levels, you can use nested for loops to iterate over its elements in the order in which they were defined. For instance:

    for key1, subtree in custom.items():
        for key2, elem in subtree.items():
            print('custom[{!r}][{!r}]: {}'.format(key1, key2, elem))

(In Python 2, you would want to use iteritems() instead of items().)
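
If the tree is not limited to two levels, the nested loops could be replaced by a recursive generator. This is a minimal sketch, not part of the answer above: iter_leaves is a hypothetical name, it targets Python 3.3+, and it treats every non-mapping value as a leaf.

```python
from collections import OrderedDict
from collections.abc import Mapping

def iter_leaves(tree, path=()):
    """Yield (path, value) pairs for every non-mapping value, depth first."""
    for key, value in tree.items():
        if isinstance(value, Mapping):
            # Descend into the subtree, extending the key path.
            yield from iter_leaves(value, path + (key,))
        else:
            yield path + (key,), value

# Strings stand in for the arrays; the third level is only for illustration.
tree = OrderedDict([
    (1, OrderedDict([('a', 'A'), ('b', OrderedDict([('x', 'X')]))])),
    (2, OrderedDict([('c', 'C')])),
])
leaves = list(iter_leaves(tree))
```

The traversal only reads existing keys, so it should not trigger __missing__() and create new entries when used on a default-dict style tree.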

+2

Could you just use a list of dictionaries?

 custom = [{'a': np.zeros(10), 'b': np.zeros(100)}, {'c': np.zeros(20), 'd': np.zeros(200)}] 

This may work if the outer dictionary is the only one whose order you need to preserve. You can still access the inner dictionaries using custom[0] or custom[1] (careful: indexing now starts at 0).

If not all indexes are used, you can do the following:

    custom = [None] * maxLength  # maximum dict size you expect
    custom[1] = {'a': np.zeros(10), 'b': np.zeros(100)}
    custom[2] = {'c': np.zeros(20), 'd': np.zeros(200)}
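
When some slots stay unused, the iteration has to skip the None placeholders. The sketch below shows one way to do that; iter_used_slots and max_length are hypothetical names, plain lists stand in for the NumPy arrays, and the order of the inner keys relies on Python 3.7+ dict ordering (on older versions the inner dictionaries would need to be OrderedDicts).

```python
max_length = 4  # hypothetical upper bound on the number of slots

custom = [None] * max_length
custom[1] = {'a': [1, 2], 'b': [3]}
custom[2] = {'c': [4]}

def iter_used_slots(slots):
    """Yield (index, inner_dict) for every slot that has been filled in."""
    for index, inner in enumerate(slots):
        if inner is not None:
            yield index, inner

# Collect (outer index, inner key) pairs in iteration order.
visited = [
    (index, key)
    for index, inner in iter_used_slots(custom)
    for key in inner
]
```
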
+1

You can fix the order of your keys during iteration by sorting them first:

    for key in sorted(custom.keys()):
        print(key, custom[key])

If you want to reduce the number of sorted() calls, you may want to keep the keys in an extra list, which then serves as your iteration order:

    ordered_keys = sorted(custom.keys())
    for key in ordered_keys:
        print(key, custom[key])

You are then free to iterate over your data structure as many times as you need.
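
For the nested structure in the question, the same idea could be applied at both levels by caching (outer, inner) key pairs once. This is only a sketch: iter_sorted is a hypothetical helper, plain lists stand in for the NumPy arrays, and the resulting order is sorted key order, which may differ from the order in which the entries were inserted.

```python
# Keys are deliberately inserted out of order to show the effect of sorting.
custom = {2: {'d': [4], 'c': [3]}, 1: {'b': [2], 'a': [1]}}

# Sort once at each level and reuse the result for every later pass.
ordered_keys = [
    (outer, inner)
    for outer in sorted(custom)
    for inner in sorted(custom[outer])
]

def iter_sorted(data, keys):
    """Yield the elements of data[outer][inner] following the cached key order."""
    for outer, inner in keys:
        yield from data[outer][inner]

first_pass = list(iter_sorted(custom, ordered_keys))
second_pass = list(iter_sorted(custom, ordered_keys))
```

The cached list has to be rebuilt whenever keys are added or removed.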

0