Python - Fastest way to find average over dict with every change?

I am trying to find the fastest / most efficient way to extract the average from a dict. The task I'm working on requires her to do this thousands of times, so simply repeating all the values โ€‹โ€‹in the dict each time I find the average would be completely ineffective. Hundreds and hundreds of new keys, pairs of values โ€‹โ€‹are added to the dict, and we need to find the average value each time. We also need to find a new average value each time the value is updated, which happens thousands of times.

Thanks in advance - this is such a wonderful place.

+4
source share
3 answers

Create your own subclass of dict, which tracks the quantity and total, and then can quickly return the average:

class AvgDict(dict): def __init__(self): self._total = 0.0 self._count = 0 def __setitem__(self, k, v): if k in self: self._total -= self[k] self._count -= 1 dict.__setitem__(self, k, v) self._total += v self._count += 1 def __delitem__(self, k): v = self[k] dict.__delitem__(self, k) self._total -= v self._count -= 1 def average(self): if self._count: return self._total/self._count a = AvgDict() assert a.average() is None a[1] = 1 assert a.average() == 1 a[2] = 10 assert a.average() == 5.5 assert a[2] == 10 a[1] = 5 assert a.average() == 7.5 del a[1] assert a.average() == 10 
+11
source

The average is below, so if you know the previous average:

 At = (A0 * N + E) / (N + 1) At is the average after addition of the new element A0 is the average before addition of the new element N is the number of element before addition of the new element E is the new element value 

Its simple brother works if you keep the tab of the sum of elements:

 At = (T + E) / (N + 1) T is the total of all elements A0 is the average before addition of the new element N is the number of element before addition of the new element E is the new element value 

When the value is deleted, you can do a similar thing:

 At = (A0 * N - E) / (N - 1) 

And when the value is updated:

 At = (A0 * N - E0 + E1) / (N) E0 is value before updating, E1 is value after updating. 
+2
source

Inherit from dict and calculate the average value each time __setitem__ is called.

Since you can save the previous average value in your dictionary class and only average this and the added new value, this should be pretty fast - when you first add a new element, the average value is just the value of this value.

+1
source

All Articles