Finding the best k biggest keys in a python dictionary

Say I have a dictionary:

{key1:value1........... keyn:valuen} 

So let's say I want to write a function

 def return_top_k(dictionary, k): return list_of_keys_sorted 

What is the most efficient way (from the point of view of large O) to get keys that have upper values โ€‹โ€‹of k (maintaining order, i.e. the oldest key of value is present at the beginning .., etc.)

+6
source share
5 answers

O(n log k) :

 import heapq k_keys_sorted = heapq.nlargest(k, dictionary) 

You can use the keyword keyword parameter to specify what should be used as the sort key, for example:

 k_keys_sorted_by_values = heapq.nlargest(k, dictionary, key=dictionary.get) 
+14
source
 return sorted(dictionary, key=dictionary.get, reverse=True)[:10] 

In the worst case, O(NlogN) (although the heapq suggested by others is probably better) ...

It might make sense to use Counter instead of a regular dictionary. In this case, the most_common method will do (approximately) what you want ( dictionary.most_common(10) ), but only if it makes sense to use Counter in your API.

+4
source

For the top 3 step by step:

 >>> from operator import itemgetter >>> dct = {"a": 1, "b": 2, "c": 3, "d": 4, "e": 5} >>> sorted(dct.items(), key=itemgetter(1), reverse=True) [('e', 5), ('d', 4), ('c', 3), ('b', 2), ('a', 1)] >>> map(itemgetter(0), sorted(dct.items(), key=itemgetter(1), reverse=True)) ['e', 'd', 'c', 'b', 'a'] >>> map(itemgetter(0), sorted(dct.items(), key=itemgetter(1), reverse=True))[:3] ['e', 'd', 'c'] 

Or using the heapq module

 >>> import heapq >>> heapq.nlargest(3, dct.items(), key=itemgetter(1)) [('e', 5), ('d', 4), ('c', 3)] >>> map(itemgetter(0), _) ['e', 'd', 'c'] 
+1
source

In code

 dct = {"a": 1, "b": 2, "c": 3, "d": 4, "e": 5} k = 3 print sorted(dct.keys(), reverse=True)[:k] 

If you also need values:

 print sorted(dct.items(), reverse=True)[:k] 

Or if you want to use OrderedDict :

 from collections import OrderedDict d = OrderedDict(sorted(dct.items(), reverse=True)) print d.keys()[:k] 
+1
source
 portfolio = [ {'name': 'IBM', 'shares': 100, 'price': 91.1}, {'name': 'AAPL', 'shares': 50, 'price': 543.22}, {'name': 'FB', 'shares': 200, 'price': 21.09}, {'name': 'HPQ', 'shares': 35, 'price': 31.75}, {'name': 'YHOO', 'shares': 45, 'price': 16.35}, {'name': 'ACME', 'shares': 75, 'price': 115.65} ] cheap = heapq.nsmallest(3, portfolio, key=lambda s: s['price']) expensive = heapq.nlargest(3, portfolio, key=lambda s: s['price']) 
+1
source

Source: https://habr.com/ru/post/924556/


All Articles