Finding the best k biggest keys in a python dictionary

Question

Finding the best k biggest keys in a python dictionary

Say I have a dictionary:

{key1:value1........... keyn:valuen}

So let's say I want to write a function

 def return_top_k(dictionary, k): return list_of_keys_sorted

What is the most efficient way (from the point of view of large O) to get keys that have upper values of k (maintaining order, i.e. the oldest key of value is present at the beginning .., etc.)

+6

python

Fraz Sep 04 '12 at 15:22

source share

5 answers

 return sorted(dictionary, key=dictionary.get, reverse=True)[:10]

In the worst case, O(NlogN) (although the heapq suggested by others is probably better) ...

It might make sense to use Counter instead of a regular dictionary. In this case, the most_common method will do (approximately) what you want ( dictionary.most_common(10) ), but only if it makes sense to use Counter in your API.

+4

mgilson Sep 04 '12 at 15:24

source share

For the top 3 step by step:

 >>> from operator import itemgetter >>> dct = {"a": 1, "b": 2, "c": 3, "d": 4, "e": 5} >>> sorted(dct.items(), key=itemgetter(1), reverse=True) [('e', 5), ('d', 4), ('c', 3), ('b', 2), ('a', 1)] >>> map(itemgetter(0), sorted(dct.items(), key=itemgetter(1), reverse=True)) ['e', 'd', 'c', 'b', 'a'] >>> map(itemgetter(0), sorted(dct.items(), key=itemgetter(1), reverse=True))[:3] ['e', 'd', 'c']

Or using the heapq module

 >>> import heapq >>> heapq.nlargest(3, dct.items(), key=itemgetter(1)) [('e', 5), ('d', 4), ('c', 3)] >>> map(itemgetter(0), _) ['e', 'd', 'c']

+1

Alexey Kachayev Sep 04 '12 at 15:28

source share

In code

 dct = {"a": 1, "b": 2, "c": 3, "d": 4, "e": 5} k = 3 print sorted(dct.keys(), reverse=True)[:k]

If you also need values:

 print sorted(dct.items(), reverse=True)[:k]

Or if you want to use OrderedDict :

 from collections import OrderedDict d = OrderedDict(sorted(dct.items(), reverse=True)) print d.keys()[:k]

+1

Saksham varma Apr 14 '15 at 1:02

source share

 portfolio = [ {'name': 'IBM', 'shares': 100, 'price': 91.1}, {'name': 'AAPL', 'shares': 50, 'price': 543.22}, {'name': 'FB', 'shares': 200, 'price': 21.09}, {'name': 'HPQ', 'shares': 35, 'price': 31.75}, {'name': 'YHOO', 'shares': 45, 'price': 16.35}, {'name': 'ACME', 'shares': 75, 'price': 115.65} ] cheap = heapq.nsmallest(3, portfolio, key=lambda s: s['price']) expensive = heapq.nlargest(3, portfolio, key=lambda s: s['price'])

+1

Bala May 27 '16 at 8:33

source share

jfs · Accepted Answer · 2012-09-04T15:27:09+0000

O(n log k) :

 import heapq k_keys_sorted = heapq.nlargest(k, dictionary)

You can use the keyword keyword parameter to specify what should be used as the sort key, for example:

 k_keys_sorted_by_values = heapq.nlargest(k, dictionary, key=dictionary.get)

Finding the best k biggest keys in a python dictionary

More articles: