Make a list with the most frequent dictionary tuple in which the first

Question

I am trying to build a list that contains, for each distinct first element, the tuple key with the highest count in a dictionary. For example, if d is my dictionary:

d = {('Hello', 'my'): 1, ('Hello', 'world'): 2, ('my', 'name'): 3, ('my', 'house'): 1}

I want to get a list like this:

L = [('Hello', 'world'), ('my', 'name')]

So I tried this:

L = [k for k,val in d.iteritems() if val == max(d.values())]

But this gives me only the key with the overall maximum value, not one per first word:

L = [('my', 'name')]

I thought that maybe I need to go through the dictionary and make a new one for every first word of each tuple, and then find the most frequent one and include it in the list, but it's hard for me to translate this into code.

+4

python dictionary list tuples n-gram

cotita May 04 '16 at 21:55

4 answers

from itertools import groupby

# your input data
d = {('Hello', 'my'): 1,('Hello', 'world'):2, ('my', 'name'):3, ('my','house'):1}

key_fu = lambda x: x[0][0]  # first element of first element,
                            # i.e. of ((a,b), c), return a

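# groupby only merges adjacent items, so sort by the same key first
groups = groupby(sorted(d.items(), key=key_fu), key_fu)
# within each group, keep the key tuple of the item with the highest count
l = [max(g, key=lambda x: x[1])[0] for _, g in groups]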

+3

das-g May 04 '16 at 22:42

-, max d, , , . (, ), , - . , , .

0

LeoCella May 04 '16 at 22:11

First, restructure the data as a list of 3-tuples:

d = [
  ('Hello', 'my', 1),
  ('Hello', 'world', 2), 
  ('my', 'name', 3),
  ('my', 'house', 1)
]

For each word in the first position, you want to find the word that follows it most often in the second position. Sort the data by the first word (any order works, since this is only to group them) and then by count, descending.

d.sort(lambda t1,t2: cmp(t2[2],t1[2]) if (t1[0]==t2[0]) else cmp(t1[0],t2[0]))

Finally, iterate through the sorted list, keeping track of the last first-position word seen, and append a pair only when a new word appears in the first position.

L = []
last_word = ""
for word1, word2, count in d:
    if word1 != last_word:
        L.append((word1, word2))
        last_word = word1

print L

By running this code, you will get [('Hello', 'world'), ('my', 'name')].
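
The snippets above rely on Python 2 features (a cmp-style comparison function passed to sort, and the print statement). A minimal sketch of the same sort-then-scan idea for Python 3 might look like the following. The names triples, ordered and result are chosen here for illustration, and the sort assumes the counts are numbers so that they can be negated.

```python
# Same sample data as above, as (first_word, second_word, count) triples.
triples = [
    ('Hello', 'my', 1),
    ('Hello', 'world', 2),
    ('my', 'name', 3),
    ('my', 'house', 1),
]

# Sort by first word, then by count descending. Negating the count lets a
# single ascending sort put the largest count first within each group.
ordered = sorted(triples, key=lambda t: (t[0], -t[2]))

result = []
last_word = None  # None cannot collide with a real word, unlike ""
for word1, word2, count in ordered:
    if word1 != last_word:
        # First row of a new group, i.e. the highest count for word1.
        result.append((word1, word2))
        last_word = word1

print(result)  # [('Hello', 'world'), ('my', 'name')]
```

Because of the sort this runs in O(n log n) rather than O(n).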

0

Sci prog May 04 '16 at 23:12

wim · Accepted Answer · 2016-05-04T22:19:01+0000

This can be done in O(n) by building a new mapping keyed on the first word:

>>> d = {('Hello','my'): 1, ('Hello','world'): 2, ('my','name'): 3, ('my','house'): 1}
>>> d_max = {}
>>> for (first, second), count in d.items():
...     if count >= d_max.get(first, (None, 0))[1]:
...         d_max[first] = (second, count)
...         
>>> d_max
{'Hello': ('world', 2), 'my': ('name', 3)}
>>> output = [(first, second) for (first, (second, count)) in d_max.items()]
>>> output
[('my', 'name'), ('Hello', 'world')]
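
For reuse, the same single-pass idea could be wrapped in a function. This is only a sketch for Python 3: the name most_frequent_by_first is made up here, and it differs slightly from the session above. It uses a membership test instead of the (None, 0) default, so it does not assume counts are non-negative, and it uses a strict > so that ties keep the first pair seen rather than the last.

```python
def most_frequent_by_first(pair_counts):
    """Return one (first, second) pair per first word: the one with the top count.

    pair_counts maps (first, second) tuples to counts.
    """
    best = {}  # first word -> (second word, count) seen so far
    for (first, second), count in pair_counts.items():
        if first not in best or count > best[first][1]:
            best[first] = (second, count)
    return [(first, second) for first, (second, _) in best.items()]


d = {('Hello', 'my'): 1, ('Hello', 'world'): 2, ('my', 'name'): 3, ('my', 'house'): 1}
output = most_frequent_by_first(d)
print(sorted(output))  # [('Hello', 'world'), ('my', 'name')]
```

The order of the returned list follows dictionary iteration order, which is arbitrary in older Python versions (hence the ordering shown in the session above), so sort the result if a stable order matters.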
