(Can't find) Python Counter most_common()

Newbie here. I'm working through the NLTK book and another Python intro book. I came across most_common() earlier in the NLTK book, and although I couldn't get it to work or find a solution at the time, I wrote a little function that did the trick for that particular exercise and kept going. Now I need it again, and I don't think I can work around it as easily (the exercise deals with the most common word lengths in a particular text). I also know it will come up again in later examples, and I would like to be able to follow along because, as I said, I am a beginner.

In theory, I should be able to do this:

fdist = FreqDist(len(w) for w in text1)
fdist.most_common()
[(3, 50223), (1, 47933), (4, 42345), (2, 38513) ...

However, Python tells me the following:

 AttributeError: 'FreqDist' object has no attribute 'most_common' 

I found that most_common() is a method of Counter objects ( http://docs.python.org/2/library/collections.html and http://docs.python.org/dev/library/collections#collections.Counter ). I understand that I may need to import something (a module?), but what I tried to import either does not work ("not defined" or "does not exist" messages) or does not contain it. I tried

 import collections 

There is no error, but most_common() is not listed when I enter dir(collections) or dir(built-in).

I have both 2.7 and 3.0 installed (on Windows in most cases; sometimes I work on my Ubuntu virtual machine). I will keep searching, but I would be very grateful for your input. It sounds like a basic question, but I'm learning and can't figure it out, at least for now. Again, thank you very much.

2 answers

In your version of NLTK, nltk.probability.FreqDist is not a collections.Counter.

Use the items method to get a list of items in sorted order (most frequent first).

 >>> from nltk.probability import FreqDist
 >>> dist = FreqDist([1, 2, 1, 2, 1])
 >>> dist.items()
 [(1, 3), (2, 2)]

Or just use collections.Counter:

 >>> from collections import Counter
 >>> c = Counter([1, 2, 1, 2, 1])
 >>> c.most_common()
 [(1, 3), (2, 2)]
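
For the word-length exercise in the question, a minimal sketch with collections.Counter might look like the following. The short `words` list is a made-up stand-in for text1, not the real corpus, so the counts will differ from the book's.

```python
from collections import Counter

# Hypothetical stand-in for text1 (any iterable of word strings works).
words = ["the", "whale", "is", "a", "big", "fish", "in", "the", "sea"]

# Count word lengths rather than the words themselves.
length_counts = Counter(len(w) for w in words)

# most_common(n) returns the n most frequent (length, count) pairs.
top = length_counts.most_common(2)
print(top)  # [(3, 4), (2, 2)]
```

Calling most_common() with no argument returns every (length, count) pair, ordered from most to least frequent.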

Some older versions of nltk do not have the most_common method on FreqDist. You can check by typing dir(fdist).

If it is not listed, upgrade nltk with pip as follows:

sudo pip install -U nltk

After upgrading, fdist.most_common() should work.
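
If upgrading is not an option, one possible workaround is a small helper that uses most_common when it exists and otherwise sorts the (item, count) pairs itself. This is only a sketch: `most_common_compat` is a hypothetical name, not part of nltk, and it assumes the object offers a dict-style items() method. A Counter and a plain dict stand in for FreqDist here so the example runs without nltk installed.

```python
from collections import Counter

def most_common_compat(dist, n=None):
    """Return (item, count) pairs, most frequent first."""
    # Prefer the built-in method when the object provides it.
    if hasattr(dist, "most_common"):
        return dist.most_common(n)
    # Fallback: sort the (item, count) pairs by count, descending.
    pairs = sorted(dist.items(), key=lambda pair: pair[1], reverse=True)
    return pairs if n is None else pairs[:n]

# A Counter has most_common, so the first branch is taken.
print(most_common_compat(Counter("abracadabra"), 2))

# A plain dict has no most_common, so the fallback sort is used.
print(most_common_compat({"x": 1, "y": 5, "z": 3}, 2))
```

The hasattr check is the programmatic version of looking for most_common in the output of dir(fdist).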
