NLTK - how to find out which packages are installed from python?

Question

I am trying to load some of the packages that I installed with the NLTK installer, but I got:

    >>> from nltk.corpus import machado
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ImportError: cannot import name machado

But in the download manager (nltk.download()), the machado package is marked as installed, and I have the nltk_data/corpus/machado folder.

How can I see, from inside Python, which packages are installed?

Also, what package should I install to work with this guide? http://nltk.googlecode.com/svn/trunk/doc/howto/portuguese_en.html

I cannot find the nltk.examples module that is referenced in the instructions.

+7

python nlp nltk corpus

Rafael S. Calsaverini Dec 14 '09 at 19:32

2 answers

NLTK includes the nltk.corpus package, which contains the definitions of corpus readers (for example, PlaintextCorpusReader). This package also includes a large list of predefined access points for corpora that can be downloaded using nltk.download(). These access points (for example, nltk.corpus.brown) are defined regardless of whether the corresponding corpus has been downloaded.

To see which access points are defined in NLTK, use dir(nltk.corpus) (after import nltk).
To see which corpora you actually have installed, try the following:
```
import os
import nltk

# List the contents of the nltk_data/corpora folder
print(os.listdir(nltk.data.find("corpora")))
```
This simply prints the contents of the nltk_data/corpora folder. You can take it from there.
If you installed your own corpus under nltk_data/corpora and NLTK does not know about it, you need to instantiate the appropriate reader yourself. For example, if it is plain text in corpora/mycorpus and all files end in .txt, you would do it like this:
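If NLTK keeps data in more than one directory, a slightly more general listing may help. The following is only a sketch using the standard library: `installed_corpora` is a hypothetical helper (not part of NLTK), and it assumes that each data directory has a `corpora` subfolder in which a corpus appears either as a folder or as a `.zip` archive.

```python
from pathlib import Path


def installed_corpora(data_dirs):
    """Return the sorted names found under <data_dir>/corpora.

    Hypothetical helper, not an NLTK function. `data_dirs` is any
    iterable of directory paths to search.
    """
    names = set()
    for data_dir in data_dirs:
        corpora = Path(data_dir) / "corpora"
        if not corpora.is_dir():
            # Skip search paths that do not exist on this machine
            continue
        for entry in corpora.iterdir():
            # Assumption: a corpus may be stored unpacked or as a .zip file
            names.add(entry.stem if entry.suffix == ".zip" else entry.name)
    return sorted(names)
```

With NLTK imported, passing the search path list, as in `installed_corpora(nltk.data.path)`, should give the names present on disk, which can then be compared with `dir(nltk.corpus)`.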
```
import nltk
from nltk.corpus import PlaintextCorpusReader

# Locate the corpus folder, then read every .txt file in it
mypath = nltk.data.find("corpora/mycorpus")
mycorpus = PlaintextCorpusReader(mypath, r".*\.txt$")
```
But in that case, you can place your own corpus anywhere and point mypath to it directly, instead of asking NLTK to find it.

+3

alexis Nov 19 '13 at 15:31

Hank gay · Accepted Answer · 2009-12-14T19:39:49+0000

Try

    import nltk.corpus
    dir(nltk.corpus)

At that point, it probably told you something about __LazyModule__..., so run dir(nltk.corpus) again.

If that does not work, try tab completion in IPython.
