Use NLTK without installation

Learning Python with the Natural Language Toolkit has been a lot of fun, and it works great on my local machine, although I had to install several packages to use it. Exactly how the NLTK resources are now integrated into my system remains a mystery to me, although it seems clear that the NLTK source code does not simply sit somewhere the Python interpreter knows to look.

I would like to use the Toolkit on my website, which is hosted by another company. Simply uploading the NLTK source files to my server and having the scripts in the root directory run "import nltk" did not work, which I half expected.

What is the difference between what the NLTK installation procedure does and a direct import, and why is the toolkit unavailable to a simple import? Is there a way to use the NLTK source files without significantly changing my host's Python setup?

Thank you very much for your thoughts and notes. -G

+4
3 answers

Suppose you have the NLTK source located in /some/dir/, so that:

    dhg /some/dir/$ ls nltk
    ... app book.py ccg chat chunk classify ...

You can start the Python interpreter from the directory that contains the nltk source directory:

    dhg /some/dir/$ python
    Python 2.7.1 (r271:86882M, Nov 30 2010, 10:35:34)
    >>> import nltk

Or you can add its location to the PYTHONPATH environment variable, which makes NLTK available from anywhere:

    dhg /whatever/$ export PYTHONPATH="$PYTHONPATH:/some/dir/"
    dhg /whatever/$ python
    Python 2.7.1 (r271:86882M, Nov 30 2010, 10:35:34)
    >>> import nltk

Any other dependencies, including those that NLTK itself depends on, can be added to PYTHONPATH in the same way.
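If the host does not let you set environment variables, the same effect can usually be had from inside the script by extending sys.path before the import. A minimal sketch, assuming Python 3 and that /some/dir/ is the directory containing the nltk folder; the helper name add_source_dir is made up for illustration:

```python
import os
import sys


def add_source_dir(directory):
    """Put `directory` at the front of sys.path unless it is already listed."""
    directory = os.path.abspath(directory)
    if directory not in sys.path:
        sys.path.insert(0, directory)
    return directory


# /some/dir/ is the example location used above; adjust it to wherever
# the nltk folder actually lives on the server.
add_source_dir("/some/dir/")

# After this, `import nltk` searches /some/dir/ first. It will still
# fail if nltk's own dependencies cannot be found.
```

The change only lasts for the running process, so each script that needs NLTK has to do this before its first import.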

+1

Not only do you need NLTK on PYTHONPATH (as @dhg points out), you also need all of its dependencies; a quick local test suggests that this is really only PyYAML. You should just use pip to install the packages. This is much less error-prone than trying to work out all the dependencies by hand and configure PYTHONPATH accordingly. If this is a shared host where you do not have the access needed to run pip install, you should ask the host to do it for you.
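If you do end up managing the path by hand, it helps to check which modules the interpreter can actually find before relying on them. A rough sketch, assuming Python 3; missing_modules is a hypothetical helper, and the list of names is only the dependency mentioned above, not a verified full list:

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# PyYAML is imported under the name "yaml".
print(missing_modules(["nltk", "yaml"]))
```

An empty list only means the modules can be located; an import may still fail later if one of them needs something that is missing.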

To address the more general "what does the installation procedure actually do" part of your question: most Python packages are managed using setup.py, which is built on top of distutils (and sometimes setuptools). If this really interests you, check out The Hitchhiker's Guide to Packaging.
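Whatever the build machinery, the visible result of an install is typically that the package ends up in a directory the interpreter already searches. A small sketch, assuming Python 3, that prints this interpreter's default install directory for pure-Python packages next to its import search path:

```python
import sys
import sysconfig

# Directory where pure-Python packages are installed for this interpreter.
site_packages = sysconfig.get_paths()["purelib"]
print("install target:", site_packages)

# Directories searched by `import`, in order.
for entry in sys.path:
    print("search path:", entry)
```

Files that were merely uploaded somewhere else on the server are not in any of these directories, which is why a plain import does not see them.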

+1

You do not need system-wide installation support, just the right modules in a place where Python can find them. I installed NLTK without system install permissions and had only relatively minor problems, but I had command-line access, so I could see what I was doing.

To get this to work, first put together a local installation on a computer you control, ideally one on which NLTK has never been installed, since you may have forgotten (or never known) what was configured for you. Once you have worked out what you need, copy the package to the hosting computer. At that point, make sure you are using module versions that are appropriate for the web server's architecture. In particular, NumPy has different builds for 32-bit and 64-bit systems, IIRC.
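One way to compare the two machines is to run the same short script on both and look for differences. A sketch, assuming Python 3 on both sides; interpreter_summary is an illustrative name, and the fields are a reasonable starting set rather than a complete compatibility check:

```python
import platform
import struct
import sys


def interpreter_summary():
    """Collect details that affect whether a compiled module will load."""
    return {
        "python_version": platform.python_version(),
        "pointer_bits": struct.calcsize("P") * 8,  # 32 or 64
        "machine": platform.machine(),
        "platform": sys.platform,
    }


print(interpreter_summary())
```

If the pointer size or the Python version differs between your machine and the host, compiled packages copied from one to the other are unlikely to import.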

It is also worth working out how to see error messages from the host computer. If you do not see them by default, you can catch ImportError and display the message it contains, or you can redirect stderr; it depends on your configuration.
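Catching the ImportError might look like the following. This is a sketch assuming Python 3; try_import is a hypothetical helper, and where the message should go (the page, a log file) depends on your host:

```python
import importlib
import traceback


def try_import(name):
    """Return (module, None) on success or (None, traceback text) on failure."""
    try:
        return importlib.import_module(name), None
    except ImportError:
        return None, traceback.format_exc()


module, error = try_import("nltk")
if error is not None:
    # On a web host, send this to the page or to a log file you can read.
    print(error)
```

The traceback names the module that could not be found, which is often a dependency rather than nltk itself.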

+1
