How to install english.pickle for nltk on a standalone Linux machine

I am trying to run nltk on a SUSE Linux box that cannot be connected to the Internet.

I have successfully installed nltk and it starts, but when I run

>>> tagged = nltk.pos_tag(tokens) 

I get this error:

LookupError:
**********************************************.... *********************
Resource 'tokenizers/punkt/english.pickle' not found. Please use the NLTK Downloader to obtain the resource:

I cannot use the downloader since I cannot connect the machine to the Internet.

Does anyone know how I can install the necessary packages?

3 answers

The data is downloaded to the nltk_data directory. Its location differs from one system to another, but you can find it by running the following:

>>> import nltk
>>> print(nltk.data.find('.'))

english.pickle should be in the subfolder <nltk_data>/tokenizers/punkt/, matching the resource path in the error message. The easiest way to add it is to run the NLTK downloader on a computer with Internet access, then copy the file and place it in the same subfolder on the offline machine. There is only one version of english.pickle, so you can download it on a Windows machine without any problems.
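After copying the file, it can help to confirm that it landed somewhere NLTK will actually look. The following is a minimal sketch, not part of NLTK itself: `find_resource` is a hypothetical helper name, and it assumes the resource path `tokenizers/punkt/english.pickle` taken from the error message in the question.

```python
import os

def find_resource(search_dirs, relative_path):
    """Return the first existing copy of relative_path under search_dirs, or None."""
    for base in search_dirs:
        # Build the path with the local separator from the '/'-separated resource name.
        candidate = os.path.join(base, *relative_path.split("/"))
        if os.path.exists(candidate):
            return candidate
    return None

# Usage on the offline machine (requires nltk to be installed):
#   import nltk
#   print(find_resource(nltk.data.path, "tokenizers/punkt/english.pickle"))
```

`nltk.data.path` is the list of directories NLTK searches, so a result of `None` suggests the file was copied to a directory that is not on that list.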


The downloader stores the files in a specific folder. I assume that you can download them on an online machine and copy the files to the equivalent location on the standalone computer. On my machine, it downloads to /usr/local/lib/nltk_data .
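If the whole nltk_data folder has been carried over from the online machine, the copy step could look roughly like this. It is only a sketch: `copy_nltk_data` is a hypothetical helper name, and the USB mount point in the usage comment is an example, not something from the answer.

```python
import shutil

def copy_nltk_data(source_dir, target_dir):
    """Copy a whole nltk_data tree, merging into target_dir if it already exists."""
    # dirs_exist_ok requires Python 3.8 or newer.
    return shutil.copytree(source_dir, target_dir, dirs_exist_ok=True)

# Hypothetical usage; both paths are examples only:
#   copy_nltk_data("/media/usb/nltk_data", "/usr/local/lib/nltk_data")
```

If you cannot write to the system-wide location, NLTK also searches any directories listed in the NLTK_DATA environment variable, so the target can be a folder you own.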


For reference purposes (as of 2017), punkt tokenizers are located at this link on GitHub:

https://github.com/nltk/nltk_data/blob/gh-pages/packages/tokenizers/punkt.zip

You can download it on a machine with Internet access and transfer it to the offline machine on a USB flash drive.
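Once punkt.zip is on the offline machine, it still has to be unpacked where NLTK expects it. Here is a minimal sketch using only the standard library. `install_punkt` is a hypothetical helper name, and the code assumes the archive contains a top-level `punkt/` folder, so that extracting into `<nltk_data>/tokenizers` yields `tokenizers/punkt/english.pickle`.

```python
import os
import zipfile

def install_punkt(zip_path, nltk_data_dir):
    """Extract punkt.zip into <nltk_data_dir>/tokenizers and return the expected pickle path."""
    tokenizers_dir = os.path.join(nltk_data_dir, "tokenizers")
    os.makedirs(tokenizers_dir, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        # Assumption: the archive has a top-level 'punkt/' directory.
        archive.extractall(tokenizers_dir)
    return os.path.join(tokenizers_dir, "punkt", "english.pickle")

# Hypothetical usage; both paths are examples only:
#   install_punkt("/media/usb/punkt.zip", "/usr/local/lib/nltk_data")
```

The function only returns the path where english.pickle is expected; checking that the file exists afterwards is a quick way to confirm the archive had the assumed layout.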
