NLTK could not find the gs file

I am trying to use NLTK, the Natural Language Toolkit for Python. After installing the necessary files, I ran the demo code from http://www.nltk.org/index.html

    >>> import nltk
    >>> sentence = """At eight o'clock on Thursday morning
    ... Arthur didn't feel very good."""
    >>> tokens = nltk.word_tokenize(sentence)
    >>> tokens
    ['At', 'eight', "o'clock", 'on', 'Thursday', 'morning',
    'Arthur', 'did', "n't", 'feel', 'very', 'good', '.']
    >>> tagged = nltk.pos_tag(tokens)
    >>> tagged[0:6]
    [('At', 'IN'), ('eight', 'CD'), ("o'clock", 'JJ'), ('on', 'IN'),
    ('Thursday', 'NNP'), ('morning', 'NN')]
    >>> entities = nltk.chunk.ne_chunk(tagged)
    >>> entities

Then I get the message:

    LookupError:

    ===========================================================================
    NLTK was unable to find the gs file!
    Use software specific configuration paramaters or set the PATH environment variable.

I searched Google, but could not find anything that explains what the missing gs file is.

python nlp nltk
5 answers

I also ran into this error.

gs stands for Ghostscript. You get the error because displaying the chunker's output tries to use Ghostscript to draw a syntax tree of the sentence, something like this:

[Image: syntax tree of the sentence]

I was using IPython, so to debug the problem I switched the traceback mode to verbose with the %xmode verbose command, which prints the local variables of each frame in the stack (see the full trace below). The file names NLTK searches for are:

file_names=['gs', 'gswin32c.exe', 'gswin64c.exe']

A quick Google search for gswin32c.exe told me that it is Ghostscript.

    /Users/jasonwirth/anaconda/lib/python3.4/site-packages/nltk/__init__.py in find_file_iter(filename='gs', env_vars=['PATH'], searchpath=(), file_names=['gs', 'gswin32c.exe', 'gswin64c.exe'], url=None, verbose=False)
        517                         (filename, url))
        518         div = '='*75
    --> 519         raise LookupError('\n\n%s\n%s\n%s' % (div, msg, div))
        520
        521 def find_file(filename, env_vars=(), searchpath=(),

    LookupError:

    ===========================================================================
    NLTK was unable to find the gs file!
    Use software specific configuration paramaters or set the PATH environment variable.
    ===========================================================================
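For what it's worth, the lookup can be approximated with the standard library, which is handy for checking whether Ghostscript is visible before NLTK complains. This is only a sketch and not NLTK's actual code: it assumes that trying each name from file_names against PATH with shutil.which is close enough to what find_file_iter does, and find_ghostscript is a made-up helper name.

```python
import shutil

# Executable names taken from the traceback above.
FILE_NAMES = ["gs", "gswin32c.exe", "gswin64c.exe"]


def find_ghostscript(search_path=None):
    """Return the full path of the first Ghostscript binary found, or None.

    search_path is an os.pathsep-separated string of directories;
    None means "use the current PATH environment variable".
    Hypothetical helper for illustration, not part of NLTK.
    """
    for name in FILE_NAMES:
        found = shutil.which(name, path=search_path)
        if found:
            return found
    return None
```

If find_ghostscript() returns None, NLTK will most likely raise the same LookupError when it tries to draw the tree.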

To add to Jason Wirth's answer: on Windows, NLTK looks for "gswin64c.exe" (or "gswin32c.exe") in the directories listed in the PATH environment variable. However, the Ghostscript installer does not add its binaries to PATH, so for this to work you need to find where Ghostscript is installed and add its bin subfolder to PATH.

For example, in my case, I added C:\Program Files\gs\gs9.19\bin to PATH.
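If you would rather not change the system settings, PATH can also be extended for the current Python process only. A minimal sketch, assuming NLTK reads PATH from os.environ at the moment it searches (I believe it does, but check your version); prepend_to_path is a made-up helper name, and the directory in the usage note is just the one from the example above.

```python
import os


def prepend_to_path(directory, environ=None):
    """Put `directory` at the front of PATH unless it is already listed.

    `environ` defaults to os.environ; passing a plain dict makes the
    function easy to try out without touching the real environment.
    Hypothetical helper for illustration, not part of NLTK.
    """
    environ = os.environ if environ is None else environ
    # Split the current PATH, dropping empty entries.
    entries = [p for p in environ.get("PATH", "").split(os.pathsep) if p]
    if directory not in entries:
        entries.insert(0, directory)
    environ["PATH"] = os.pathsep.join(entries)
    return environ["PATH"]
```

Call prepend_to_path(r"C:\Program Files\gs\gs9.19\bin") before displaying the tree, adjusting the version folder to your install. The change only lasts for the running process, so it has to be repeated in every new session.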


To add to the previous answers: if you replace entities with print(entities), you will not get the error.

Without print(), the console or notebook tries to "draw" the Tree object as an image, and that drawing step is what needs Ghostscript.


Adding to Alex Kinman's answer: I still get the same error even after installing Ghostscript and adding it to the nltk path. Using print() does let me print the object, so despite the error I can get the output below, but unfortunately there is still no drawn tree.

 Tree('S', [('At', 'IN'), ('eight', 'CD'), ("o'clock", 'NN'), ('on', 'IN'), ('Thursday', 'NNP'), ('morning', 'NN'), Tree('PERSON', [('Arthur', 'NNP')]), ('did', 'VBD'), ("n't", 'RB'), ('feel', 'VB'), ('very', 'RB'), ('good', 'JJ'), ('.', '.')]) 

If ghostscript is for some reason unavailable for your platform or cannot be installed, you can also use the wonderful networkx package to render such trees:

    import nltk
    import networkx as nx
    # graphviz_layout needs Graphviz and the pygraphviz package installed
    from networkx.drawing.nx_agraph import graphviz_layout
    import matplotlib.pyplot as plt

    def drawNodes(G, nodeLabels, parent, lvl=0):
        def addNode(G, nodeLabels, label):
            # Node ids are consecutive integers; labels are kept separately
            n = G.number_of_nodes()
            G.add_node(n)
            nodeLabels[n] = label
            return n

        def findNode(nodeLabels, label):
            # Travel backwards from end to find right parent
            for i in reversed(range(len(nodeLabels))):
                if nodeLabels[i] == label:
                    return i

        if lvl == 0:
            addNode(G, nodeLabels, parent.label())
        for node in parent:
            if isinstance(node, nltk.Tree):
                # Subtree (e.g. PERSON): add it, then recurse into its children
                n = addNode(G, nodeLabels, node.label())
                G.add_edge(findNode(nodeLabels, parent.label()), n)
                drawNodes(G, nodeLabels, node, lvl + 1)
            else:
                # Leaf: a (word, tag) tuple; the tag hangs off the parent, the word off the tag
                print(node)
                n1 = addNode(G, nodeLabels, node[1])
                n0 = addNode(G, nodeLabels, node[0])
                G.add_edge(findNode(nodeLabels, parent.label()), n1)
                G.add_edge(n0, n1)

    # entities is the tree returned by nltk.chunk.ne_chunk(tagged) in the question
    G = nx.Graph()
    nodeLabels = {}
    drawNodes(G, nodeLabels, entities)
    options = {'node_color': 'white', 'node_size': 100}
    plt.figure(1, figsize=(12, 6))
    pos = graphviz_layout(G, prog='dot')
    nx.draw(G, pos, font_weight='bold', arrows=False, **options)
    l = nx.draw_networkx_labels(G, pos, nodeLabels)
    plt.show()

[Image: NLTK token tree plotted with NetworkX]
