Using StanfordParser to get typed dependencies on a parsed sentence

Using the NLTK StanfordParser, I can parse the sentence as follows:

    import os
    from nltk.parse import stanford

    os.environ['STANFORD_PARSER'] = 'C:\jars'
    os.environ['STANFORD_MODELS'] = 'C:\jars'
    os.environ['JAVAHOME'] = 'C:\ProgramData\Oracle\Java\javapath'

    parser = stanford.StanfordParser(model_path="C:\jars\englishPCFG.ser.gz")
    sentences = parser.parse(("bring me a red ball",))
    for sentence in sentences:
        print(sentence)

Result:

    Tree('ROOT', [Tree('S', [Tree('VP', [Tree('VB', ['Bring']), Tree('NP', [Tree('DT', ['a']), Tree('NN', ['red'])]), Tree('NP', [Tree('NN', ['ball'])])]), Tree('.', ['.'])])])
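
As an aside, the object printed above is an nltk.tree.Tree, so it can be inspected programmatically. The snippet below is a minimal sketch of that, assuming NLTK 3 (pos(), leaves() and subtrees() are standard Tree methods; pretty_print() is only available in newer releases); sentence is the tree from the loop above.

    # Minimal sketch: inspecting the parse tree shown above.
    # Assumes NLTK 3; pretty_print() exists only in newer releases.
    sentence.pretty_print()   # ASCII rendering of the constituency tree
    print(sentence.pos())     # [(word, POS tag), ...]
    print(sentence.leaves())  # the tokens in order
    for np in sentence.subtrees(lambda t: t.label() == 'NP'):
        print('NP:', ' '.join(np.leaves()))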

How can I use the Stanford parser to get the typed dependencies in addition to the parse tree above? Something like:

  • root (ROOT-0, bring-1)
  • iobj (bring-1, me-2)
  • det (ball-5, a-3)
  • amod (ball-5, red-4)
  • dobj (bring-1, ball-5)
Tags: python, parsing, nlp, nltk, stanford-nlp
1 answer

The NLTK StanfordParser module does not (currently) include the tree-to-Stanford Dependencies conversion code. You can use my PyStanfordDependencies library, which wraps the dependency converter.

If nltk_tree is the sentence tree from the code snippet in the question, then this works:

    #!/usr/bin/python3
    import StanfordDependencies

    # Use str() to convert the NLTK tree to Penn Treebank format
    penn_treebank_tree = str(nltk_tree)

    sd = StanfordDependencies.get_instance(jar_filename='point to Stanford Parser JAR file')
    converted_tree = sd.convert_tree(penn_treebank_tree)

    # Print the typed dependencies
    for node in converted_tree:
        print('{}({}-{},{}-{})'.format(
            node.deprel,
            converted_tree[node.head - 1].form if node.head != 0 else 'ROOT',
            node.head,
            node.form,
            node.index))
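
For completeness, here is a minimal end-to-end sketch tying this converter to the parser setup from the question. The Windows paths and the stanford-parser.jar filename are assumptions mirroring the question's layout (adjust them to your install), and raw_parse is used so the raw sentence can be passed straight in (in NLTK 3 it returns an iterator of trees).

    #!/usr/bin/python3
    # End-to-end sketch. Assumptions: the C:\jars paths and the
    # stanford-parser.jar filename follow the question's setup; NLTK 3,
    # where raw_parse() returns an iterator of Tree objects.
    import os
    import StanfordDependencies
    from nltk.parse import stanford

    os.environ['STANFORD_PARSER'] = r'C:\jars'
    os.environ['STANFORD_MODELS'] = r'C:\jars'
    os.environ['JAVAHOME'] = r'C:\ProgramData\Oracle\Java\javapath'

    parser = stanford.StanfordParser(model_path=r'C:\jars\englishPCFG.ser.gz')
    sd = StanfordDependencies.get_instance(jar_filename=r'C:\jars\stanford-parser.jar')

    for nltk_tree in parser.raw_parse('bring me a red ball'):
        # str() yields the Penn Treebank bracketing that convert_tree() expects
        tokens = sd.convert_tree(str(nltk_tree))
        for node in tokens:
            head = tokens[node.head - 1].form if node.head != 0 else 'ROOT'
            print('{}({}-{}, {}-{})'.format(node.deprel, head, node.head,
                                            node.form, node.index))

With the question's sentence this should print dependencies in the deprel(head-i, dependent-j) style listed above (root, iobj, det, amod, dobj).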