Using StanfordParser to get typed dependencies on a parsed sentence

Using the NLTK StanfordParser, I can parse the sentence as follows:

    import os
    from nltk.parse import stanford

    os.environ['STANFORD_PARSER'] = 'C:\jars'
    os.environ['STANFORD_MODELS'] = 'C:\jars'
    os.environ['JAVAHOME'] = 'C:\ProgramData\Oracle\Java\javapath'

    parser = stanford.StanfordParser(model_path="C:\jars\englishPCFG.ser.gz")
    sentences = parser.parse(("bring me a red ball",))
    for sentence in sentences:
        print(sentence)

Result:

    Tree('ROOT', [Tree('S', [Tree('VP', [Tree('VB', ['Bring']), Tree('NP', [Tree('DT', ['a']), Tree('NN', ['red'])]), Tree('NP', [Tree('NN', ['ball'])])]), Tree('.', ['.'])])])
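
As an aside, the object printed above is an nltk.tree.Tree, so it can be inspected programmatically. The snippet below is a minimal sketch of that, assuming NLTK 3 (pos(), leaves() and subtrees() are standard Tree methods; pretty_print() is only available in newer releases); sentence is the tree from the loop above.

    # Minimal sketch: inspecting the parse tree shown above.
    # Assumes NLTK 3; pretty_print() exists only in newer releases.
    sentence.pretty_print()   # ASCII rendering of the constituency tree
    print(sentence.pos())     # [(word, POS tag), ...]
    print(sentence.leaves())  # the tokens in order
    for np in sentence.subtrees(lambda t: t.label() == 'NP'):
        print('NP:', ' '.join(np.leaves()))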

How can I use the Stanford parser to get the typed dependencies in addition to the parse tree above? Something like:

  • root (ROOT-0, bring-1)
  • iobj (bring-1, me-2)
  • det (ball-5, a-3)
  • amod (ball-5, red-4)
  • dobj (bring-1, ball-5)
Tags: python, parsing, nlp, nltk, stanford-nlp
1 answer

The NLTK StanfordParser module does not (currently) include the tree-to-Stanford Dependencies conversion code. You can use my PyStanfordDependencies library, which wraps the dependency converter.

If nltk_tree is the sentence tree from the code snippet in the question, then this works:

    #!/usr/bin/python3
    import StanfordDependencies

    # Use str() to convert the NLTK tree to Penn Treebank format
    penn_treebank_tree = str(nltk_tree)

    sd = StanfordDependencies.get_instance(jar_filename='point to Stanford Parser JAR file')
    converted_tree = sd.convert_tree(penn_treebank_tree)

    # Print the typed dependencies
    for node in converted_tree:
        print('{}({}-{},{}-{})'.format(
            node.deprel,
            converted_tree[node.head - 1].form if node.head != 0 else 'ROOT',
            node.head,
            node.form,
            node.index))
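
For completeness, here is a minimal end-to-end sketch tying this converter to the parser setup from the question. The Windows paths and the stanford-parser.jar filename are assumptions mirroring the question's layout (adjust them to your install), and raw_parse is used so the raw sentence can be passed straight in (in NLTK 3 it returns an iterator of trees).

    #!/usr/bin/python3
    # End-to-end sketch. Assumptions: the C:\jars paths and the
    # stanford-parser.jar filename follow the question's setup; NLTK 3,
    # where raw_parse() returns an iterator of Tree objects.
    import os
    import StanfordDependencies
    from nltk.parse import stanford

    os.environ['STANFORD_PARSER'] = r'C:\jars'
    os.environ['STANFORD_MODELS'] = r'C:\jars'
    os.environ['JAVAHOME'] = r'C:\ProgramData\Oracle\Java\javapath'

    parser = stanford.StanfordParser(model_path=r'C:\jars\englishPCFG.ser.gz')
    sd = StanfordDependencies.get_instance(jar_filename=r'C:\jars\stanford-parser.jar')

    for nltk_tree in parser.raw_parse('bring me a red ball'):
        # str() yields the Penn Treebank bracketing that convert_tree() expects
        tokens = sd.convert_tree(str(nltk_tree))
        for node in tokens:
            head = tokens[node.head - 1].form if node.head != 0 else 'ROOT'
            print('{}({}-{}, {}-{})'.format(node.deprel, head, node.head,
                                            node.form, node.index))

With the question's sentence this should print dependencies in the deprel(head-i, dependent-j) style listed above (root, iobj, det, amod, dobj).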