English grammar for parsing in NLTK

Is there a ready-made English grammar that I can simply download and use in NLTK? I searched for parsing examples using NLTK, but it seems to me that I have to manually specify the grammar before parsing the sentence.

Thank you so much!

+56
python nlp nltk grammar
May 24 '11 at 19:17
7 answers

You can have a look at pyStatParser , a simple statistical Python parser that returns NLTK parse trees. It comes with public treebanks, and it generates the grammar model only the first time you instantiate a Parser object (in about 8 seconds). It uses the CKY algorithm and parses average-length sentences (like the one below) in under a second.

 >>> from stat_parser import Parser
 >>> parser = Parser()
 >>> print parser.parse("How can the net amount of entropy of the universe be massively decreased?")
 (SBARQ
   (WHADVP (WRB how))
   (SQ
     (MD can)
     (NP
       (NP (DT the) (JJ net) (NN amount))
       (PP
         (IN of)
         (NP
           (NP (NNS entropy))
           (PP (IN of) (NP (DT the) (NN universe))))))
     (VP (VB be) (ADJP (RB massively) (VBN decreased))))
   (. ?))
+31
Jul 29 '13 at 22:52

My library, spaCy, provides a high-performance parser.

Installation:

 pip install spacy
 python -m spacy.en.download all

Usage:

 from spacy.en import English

 nlp = English()
 doc = nlp(u'A whole document.\nNo preprocessing required. Robust to arbitrary formatting.')
 for sent in doc:
     for token in sent:
         if token.is_alpha:
             print token.orth_, token.tag_, token.head.lemma_

Choi et al. (2015) found that spaCy is the fastest dependency parser available. It processes over 13,000 sentences per second on a single thread. On the standard WSJ evaluation it scores 92.7%, more than 1% more accurate than any of CoreNLP's models.
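Note that the spacy.en module and the download command above come from an older release. On current spaCy versions the equivalent looks roughly like the sketch below; this assumes spaCy 3.x and that the en_core_web_sm model has been installed with python -m spacy download en_core_web_sm.

 import spacy

 # Assumes: pip install spacy && python -m spacy download en_core_web_sm
 nlp = spacy.load("en_core_web_sm")
 doc = nlp("A whole document. No preprocessing required.")
 for sent in doc.sents:              # sentence segmentation comes for free
     for token in sent:
         if token.is_alpha:
             # surface form, fine-grained POS tag, lemma of the syntactic head
             print(token.orth_, token.tag_, token.head.lemma_)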

+20
08 Sep '15 at 20:25

There is a library called Pattern. It is pretty fast and easy to use.

 >>> from pattern.en import parse
 >>> s = 'The mobile web is more important than mobile apps.'
 >>> s = parse(s, relations=True, lemmata=True)
 >>> print s
 'The/DT/B-NP/O/NP-SBJ-1/the mobile/JJ/I-NP/O/NP-SBJ-1/mobile' ...
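If you want the chunk tree rather than the tagged string, Pattern also provides parsetree(). A small sketch, assuming the Sentence/Chunk attribute names from Pattern's documentation:

 >>> from pattern.en import parsetree
 >>> # parsetree() returns Sentence objects carrying chunk structure
 >>> for sentence in parsetree('The mobile web is more important than mobile apps.', relations=True, lemmata=True):
 ...     for chunk in sentence.chunks:
 ...         print chunk.type, [(w.string, w.type) for w in chunk.words]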
+7
Jul 25

nltk_data has several grammars. In the Python interpreter, enter nltk.download() .
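Once a grammar package is downloaded, you can load it and hand it to one of NLTK's parsers. A minimal sketch, assuming the large_grammars package (which ships a sample ATIS CFG); the package and file names come from the NLTK data index and will differ for other grammars:

 import nltk

 nltk.download('large_grammars')    # one of the grammar packages in nltk_data
 grammar = nltk.data.load('grammars/large_grammars/atis.cfg')
 parser = nltk.parse.ChartParser(grammar)

 # the sentence may only use words that occur in the grammar's lexicon
 tokens = 'show me northwest flights to detroit .'.split()
 for tree in parser.parse(tokens):
     print(tree)
     break   # the ATIS grammar is highly ambiguous, so stop at the first parse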

+5
May 24 '11 at 19:25

Use MaltParser. It comes with a pre-trained English grammar, as well as pre-trained models for some other languages. And MaltParser is a dependency parser, not just a simple bottom-up or top-down parser.

Just download MaltParser from http://www.maltparser.org/index.html and use it from NLTK as follows:

 import nltk
 parser = nltk.parse.malt.MaltParser()
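Recent NLTK versions will not find everything on their own: MaltParser() needs the path to the unpacked MaltParser distribution and to a pre-trained model such as engmalt.linear-1.7.mco (available from the MaltParser site). A hedged sketch; both paths below are placeholders for your own setup:

 from nltk.parse.malt import MaltParser

 # both paths are assumptions about where you unpacked the jar and the model
 parser = MaltParser('/path/to/maltparser-1.9.2',
                     '/path/to/engmalt.linear-1.7.mco')
 graph = parser.parse_one('I saw a bird from my window .'.split())
 print(graph.tree())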
+4
Aug 08

I have tried NLTK, pyStatParser, and Pattern. IMHO Pattern is the best English parser introduced in this thread: it supports pip install and there is good documentation on the website ( http://www.clips.ua.ac.be/pages/pattern-en ). I could not find reasonable documentation for NLTK (and it gave me inaccurate results by default, and I could not find how to tune it). pyStatParser is much slower than described above in my environment (it took about one minute to initialize, and a couple of seconds to parse long sentences; maybe I did not use it correctly).

+4
Nov 10 '14 at 23:02

Have you tried POS tagging in NLTK?

 import nltk
 from nltk import word_tokenize

 text = word_tokenize("And now for something completely different")
 nltk.pos_tag(text)

The output is:

 [('And', 'CC'), ('now', 'RB'), ('for', 'IN'), ('something', 'NN'), ('completely', 'RB'), ('different', 'JJ')]

Here is an example: NLTK_chapter03

+3
Oct 24 '17 at 18:14
