POS pattern filter?

I am writing code that iterates over a sequence of POS tags (generated by pos_tag in NLTK) to search for POS patterns. Matching tag sequences are stored in a list for further processing. Surely a ready-made pattern filter already exists for such a task, but a few initial searches did not turn anything up.

Are there any code snippets that can do this POS pattern filtering for me?

Thanks, Dave

EDIT: complete solution (using RegexpParser, where message is any line of text)

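import nltk

# Requires the NLTK tokenizer and tagger data (see nltk.download()).
text = nltk.word_tokenize(message)
tags = nltk.pos_tag(text)
# One or more adjectives, then optional proper nouns, then optional nouns.
grammar = r"""
    RULE_1: {<JJ>+<NNP>*<NN>*}
    """
chunker = nltk.RegexpParser(grammar)
chunked = chunker.parse(tags)

def is_rule_1(tree):
    # NLTK 3 uses tree.label(); older releases exposed tree.node instead.
    return tree.label() == "RULE_1"

for s in chunked.subtrees(is_rule_1):
    print(s)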

Check out http://nltk.googlecode.com/svn/trunk/doc/book/ch07.html and http://www.regular-expressions.info/reference.html for more information on creating rules.
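
For anyone who wants the matches in a plain list rather than printed trees, here is a rough sketch of the same idea using only the standard re module instead of the NLTK chunker. It is not how RegexpParser is implemented. The function name find_pos_patterns is made up for illustration, the tagged sentence is hand-written sample data rather than real pos_tag output, and the pattern has to be built from whole <TAG> units for the offsets to line up.

```python
import re


def find_pos_patterns(tagged, tag_pattern):
    """Return each run of (word, tag) pairs whose tags match tag_pattern.

    Hypothetical helper: tag_pattern is an ordinary regex written over
    whole "<TAG>" units, e.g. r"(?:<JJ>)+(?:<NN>)*".
    """
    # Encode the tag sequence as one string, e.g. "<DT><JJ><NN>".
    tag_string = "".join("<%s>" % tag for _, tag in tagged)

    # Map the character offset of every tag boundary to a token index,
    # so regex match positions can be translated back to tokens.
    offset_to_index = {}
    offset = 0
    for index, (_, tag) in enumerate(tagged):
        offset_to_index[offset] = index
        offset += len(tag) + 2  # two extra characters for "<" and ">"
    offset_to_index[offset] = len(tagged)

    matches = []
    for m in re.finditer(tag_pattern, tag_string):
        start = offset_to_index.get(m.start())
        end = offset_to_index.get(m.end())
        # Skip empty matches and anything not aligned to tag boundaries.
        if start is None or end is None or start == end:
            continue
        matches.append(tagged[start:end])
    return matches


# Hand-tagged sample in the (word, tag) shape that pos_tag returns.
tagged = [
    ("the", "DT"), ("quick", "JJ"), ("brown", "JJ"), ("fox", "NN"),
    ("jumps", "VBZ"), ("over", "IN"), ("the", "DT"),
    ("lazy", "JJ"), ("dog", "NN"),
]

# Same shape as RULE_1 above: adjectives, then proper nouns, then nouns.
pattern = r"(?:<JJ>)+(?:<NNP>)*(?:<NN>)*"

matches = find_pos_patterns(tagged, pattern)
for match in matches:
    print(match)
```

Each entry in matches is a slice of the original tagged list, so the words and their tags stay together for whatever processing comes next.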
