Regular expression in Python sentence extractor

Question

I have a script that gives me the sentences containing one of the specified keywords. A sentence is defined as anything between two periods.

Now I want it to select a whole sentence such as “Put 1.5 grams of powder”: if powder is the keyword, it should return the whole sentence and not “5 grams of powder”.

I am trying to figure out how to express that a sentence is delimited by a period followed by a space. My new filter:

def iterphrases(text):
    return ifilter(None, imap(lambda m: m.group(1), finditer(r'([^\.\s]+)', text)))

However, now I no longer get any sentences, only pieces (single words), including my keyword. I am very confused about what I am doing wrong.

+4

python regex

Jacob ian Jan 14 '15 at 15:38

3

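The period inside a number such as 1.5 is not followed by a space, so only treat a period as the end of a sentence when a space (or the end of the line) comes right after it: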

import re
# A sentence runs from a word character to the first period followed by a space or end of line.
sentence = re.compile(r"\w.*?\.(?= |$)", re.MULTILINE)
def iterphrases(text):
    return (match.group(0) for match in sentence.finditer(text))
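
To see how this behaves on the example from the question, here is a small sketch. The helper name sentences_with_keyword and the sample text are my own assumptions for illustration; only the pattern and iterphrases come from the answer above.

```python
import re

# Same pattern as in the answer: start at a word character and stop at the
# first period that is followed by a space or by the end of a line.
sentence = re.compile(r"\w.*?\.(?= |$)", re.MULTILINE)

def iterphrases(text):
    return (match.group(0) for match in sentence.finditer(text))

def sentences_with_keyword(text, keywords):
    # Hypothetical helper: keep only the sentences that contain a keyword.
    return [s for s in iterphrases(text) if any(k in s for k in keywords)]

# Made-up sample text; "1.5" must not be treated as a sentence boundary.
text = "Put 1.5 grams of powder. Stir well. Add more powder if needed."
found = sentences_with_keyword(text, ["powder"])
print(found)
```

The period in 1.5 is followed by a digit, so the lookahead rejects it and the first sentence stays in one piece. Note that this is a plain substring test, so a keyword like "pow" would also match "powder".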

+2

L3viathan Jan 14 '15 at 16:27

If you are sure that the period is not used for anything other than separating sentences, and that each relevant sentence ends with a period, then the following may be useful:

# [^.]*? cannot cross a period, so each match starts inside the sentence that holds the keyword.
matches = re.finditer(r'([^.]*?(powder|keyword2|keyword3).*?)\.', text)
result = [m.group() for m in matches]
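
The assumption about periods matters here. As a quick sketch (the function name find_keyword_sentences and both sample strings are made up for illustration), the same pattern works on plain sentences but cuts a sentence at a decimal point:

```python
import re

# Pattern from the answer above; keyword2 and keyword3 are its placeholders.
pattern = re.compile(r'([^.]*?(powder|keyword2|keyword3).*?)\.')

def find_keyword_sentences(text):
    # Hypothetical wrapper around the two lines from the answer.
    return [m.group() for m in pattern.finditer(text)]

# Periods only end sentences: the whole sentence is returned.
plain = find_keyword_sentences("Add the powder slowly. Stir well.")

# A period inside a number breaks the assumption: the match starts after "1.".
decimal = find_keyword_sentences("Put 1.5 grams of powder. Stir well.")

print(plain)
print(decimal)
```

So for text like the example in the question, this approach still returns “5 grams of powder.” and the period-then-space rule is needed.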

0

Alex Jan 14 '15 at 16:45

Aprillion · Accepted Answer · 2015-01-14T16:21:02+0000

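If you do not need an iterator, re.split is simpler for this case (a sentence ends at a period followed by whitespace):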

re.split(r'\.\s', text)

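Note that the last item keeps its trailing period, or is an empty string (if text ends with whitespace after the last period). To fix that, strip the final period first: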

re.split(r'\.\s', re.sub(r'\.\s*$', '', text))
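
A short sketch of the difference between the two calls, assuming a made-up sample text that ends with a period and a space. The wrapper name split_sentences is hypothetical; the two regular expressions are the ones shown above.

```python
import re

def split_sentences(text):
    # Hypothetical wrapper: drop the final period (and any trailing
    # whitespace), then split on "period followed by whitespace".
    return re.split(r'\.\s', re.sub(r'\.\s*$', '', text))

# Made-up sample text ending in ". " to trigger the empty last item.
text = "Put 1.5 grams of powder. Stir well. "

naive = re.split(r'\.\s', text)      # leaves an empty string at the end
cleaned = split_sentences(text)      # no empty item, no trailing period

# Keep only the sentences that mention the keyword.
with_keyword = [s for s in cleaned if "powder" in s]

print(naive)
print(cleaned)
print(with_keyword)
```

The period in 1.5 is followed by a digit rather than whitespace, so it never acts as a separator. The separators themselves are consumed by the split, so the returned sentences come back without their final period.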

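For a somewhat more general case, see the related question "Python - RegEx for splitting text into sentences (sentence-tokenizing)".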

For a fully general solution, you would need a proper sentence tokenizer, such as nltk.tokenize:

nltk.tokenize.sent_tokenize(text)
