Building directly on MrTopf's effort:
import re
rx = re.compile(r"((?:@\w+ +)+)(.*)")  # group 1: the run of tags, group 2: the rest
t = '@abc @def @xyz Hello this part is text and my email is foo@ba.r'
a, s = rx.match(t).groups()
l = re.split('[@ ]+', a)[1:-1]  # split on runs of @ and spaces, drop the empty strings at both ends
print(l)
print(s)
prints:
['abc', 'def', 'xyz']
Hello this part is text and my email is foo@ba.r
Rightly called to account by hasen j, let me explain how this works:
/@\w+ +/
matches a single tag: @ followed by at least one alphanumeric character or _, followed by at least one space. The + is greedy, so if there is more than one space, it will grab them all.
To match any number of these tags, we need to add a plus (one or more things) to the pattern for the tag; to do that, we have to group it with parentheses:
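A quick way to check the single-tag pattern, as a minimal sketch (the sample string is made up for illustration):

```python
import re

# One tag: '@', one or more word characters, then one or more spaces.
tag = re.compile(r"@\w+ +")

sample = "@abc" + " " * 3 + "@def rest"
m = tag.match(sample)

# The trailing ' +' is greedy, so all three spaces after '@abc' are consumed.
matched = m.group(0)
print(repr(matched))
```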
/(@\w+ +)+/
which matches one or more tags and, being greedy, matches all of them. However, these parentheses now interfere with our capture groups, so we undo that by turning them into a non-capturing (anonymous) group:
/(?:@\w+ +)+/
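To see what the (?: ... ) form changes, here is a small comparison, assuming Python's re module and a made-up input string. Both patterns match the same text; only the capture groups differ:

```python
import re

text = "@abc @def @xyz Hello"

capturing = re.match(r"(@\w+ +)+", text)
non_capturing = re.match(r"(?:@\w+ +)+", text)

# Both match the same run of tags.
print(repr(capturing.group(0)))
print(repr(non_capturing.group(0)))

# The capturing version creates group 1, which keeps only the last repetition.
print(capturing.groups())      # ('@xyz ',)
# The non-capturing version creates no groups at all.
print(non_capturing.groups())  # ()
```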
Finally, we wrap that in a capture group and add another one to pick up the rest:
/((?:@\w+ +)+)(.*)/
A final breakdown to sum up:
((?:@\w+ +)+)(.*)
 (?:@\w+ +)+
 (  @\w+ +)
    @\w+ +
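Putting the pieces together, one possible way to package the finished pattern is sketched below. The function name split_tags is hypothetical, the fallback for lines without leading tags is an assumption about the desired behaviour, and re.findall is swapped in for the re.split slice used at the top:

```python
import re

# Group 1: the leading run of tags; group 2: everything after it.
TAGGED = re.compile(r"((?:@\w+ +)+)(.*)")

def split_tags(line):
    """Return (tags, rest) for a line that may start with @tags.

    Hypothetical helper: a line with no leading tags is returned unchanged.
    """
    m = TAGGED.match(line)
    if m is None:
        return [], line
    prefix, rest = m.groups()
    # Pull the bare names out of the tag prefix.
    tags = re.findall(r"@(\w+)", prefix)
    return tags, rest

tags, rest = split_tags("@abc   @def  @xyz Hello, my email is foo@ba.r")
print(tags)
print(rest)
```

Because the pattern is anchored at the start by match, the @ inside the email address is never treated as a tag.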
Note that in reviewing this, I've improved it: \w didn't need to be in a set, and it now allows for multiple spaces between tags. Thanks, hasen-j!
Brent.Longborough