Python regex: multiple matches on the same line (using findall())

Question

I am looking for these "tags" inside the text: {t d="var1"}var2{/t} or {t d="varA"}varB{/t}. There may be more attributes, but only "d" is required: {t d="var1" foo="bar"}var2{/t}

My problem is that when there is more than one tag on a line, only one result is returned instead of all of them. What is returned (from the test line below): (u'single1', u'Required item3')

What I expect to be returned: (u'single1', u'required1') (u'single2', u'Required item2') (u'single3', u'Required item3'). I'm stuck with this. It works with one tag per line, but not with several tags on one line.

    # -*- coding: UTF-8 -*-
    import re

    test_string = u'''
    <span><img src="img/ico/required.png" class="icon" alt="{t d="single1"}required1{/t}" title="{t d="single2"}Required item2{/t}" /> {t d="single3"}Required item3{/t}</span>
    '''

    re_pattern = '''
        \{t[ ]{1}           # start tag name
        d="                 # "d" attribute
        ([a-zA-Z0-9]*)      # "d" attribute content
        ".*\}               # end of "d" attribute
        (.+)                # tag content
        \{/t\}              # end tag
    '''

    rec_pattern = re.compile(re_pattern, re.VERBOSE)

    res = rec_pattern.findall(test_string)
    if res is not None:
        for item in res:
            print item

python regex

dwich Jan 6 '13 at 13:05

1 answer

Ned Batchelder · Accepted Answer · 2013-01-06T13:17:34+0000

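Your wildcards are greedy: each one consumes as much of the line as it can, so a single match stretches from the first opening tag to the last closing tag. Change .* to .*? and .+ to .+? so that they are non-greedy and stop at the nearest closing delimiter: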

    re_pattern = '''
        \{t[ ]{1}           # start tag name
        d="                 # "d" attribute
        ([a-zA-Z0-9]*)      # "d" attribute content
        ".*?\}              # end of "d" attribute
        (.+?)               # tag content
        \{/t\}              # end tag
    '''
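To see the difference side by side, here is a minimal sketch that runs both versions of the pattern against the test line from the question. It assumes Python 3 (so results print without the u prefix) and that the tags contain a space between t and d, as the [ ] in the pattern requires. The raw-string prefix and the names GREEDY and LAZY are additions for illustration only, not part of the original code.

```python
import re

# Test line from the question; assumes a space between "t" and "d".
test_string = '''
<span><img src="img/ico/required.png" class="icon" alt="{t d="single1"}required1{/t}" title="{t d="single2"}Required item2{/t}" /> {t d="single3"}Required item3{/t}</span>
'''

# Original pattern: .* and .+ are greedy and run to the last possible match.
GREEDY = re.compile(r'''
    \{t[ ]          # start tag name
    d="             # "d" attribute
    ([a-zA-Z0-9]*)  # "d" attribute content
    ".*\}           # rest of the opening tag (greedy)
    (.+)            # tag content (greedy)
    \{/t\}          # end tag
''', re.VERBOSE)

# Fixed pattern: .*? and .+? stop at the nearest "}" and "{/t}".
LAZY = re.compile(r'''
    \{t[ ]          # start tag name
    d="             # "d" attribute
    ([a-zA-Z0-9]*)  # "d" attribute content
    ".*?\}          # rest of the opening tag (non-greedy)
    (.+?)           # tag content (non-greedy)
    \{/t\}          # end tag
''', re.VERBOSE)

greedy_matches = GREEDY.findall(test_string)
lazy_matches = LAZY.findall(test_string)

print(greedy_matches)  # one tuple spanning the whole line
for item in lazy_matches:
    print(item)        # one tuple per tag
```

Note that findall() always returns a list (empty when nothing matches), so the "is not None" check in the question is never false and can be dropped.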
