Python Regex - How to get positions and match values

How can I get the start and end positions of all matches using the re module? For example, given the pattern r'[a-z]' and the string 'a1b2c3d4', I would like to get the position at which it finds each letter. Ideally, I would also like to get back the text of the match.

+95
python regex
30 Oct '08 at 14:04
4 answers
    import re

    p = re.compile("[a-z]")
    for m in p.finditer('a1b2c3d4'):
        print(m.start(), m.group())
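If the end offset is wanted as well as the start, as the question asks, the same loop can collect all three values. This is only a minimal sketch; the helper name `find_spans` is made up for illustration and is not part of the re module:

```python
import re

def find_spans(pattern, text):
    """Return a list of (start, end, matched_text) tuples, one per match."""
    return [(m.start(), m.end(), m.group()) for m in re.finditer(pattern, text)]

print(find_spans(r'[a-z]', 'a1b2c3d4'))
# [(0, 1, 'a'), (2, 3, 'b'), (4, 5, 'c'), (6, 7, 'd')]
```

Building a list is convenient for short inputs; for very long strings it may be preferable to consume the finditer() iterator lazily instead.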
+125
Oct 30 '08 at 14:15

Taken from the Regular Expression HOWTO:

span() returns both the start and end indexes in a single tuple. Since the match() method only checks whether the RE matches at the start of a string, start() will always be zero. However, the search() method of RegexObject instances scans through the string, so the match may not start at zero in that case.

    >>> p = re.compile('[a-z]+')
    >>> print p.match('::: message')
    None
    >>> m = p.search('::: message') ; print m
    <re.MatchObject instance at 80c9650>
    >>> m.group()
    'message'
    >>> m.span()
    (4, 11)

Combine this with:

As of Python 2.2, the finditer() method is also available; it returns a sequence of MatchObject instances as an iterator.

    >>> p = re.compile( ... )
    >>> iterator = p.finditer('12 drummers drumming, 11 ... 10 ...')
    >>> iterator
    <callable-iterator object at 0x401833ac>
    >>> for match in iterator:
    ...     print match.span()
    ...
    (0, 2)
    (22, 24)
    (29, 31)

you can do something along the lines of:

    import re

    for match in re.finditer(r'[a-z]', 'a1b2c3d4'):
        print(match.span())
+47
30 Oct '08 at 14:16

For Python 3.x

    from re import finditer

    for match in finditer("pattern", "string"):
        print(match.span(), match.group())

For each hit in the string, this prints one line containing a tuple (the start index of the match and its end index, which is one past the last matched character) followed by the matched text.
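To make the meaning of the two indexes concrete, here is a small sketch on the string from the question. The pattern `[a-z]\d` is chosen only for illustration, so that each match is two characters long:

```python
import re

text = 'a1b2c3d4'
results = []
for match in re.finditer(r'[a-z]\d', text):
    start, end = match.span()
    # The end index is exclusive, so slicing with the span returns the match text.
    assert text[start:end] == match.group()
    results.append(((start, end), match.group()))
    print((start, end), match.group())
# (0, 2) a1
# (2, 4) b2
# (4, 6) c3
# (6, 8) d4
```

Because the end index is exclusive, the span can be passed straight to a slice without adding one.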

+17
Jul 05 '17 at 13:08

Note that span() and group() take a group index when the regular expression contains multiple capture groups:

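    import re

    regex_with_3_groups = r"([a-z])([0-9]+)([A-Z])"
    string = "a12B c3D"  # sample input, not part of the original answer
    for match in re.finditer(regex_with_3_groups, string):
        for idx in range(0, 4):  # group 0 is the whole match, 1-3 are the capture groups
            print(match.span(idx), match.group(idx))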
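One caveat when indexing groups this way: a group that exists in the pattern but did not take part in the match reports a span of (-1, -1) and a value of None. The sketch below assumes a made-up pattern with an optional second group, and uses the compiled pattern's `groups` attribute instead of a hard-coded range:

```python
import re

# Illustrative pattern: the digit group is optional and may not participate.
pattern = re.compile(r'([a-z])(\d+)?')
m = pattern.search('x')

# pattern.groups is the number of capture groups; index 0 is the whole match.
rows = [(idx, m.span(idx), m.group(idx)) for idx in range(pattern.groups + 1)]
for row in rows:
    print(row)
# (0, (0, 1), 'x')
# (1, (0, 1), 'x')
# (2, (-1, -1), None)
```

Checking for None (or a negative start) before slicing avoids silently picking up the wrong part of the string.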
0
Jul 23 '19 at 15:22
