The MatchObject instances returned by regex matches include the match indices. What remains is to match the repeated characters:
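    import re

    # A lowercase letter, one or more repeats of that same letter,
    # then an optional trailing hyphen.
    repeat = re.compile(r'(?P<start>[a-z])(?P=start)+-?')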
This matches only if a given letter (a-z) is repeated at least once:
    >>> for match in repeat.finditer("aaaaabbbbbbbbbbbbbbccccccccccc"):
    ...     print(match.group(), match.start(), match.end())
    ...
    aaaaa 0 5
    bbbbbbbbbbbbbb 5 19
    ccccccccccc 19 30
The .start() and .end() methods of the match result give you the exact positions in the input string.
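As a minimal sketch of how those positions might be used, the hypothetical helper `find_runs` below (not part of the `re` module) collects each run together with its offsets and checks that slicing the input with them reproduces the matched text. It assumes Python 3 and writes the character class as `[a-z]`.

```python
import re

# The repeated-letter pattern: a lowercase letter, one or more repeats
# of that same letter, then an optional trailing hyphen.
repeat = re.compile(r'(?P<start>[a-z])(?P=start)+-?')

def find_runs(text):
    """Return (matched_text, start, end) for every repeated-letter run."""
    return [(m.group(), m.start(), m.end()) for m in repeat.finditer(text)]

sample = "aaabbc"
runs = find_runs(sample)
for matched, start, end in runs:
    # end is exclusive, so the offsets work directly as a slice.
    assert sample[start:end] == matched
print(runs)  # [('aaa', 0, 3), ('bb', 3, 5)]
```

The trailing `c` is not reported because it is not repeated.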
Hyphens are included in the matches, but non-repeating characters are not:
    >>> for match in repeat.finditer("a-bb-cccccccc"):
    ...     print(match.group(), match.start(), match.end())
    ...
    bb- 2 5
    cccccccc 5 13
If you want the a- part to be a match as well, simply replace the + quantifier with * :
    repeat = re.compile(r'(?P<start>[a-z])(?P=start)*-?')
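To make the difference concrete, here is a small sketch comparing the two quantifiers on the hyphenated input. It assumes Python 3; the names `repeat_plus`, `repeat_star` and `spans` are only for illustration.

```python
import re

# '+' requires at least one repeat; '*' also accepts a single letter.
repeat_plus = re.compile(r'(?P<start>[a-z])(?P=start)+-?')
repeat_star = re.compile(r'(?P<start>[a-z])(?P=start)*-?')

def spans(pattern, text):
    """Return (matched_text, start, end) for each match of pattern in text."""
    return [(m.group(), m.start(), m.end()) for m in pattern.finditer(text)]

text = "a-bb-cccccccc"
print(spans(repeat_plus, text))  # [('bb-', 2, 5), ('cccccccc', 5, 13)]
print(spans(repeat_star, text))  # [('a-', 0, 2), ('bb-', 2, 5), ('cccccccc', 5, 13)]
```

Keep in mind that with `*` every single letter becomes a match, not only `a-`: on the input "ab" the second pattern yields `a` and `b` as separate matches.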