Python re.finditer(): concisely match "A or :B or C:D"

I am looking for a regular expression that returns only three match groups for the string "A :B C:D", where A, B, C and D stand for words (\w+). The following Python code also prints unwanted (None, None) tuples.

I just want ('A', None), (None, 'B') and ('C', 'D'), using a single regular expression (without adding Python code to filter the results).

    import re

    for m in re.compile(r'(?:(\w+)|)(?:(?::)(\w+)|)').finditer('A :B C:D'):
        print(m.groups())
2 answers

This can do the trick:

 (?=[\w:])(\w*)(?::(\w*))? 

(\w*)(?::(\w*))? describes the structure you want, but it has the problem that it also matches the empty string. We therefore have to make sure that each match starts with at least one word character or colon (which the greedy quantifiers will then consume), and that is what the lookahead at the beginning does.
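To see what the lookahead pattern returns in practice, here is a minimal sketch. It assumes the sample input from the question is 'A :B C:D'. The second pattern is not part of the answer; it is a hypothetical variant that swaps \w* for an optional \w+ so that a missing word is reported as None rather than as an empty string.

```python
import re

text = 'A :B C:D'  # sample input, assumed from the question

# Pattern from the answer: the lookahead only lets a match start at a
# word character or a colon, so no empty matches are produced.
answer_pattern = re.compile(r'(?=[\w:])(\w*)(?::(\w*))?')
answer_groups = [m.groups() for m in answer_pattern.finditer(text)]
print(answer_groups)   # [('A', None), ('', 'B'), ('C', 'D')]

# Hypothetical variant (an assumption, not from the answer): optional \w+
# groups do not participate when the word is missing, so they yield None.
variant_pattern = re.compile(r'(?=[\w:])(\w+)?(?::(\w+))?')
variant_groups = [m.groups() for m in variant_pattern.finditer(text)]
print(variant_groups)  # [('A', None), (None, 'B'), ('C', 'D')]
```

Note that with the answer's pattern the first group of the middle match is '' rather than None, because \w* takes part in the match even when it matches nothing. The variant has its own edge case: a colon that is not followed by a word character would produce a (None, None) match.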

Edit: fixed a bad paste :)

    import re

    print([m.groups() for m in re.finditer(
        r'''(?x)              # verbose mode
        (\w+)?                # match an optional run of \w's
        (?: :|\s)             # match (non-groupingly) a colon or a space
        (\w+ (?:\s|\Z))?      # match an optional run of \w's followed by a space or end of string
        ''',
        'A :B C:D')])

gives

 [('A', None), (None, 'B '), ('C', 'D')] 
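The second tuple above contains 'B ' with a trailing space, because the whitespace sits inside the capturing group. As a rough sketch of one way around that (a hypothetical variant, not the answerer's code, and again assuming the input 'A :B C:D'), the trailing whitespace can be consumed outside the group:

```python
import re

# Hypothetical variant of the verbose pattern above: the word is captured
# on its own and the trailing whitespace is matched outside the group.
pattern = re.compile(r'''(?x)
    (\w+)?                      # optional word before the separator
    (?: : | \s )                # a colon or a whitespace character
    (?: (\w+) (?: \s | \Z ) )?  # optional word, then whitespace or end of string
    ''')

groups = [m.groups() for m in pattern.finditer('A :B C:D')]
print(groups)  # [('A', None), (None, 'B'), ('C', 'D')]
```

Like the pattern it is based on, this still requires a colon or a whitespace character in every match, so an input consisting of a single bare word would produce no match at all.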