The regex should match every variant of a beta name.
We start building the pattern from the first example, "Crome beta":
' [Bb]eta'
We use [Bb] to match either B or b as the second character (after the space).
The second "Crome_beta" example adds _ as a delimiter:
'[ _][Bb]eta'
The third example, "Crome beta2", and the fourth, "Crome_betaversion", are already covered by the previous regular expression.
The fifth example of "Crome 3beta" forces us to change the template this way:
'[ _]\d*[Bb]eta'
where \d is shorthand for [0-9] and * allows zero or more repetitions of \d.
The sixth example, "CromeBeta2.3", shows that beta may have no preceding _ or space and may simply start with a capital letter. We cover this case with |, the alternation construct, which works like an or operator:
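As a quick check of this step, here is a minimal sketch that runs the pattern against names with and without a leading digit. The samples "Crome 42Beta" and "Cromebeta" are hypothetical additions of mine, included only to show several digits and a missing delimiter.

```python
from __future__ import print_function
import re

# Pattern from this step: a space or underscore, optional digits, then beta/Beta.
pattern = re.compile(r'[ _]\d*[Bb]eta')

# "Crome 42Beta" and "Cromebeta" are hypothetical samples, not from the question.
samples = ["Crome beta", "Crome_beta", "Crome 3beta", "Crome 42Beta", "Cromebeta"]

# True when the pattern is found anywhere in the name.
results = {name: pattern.search(name) is not None for name in samples}

for name in samples:
    print(name, "->", results[name])
```

Only "Cromebeta" fails here, because \d* may match nothing but the [ _] delimiter in front of it is still required.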
'[ _]\d*[Bb]eta|Beta'
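To see what the alternation adds, the following sketch compares the pattern before and after this step on the sixth example. The variable names are mine and purely illustrative.

```python
from __future__ import print_function
import re

before = re.compile(r'[ _]\d*[Bb]eta')       # pattern from the previous step
after = re.compile(r'[ _]\d*[Bb]eta|Beta')   # same pattern plus the Beta alternative

camel = "CromeBeta2.3"
before_hit = before.search(camel)   # no space or underscore before Beta
after_hit = after.search(camel)     # the second alternative applies

print("before:", before_hit is not None)
print("after: ", after_hit is not None)

# The earlier examples keep matching through the first alternative.
still_ok = after.search("Crome 3beta") is not None
print("Crome 3beta still matches:", still_ok)
```

Since | has the lowest precedence in a regular expression, the whole left side, [ _]\d*[Bb]eta, is one alternative and Beta is the other; no parentheses are required.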
The seventh example, "Beta Crome 4", is already matched by the Beta alternative, since it starts with Beta. The name could also be written "beta Chrome 4", however, so we extend the pattern this way:
'[ _]\d*[Bb]eta|Beta|^beta '
We do not need ^[Bb]eta here, since the capitalized Beta is already covered by the second alternative.
Also note that we cannot use re.I, since the pattern must distinguish between Beta and beta.
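A small sketch of why re.I is a problem: with the flag, the Beta alternative also matches a lowercase beta anywhere in the string, so the two names that should be rejected start to match. The names strict and relaxed are illustrative only.

```python
from __future__ import print_function
import re

PATTERN = r'[ _]\d*[Bb]eta|Beta|^beta '

strict = re.compile(PATTERN)         # case-sensitive, as in the answer
relaxed = re.compile(PATTERN, re.I)  # case-insensitive, for comparison

should_not_match = ["Cromebeta2.3", "betamax"]

# For each name: (matches without re.I, matches with re.I)
outcome = {
    name: (strict.search(name) is not None, relaxed.search(name) is not None)
    for name in should_not_match
}

for name in should_not_match:
    print(name, "-> strict:", outcome[name][0], "| re.I:", outcome[name][1])
```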
So, the test code (for Python 2.7):
    from __future__ import print_function
    import re
    import sys

    match_tests = [
        "Crome beta",
        "Chrome Beta",
        "Crome_beta",
        "Crome beta2",
        "Crome_betaversion",
        "Crome 3beta",
        "Crome 3Beta",
        "CromeBeta2.3",
        "Beta Crome 4",
        "beta Chrome ",
        "Cromebeta2.3",  # no match
        "betamax",       # no match
        "Betamax",       # matches through the Beta alternative
    ]

    # Compile once and reuse the pattern for every test string.
    compiled = re.compile(r'[ _]\d*[Bb]eta|Beta|^beta ')

    for test in match_tests:
        search_result = compiled.search(test)
        if search_result is not None:
            print("{}: OK".format(test))
        else:
            print("{}: No match".format(test), file=sys.stderr)
I do not see a need for a negative lookbehind. In addition, you wrapped beta in a capture group, (beta). That is not needed here, since nothing reads the captured text, and it only adds overhead to the regular expression.
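To illustrate the point about the capture group, here is a minimal sketch assuming all you need is whether (and where) the name matched. The with_group pattern is a hypothetical stand-in for a pattern with a capturing group, not the exact expression from the question.

```python
from __future__ import print_function
import re

plain = re.compile(r'[ _]\d*[Bb]eta|Beta|^beta ')
# Hypothetical variant that wraps the word in a capturing group.
with_group = re.compile(r'[ _]\d*([Bb]eta)')

# Number of capturing groups each compiled pattern defines.
print(plain.groups, with_group.groups)

m = plain.search("Crome 3beta")
matched_text = m.group(0)   # the whole match is always available as group 0
captured = m.groups()       # empty tuple: nothing was captured

print(repr(matched_text), captured)
```

If grouping is ever needed only to scope an alternation, a non-capturing group, (?:...), provides that without storing the text.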