Redefinition of a Regex Group Name?

So, I have this regex:

(^(\s+)?(?P<NAME>(\w)(\d{7}))((01f\.foo)|(\.bar|\.goo\.moo\.roo))$|(^(\s+)?(?P<NAME2>R1_\d{6}_\d{6}_)((01f\.foo)|(\.bar|\.goo\.moo\.roo))$)) 

Now, if I try to match this:

  B048661501f.foo

I get this error:

  File "C:\Python25\lib\re.py", line 188, in compile
    return _compile(pattern, flags)
  File "C:\Python25\lib\re.py", line 241, in _compile
    raise error, v # invalid expression
sre_constants.error: redefinition of group name 'NAME' as group 9; was group 3

If I cannot define the same group name twice in one regular expression, for two different cases, what should I do?

3 answers

No, you cannot have two groups with the same name. That would somewhat defeat the purpose, wouldn't it?
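As a minimal sketch of that restriction, the snippet below tries to compile a toy pattern that reuses a group name with the standard `re` module, then shows one possible workaround: give the two groups different names and take whichever one matched. The toy pattern, the simplified `PATTERN`, and the helper `extract_name` are illustrative assumptions, not part of the question.

```python
import re

# Reusing a group name is rejected by the standard-library re module.
duplicate_error = None
try:
    re.compile(r"(?P<NAME>a)|(?P<NAME>b)")
except re.error as exc:
    duplicate_error = str(exc)

# Workaround sketch: two distinct names, coalesced after matching.
# This is a simplified form of the pattern in the question (assumption).
PATTERN = re.compile(
    r"^\s*(?:(?P<NAME>\w\d{7})|(?P<NAME2>R1_\d{6}_\d{6}_))"
    r"(?:01f\.foo|\.bar|\.goo\.moo\.roo)$"
)

def extract_name(text):
    """Return whichever of NAME / NAME2 took part in the match, or None."""
    match = PATTERN.match(text)
    if match is None:
        return None
    return match.group("NAME") or match.group("NAME2")
```

Only one of the two alternatives can participate in a given match, so the unused group is `None` and the `or` picks the other one.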

What you probably really want is:

 ^\s*(?P<NAME>\w\d{7}|R1_(?:\d{6}_){2})(01f\.foo|\.(?:bar|goo|moo|roo))$ 

I reorganized your regex as far as possible. I made the following assumptions:

You want (correct me if I am wrong):

  • ignore whitespace at the beginning of a line
  • match either of the following in a group named "NAME":
    • a letter followed by 7 digits, or
    • "R1_" and two times (6 digits + "_" )
  • followed by:
    • "01f.foo" or
    • "." and ( "bar" or "goo" or "moo" or "roo" )
  • followed by the end of the line
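A short sketch of how the consolidated pattern might be used with Python's `re` module. The names `NAME_RE` and `extract_name`, and the sample file names other than the one from the question, are made up for illustration.

```python
import re

# The single-group pattern from this answer; NAME_RE is a hypothetical name.
NAME_RE = re.compile(
    r"^\s*(?P<NAME>\w\d{7}|R1_(?:\d{6}_){2})(01f\.foo|\.(?:bar|goo|moo|roo))$"
)

def extract_name(filename):
    """Return the NAME group, or None if the file name does not match."""
    match = NAME_RE.match(filename)
    return match.group("NAME") if match else None

print(extract_name("  B048661501f.foo"))      # B0486615
print(extract_name("R1_123456_654321_.bar"))  # R1_123456_654321_
print(extract_name("B0486615.txt"))           # None
```

Because both alternatives live inside one named group, there is no need for a second name at all.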

Alternatively, you may have meant:

 ^\s*(?P<NAME>\w\d{7}01f|R1_(?:\d{6}_){2})\.(?:foo|bar|goo|moo|roo)$ 

Which means:

  • ignore whitespace at the beginning of a line
  • match either of the following in a group named "NAME":
    • a letter followed by 7 digits and "01f", or
    • "R1_" and two times (6 digits + "_" )
  • a dot
  • "foo" , "bar" , "goo" , "moo" or "roo"
  • the end of the line

Reusing the same name does make sense in your case, contrary to what Tamalak's answer says.

Your regular expression compiles for me with Python 2.7 as well as with re2. Perhaps this problem has since been resolved.


This answer is about how to get the above regular expression to work in Python 3.

The re2 module proposed by Max does not work in Python 3 (it fails with NameError: basestring). An alternative is the regex module.

The regex module is simply an extended version of re with additional advanced features. This module also allows you to have the same group names in the regular expression.

You can install it through:

 sudo pip install regex 

If you already use re or re2 in your program, just import the regex module under the same name:

 import regex as re 
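A rough sketch of what this could look like with the same group name in both alternatives. It assumes the third-party regex module is installed; the simplified `PATTERN` and the helper `extract_name` are illustrative, and the behaviour relied on (groups with the same name share one capture) is as I understand the regex module's documentation.

```python
try:
    import regex  # third-party package: pip install regex
except ImportError:
    regex = None  # lets this sketch load even where regex is not installed

# Simplified form of the question's pattern with NAME used twice (assumption).
# The standard-library re module would reject this pattern.
PATTERN = (
    r"^\s*(?:(?P<NAME>\w\d{7})|(?P<NAME>R1_\d{6}_\d{6}_))"
    r"(?:01f\.foo|\.bar|\.goo\.moo\.roo)$"
)

def extract_name(text):
    """Return the NAME capture using the regex module, or None on no match."""
    if regex is None:
        raise RuntimeError("the third-party 'regex' module is not installed")
    match = regex.match(PATTERN, text)
    return match.group("NAME") if match else None
```

With this in place, `extract_name("B048661501f.foo")` is expected to give the letter-plus-seven-digits part, and an `R1_..._` name is read from the same group without a second name.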