Ruby Regex vs Python Regex

Are there any real differences between a Ruby regex and a Python regex?

I could not find any differences between the two, but maybe I missed something.

+7
5 answers

The last time I checked, they differed significantly in their Unicode support. Ruby, in 1.9 at least, has very limited Unicode support. I believe one or two Unicode properties may be supported by now; the general categories and possibly the scripts are the two I am thinking of.

Python has both less and more Unicode support at the same time. Python seems to meet the requirements of RL1.2a "Compatibility Properties" from UTS #18, Unicode Regular Expressions.

That said, Matthew Barnett (mrab) has developed a rather good Python library that finally adds a couple of Unicode properties to Python regular expressions. It supports the two most important ones: the general categories and the script properties. It has some other intriguing features as well, and it deserves good publicity.

I don't think that either Ruby or Python supports Unicode all that well, although more and more gets done every day. In particular, neither of them meets even the Level 1 requirements for Unicode regular expressions cited above. For example, RL1.2 requires support for at least 11 properties: General_Category, Script, Alphabetic, Uppercase, Lowercase, White_Space, Noncharacter_Code_Point, Default_Ignorable_Code_Point, ANY, ASCII, and ASSIGNED.

I think that Python only allows you to get to some of them, and only in a roundabout way. Of course, there are many, many other properties besides these 11.
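As a rough illustration of what the "roundabout way" can look like in Python: the built-in re module has no \p{...} syntax, but the standard-library unicodedata module exposes each character's General_Category, so characters can be filtered by hand. This is only a sketch; the helper name chars_in_category is made up for this example and is not part of any library.

```python
import unicodedata


def chars_in_category(text, prefix):
    """Return the characters of `text` whose General_Category starts with `prefix`.

    Hypothetical helper for illustration only. `prefix` is a category code
    such as "Lu" (uppercase letter), "Nd" (decimal digit), or "L" (any letter).
    """
    return [ch for ch in text if unicodedata.category(ch).startswith(prefix)]


# "Stra\u00dfe" is "Strasse" with a sharp s; "\u03a9" is GREEK CAPITAL LETTER OMEGA.
sample = "Stra\u00dfe \u03a9mega 42"

uppercase = chars_in_category(sample, "Lu")
digits = chars_in_category(sample, "Nd")

print(uppercase)
print(digits)
```

This only covers General_Category. Properties such as Script are not available from unicodedata, which is part of why the answer calls the support limited.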

When you are looking for Unicode support, there is of course more than just UTS #18 on regular expressions, although that is the one that matters most for this question, and neither Ruby nor Python is Level 1 compliant. Other very important aspects of Unicode include UAX #15, UAX #14, UTS #10, UAX #11, UAX #29 and, of course, the crucial UAX #44. I know that Python has libraries for at least a couple of them. I do not know whether they are standard.

But when it comes to supporting regular expressions, there are richer alternatives than just these two, you know. :)

+8

I like the Perl-inspired /pattern/ syntax that Ruby has for regular expressions. Python's re.compile("pattern") does not feel very elegant to me. The syntactic sugar in Ruby, and the fact that regular expressions live in a separate re module in Python, make me lean toward Ruby when it comes to regular expressions.
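For readers who have not used the Python side, here is a small sketch of what that looks like in practice. Python has no regex literal, so a pattern is an ordinary (usually raw) string handed to the re module, either compiled first or passed directly to a module-level function. The variable names are just for illustration.

```python
import re

text = "ruby vs python"

# Option 1: compile once, then call methods on the pattern object.
word = re.compile(r"\w+")
compiled_result = word.findall(text)

# Option 2: pass the pattern string to a module-level function each time.
direct_result = re.findall(r"\w+", text)

print(compiled_result)
print(direct_result)
```

Both calls return the same list of words. The compiled form mainly helps when the same pattern is reused many times.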

Apart from that, I don't see much of a difference from an everyday regex programming perspective. Both languages have fairly extensive and broadly similar regex support. There may be differences in performance (Python has traditionally performed better), and Python also has more Unicode support in its regular expressions.

+5

If the question is only about regular expressions: neither. Use Perl.

You should choose between these languages based on the other, non-regex problems you are trying to solve, and on the community support each language has in your problem domain.

If you really are choosing a language based on regular expression support, choose Perl...

+3

The regular expression libraries for Ruby and Python are developed by two completely independent teams. Even if they are now identical (and I'm not sure that they are), there is no guarantee that they will not diverge in the future.

The safest position is to assume that they are different now and that they will remain different in the future.

+1

The Ruby Regexp#match method is equivalent to Python's re.search(), not re.match(). Both re.search() and Regexp#match find the first match anywhere in the string, whereas re.match() only looks for a match at the beginning of the string.

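To get the equivalent of re.match() in Ruby, the regular expression must be anchored to the start of the string. Note that ^ in Ruby matches at the start of every line, so \A is the stricter anchor when the input may contain newlines.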

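To get the equivalent of Regexp#match in Python, the simplest option is to call re.search(). Alternatively, a pattern passed to re.match() can begin with .* to allow zero or more leading characters, but . does not match newlines unless re.DOTALL is set, and the reported match then includes the skipped prefix.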
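A short Python sketch of the difference, using a made-up two-line input. It also shows one caveat with the .* prefix: since . does not cross newlines by default, the prefixed pattern only behaves like a search over the whole string when re.DOTALL is passed.

```python
import re

text = "first line\nsecond line"
pattern = re.compile(r"second")

# re.search scans the whole string, like Ruby's Regexp#match.
found = pattern.search(text)

# re.match only tries at position 0, so it finds nothing here.
anchored = pattern.match(text)

# A ".*" prefix lets re.match succeed, but only with re.DOTALL when the
# target sits on a later line; the match then also covers the prefix.
prefixed_plain = re.match(r".*second", text)
prefixed_dotall = re.match(r".*second", text, re.DOTALL)

print(found.group(0), found.start())
print(anchored)
print(prefixed_plain)
print(repr(prefixed_dotall.group(0)))
```

In most cases calling re.search() directly is the cleaner choice, because the match object then covers only the text of interest.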

+1