Is there a common / standard subset of regular expressions?

Do the "control characters" used in regular expressions mean the same thing across the many different implementations of regular expression parsers (for example, regex in Ruby, Java, C#, sed, etc.)?

For example, in Ruby \D means "not a digit". Does it mean the same thing in Java, C# and sed? I suppose I am asking whether there is a "standard" for regular expressions that all regular expression parsers support.

If not, is there a general subset that should be learned and mastered first (with the parser-specific features learned afterwards, as they come up)?

java c# ruby regex
2 answers

See the basic syntax reference at regular-expressions.info.

The same site also has a comparison of the different "flavors."


There is a common core that is very simple. It corresponds to the regular expressions implemented in the original Unix tools such as ed, grep, sed and awk. It is worth learning because the other flavors are all supersets of it. †

    .         match any character
    [abc]     match a, b, or c
    [^abc]    match a character other than a, b, or c
    [a-c]     match the range from a to c
    ^         match the beginning of the line
    $         match the end of the line
    *         match zero or more of the preceding character
    \(...\)   group for use as a back-reference
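To see the core constructs in action, here is a minimal sketch in Python. Python's `re` module is only one of the superset flavors and is used here purely for illustration; the word list and the `matching` helper are made up for this example.

```python
import re

# Hypothetical sample data, chosen only to exercise each construct.
words = ["cat", "cot", "cut", "scat", "caat", "ct"]

def matching(pattern):
    """Return the words in which the pattern is found."""
    return [w for w in words if re.search(pattern, w)]

dot = matching(r"^c.t$")       # .      any one character
cls = matching(r"^c[ao]t$")    # [ao]   a or o
neg = matching(r"^c[^ao]t$")   # [^ao]  any character other than a or o
rng = matching(r"^c[a-o]t$")   # [a-o]  the range from a to o
star = matching(r"^ca*t$")     # a*     zero or more of the preceding a
end = matching(r"at$")         # $      end of the line

print(dot, cls, neg, rng, star, end, sep="\n")
```

The same patterns should behave the same way in grep or sed, since they use only the constructs from the table.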

† I missed Posix expressions because no one uses them and they are not in a subset. The default parano is magic, with the exception of classic expressions.

