Regular expression headache

I want to check some C# source code for a scripting engine. I want to make sure that only members of the System.Math class can be referenced. I am trying to create a regular expression that matches a dot, followed by a capital letter, followed by any number of word characters ending at a word boundary, that is NOT preceded by System.Math.

I started with this:

    (?<!Math)\.[A-Z]+[\w]*

Which works great for:

    return Math.Max(466.89/83.449 * 5.5); // won't flag this
    return Xath.Max(466.89/83.449 * 5.5); // will flag this

It correctly matches .Max when it is not preceded by Math. However, when I try to extend the regex to include System, I cannot get it to work.
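
For reference, the behaviour described above can be reproduced with a small harness. This is only a sketch: it uses Python's re module as a stand-in for the .NET regex engine (an assumption; the pattern uses nothing engine-specific), and the helper name flagged is made up for illustration.

```python
import re

# Assumption: Python's re stands in for the .NET engine here.
# A dot, one or more capitals, then word characters; the negative
# lookbehind rejects a dot that is directly preceded by "Math".
PATTERN = re.compile(r"(?<!Math)\.[A-Z]+[\w]*")

def flagged(line):
    """Hypothetical helper: return every substring the pattern matches."""
    return [m.group(0) for m in PATTERN.finditer(line)]

print(flagged("return Math.Max(466.89/83.449 * 5.5);"))  # []
print(flagged("return Xath.Max(466.89/83.449 * 5.5);"))  # ['.Max']
```

The decimal points in the numbers are never flagged because the character after the dot must be a capital letter.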

I tried these regex permutations and more:

    ((?<!System\.Math)\.[A-Z]+[\w]*)
    ((?<!(?<!System)\.Math)\.[A-Z]+[\w]*)
    ((?<!System)\.(?<!Math)\.[A-Z]+[\w]*)
    ((?<!System)|(?<!Math)\.[A-Z]+[\w]*)
    ((?<!System\.Math)|(?<!Math)\.[A-Z]+[\w]*)

Using these statements:

    return System.Math.Max(466.89/83.449 * 5.5);
    return System.Xath.Max(466.89/83.449 * 5.5);
    return Xystem.Math.Max(466.89/83.449 * 5.5);

I tried everything I could think of, but it either ALWAYS matches the second element (.Math or .Xath above), or doesn't match anything at all.

If someone would take pity on me and point out what I am doing wrong, I would be very grateful.
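
One way to see why the first permutation always flags the second element: a lookbehind only inspects the text immediately before the position where the rest of the pattern starts. At the dot in front of Math, that text ends in "System", not "System.Math", so the negative lookbehind succeeds and .Math is matched. The sketch below shows this, assuming Python's re module behaves like the .NET engine for this pattern; flagged is a hypothetical helper.

```python
import re

# Assumption: Python's re stands in for the .NET engine here.
# First permutation from the list above (outer capture group dropped).
PATTERN = re.compile(r"(?<!System\.Math)\.[A-Z]+[\w]*")

def flagged(line):
    """Hypothetical helper: return every substring the pattern matches."""
    return [m.group(0) for m in PATTERN.finditer(line)]

# The lookbehind only blocks ".Max"; ".Math" itself is still matched.
print(flagged("return System.Math.Max(466.89/83.449 * 5.5);"))  # ['.Math']
print(flagged("return System.Xath.Max(466.89/83.449 * 5.5);"))  # ['.Xath', '.Max']
print(flagged("return Xystem.Math.Max(466.89/83.449 * 5.5);"))  # ['.Math', '.Max']
```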

Thanks in advance, Welton

2 answers

The trick is to never start matching anywhere but at the beginning of the qualified member name. Then just use a lookahead to check whether what you are looking at starts with System.Math. . Try this regex:

    (?<![\w.])(?!(?:System\.)?Math\.)(?:[A-Z]\w*\.)+[A-Z]\w*\b

The lookbehind ensures that the match does not start in the middle of a word ( \w ) or in the middle of a qualified member name ( . ). Now, if the lookahead fails, the regex cannot simply skip ahead to the start of the next component (for example, Math. in System.Math. ) and try again. It is all or nothing.

However, this would match Math.Max if it is not preceded by System. . Do you really want that, or was it just an intermediate step in developing a regex for the fully qualified name?

EDIT: I went ahead and made the System. part optional.
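
A quick check of this regex against the statements from the question might look like the following. It is a sketch under the assumption that Python's re module matches the .NET engine for these constructs (fixed-width lookbehind, lookahead, non-capturing groups); the helper flagged is hypothetical.

```python
import re

# Assumption: Python's re stands in for the .NET engine here.
PATTERN = re.compile(
    r"(?<![\w.])"               # only start at the beginning of a qualified name
    r"(?!(?:System\.)?Math\.)"  # ...that is not Math. or System.Math.
    r"(?:[A-Z]\w*\.)+[A-Z]\w*\b"
)

def flagged(line):
    """Hypothetical helper: return every qualified name the pattern matches."""
    return [m.group(0) for m in PATTERN.finditer(line)]

print(flagged("return System.Math.Max(466.89/83.449 * 5.5);"))  # []
print(flagged("return System.Xath.Max(466.89/83.449 * 5.5);"))  # ['System.Xath.Max']
print(flagged("return Xystem.Math.Max(466.89/83.449 * 5.5);"))  # ['Xystem.Math.Max']
print(flagged("return Math.Max(466.89/83.449 * 5.5);"))         # []
print(flagged("return Xath.Max(466.89/83.449 * 5.5);"))         # ['Xath.Max']
```

Because the match can only begin at the start of the whole name, the flagged text is the full qualified name rather than a single component.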


If you are just looking for what you specified in the example, this regular expression will do it.

    ^[\w\s]*?[A-Z]\w+\.[A-Z]\w+\.(?<!System\.Math\.)

It matches all calls other than System.Math.XXX, as long as: a) there are two dots ( . ) in the call, and b) the call is on a single line.

    return System.Math.Max(466.89/83.449 * 5.5); // no match
    return System.Xath.Max(466.89/83.449 * 5.5); // match
    return Xystem.Math.Max(466.89/83.449 * 5.5); // match
    System.Math.Max(466.89/83.449 * 5.5);        // no match
    System.Xath.Max(466.89/83.449 * 5.5);        // match
    Xystem.Math.Max(466.89/83.449 * 5.5);        // match
    return System.Math.Max(466.89/83.449 * 5.5); // no match
    return System.Xath.Max(466.89/83.449 * 5.5); // match
    return Xystem.Math.Max(466.89/83.449 * 5.5); // match
    Math.Max(466.89/83.449 * 5.5);               // no match - only one '.'
    System.Max.Math(466.89/83.449 * 5.5);        // match
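
The table above can be checked line by line with a short script. This is a sketch, assuming Python's re module as a stand-in for the .NET engine and that each statement is tested as its own line; is_flagged is a hypothetical helper.

```python
import re

# Assumption: Python's re stands in for the .NET engine here.
# Anchored at the start of the line: optional leading words/whitespace,
# two dotted components, then a lookbehind that rejects "System.Math.".
PATTERN = re.compile(r"^[\w\s]*?[A-Z]\w+\.[A-Z]\w+\.(?<!System\.Math\.)")

def is_flagged(line):
    """Hypothetical helper: True if the line contains a disallowed call."""
    return PATTERN.search(line) is not None

print(is_flagged("return System.Math.Max(466.89/83.449 * 5.5);"))      # False
print(is_flagged("return System.Xath.Max(466.89/83.449 * 5.5);"))      # True
print(is_flagged("    return Xystem.Math.Max(466.89/83.449 * 5.5);"))  # True
print(is_flagged("System.Math.Max(466.89/83.449 * 5.5);"))             # False
print(is_flagged("Math.Max(466.89/83.449 * 5.5);"))                    # False
print(is_flagged("System.Max.Math(466.89/83.449 * 5.5);"))             # True
```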

I agree with the comments: any regular expression is rather fragile here, and should only be treated as a rough, text-level check. You need a parser if you want it to be bulletproof.
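
To make the fragility concrete, here is a sketch that runs the regex from the other answer on two C# lines invented for this illustration (they do not come from the question). It assumes Python's re module as a stand-in for the .NET engine; flagged is a hypothetical helper.

```python
import re

# Assumption: Python's re stands in for the .NET engine here.
PATTERN = re.compile(r"(?<![\w.])(?!(?:System\.)?Math\.)(?:[A-Z]\w*\.)+[A-Z]\w*\b")

def flagged(line):
    """Hypothetical helper: return every qualified name the pattern matches."""
    return [m.group(0) for m in PATTERN.finditer(line)]

# False positive: the dotted name sits inside a string literal.
print(flagged('string s = "File.Delete";'))                    # ['File.Delete']

# Missed access: the chain starts with a lowercase keyword, so no
# component is ever at the start of a capitalised qualified name.
print(flagged("var t = typeof(object).Assembly.GetTypes();"))  # []
```

A regex sees only characters, so it cannot tell a string literal or comment from real code, which is the gap a parser would close.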
