Regex named capture groups in Delphi XE

I created a matching pattern in RegexBuddy that behaves exactly as I expect. But I cannot get it to work in Delphi XE, at least not with the latest built-in TRegEx or TPerlRegEx.

There are 6 capture groups in my real-world code, but I can illustrate the problem with a simpler example. This code shows "3" in the first dialog box, and then raises an exception (-7, index out of bounds) when the second dialog is executed.

    var
      Regex: TRegEx;
      M: TMatch;
    begin
      Regex := TRegEx.Create('(?P<time>\d{1,2}:\d{1,2})(?P<judge>.{1,3})');
      M := Regex.Match('00:00 X1 90 55KENNY BENNY');
      ShowMessage(IntToStr(M.Groups.Count));
      ShowMessage(M.Groups['time'].Value);
    end;

But if I use only one capture group:

    Regex := TRegEx.Create('(?P<time>\d{1,2}:\d{1,2})');

"2" is displayed in the first dialog box, and the time "00:00" is displayed in the second dialog box, as expected.

It would be rather limiting if only one named capture group were allowed, but that is not the case. If I change the name of the first capture group, for example to "atime":

    var
      Regex: TRegEx;
      M: TMatch;
    begin
      Regex := TRegEx.Create('(?P<atime>\d{1,2}:\d{1,2})(?P<judge>.{1,3})');
      M := Regex.Match('00:00 X1 90 55KENNY BENNY');
      ShowMessage(IntToStr(M.Groups.Count));
      ShowMessage(M.Groups['atime'].Value);
    end;

I get "3" and "00:00" as expected. Are there any reserved words that I cannot use? I don't think so, because in my real example I tried completely random names. I just can't figure out what causes this behavior.

+8
regex delphi delphi-xe regexbuddy
2 answers

When pcre_get_stringnumber does not find the name, it returns PCRE_ERROR_NOSUBSTRING.

PCRE_ERROR_NOSUBSTRING is defined in RegularExpressionsAPI as PCRE_ERROR_NOSUBSTRING = -7.

Some tests show that pcre_get_stringnumber returns PCRE_ERROR_NOSUBSTRING for every name whose first letter is in the range k to z, and that this range depends on the first letter of judge. Changing judge to something else changes the range.

As I see it, there are two bugs involved here: one in pcre_get_stringnumber, and one in TGroupCollection.GetItem, which should raise a more accurate exception instead of SRegExIndexOutOfBounds.
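
Since the groups themselves are captured (Groups.Count is 3 in the question) and only the lookup by name fails, one possible workaround is to read the groups by numeric index rather than by name. The following is a minimal sketch, assuming that the integer form of M.Groups[...] does not go through pcre_get_stringnumber; the program name GroupByIndexSketch is made up for illustration.

```delphi
program GroupByIndexSketch; // hypothetical name, for illustration only

{$APPTYPE CONSOLE}

uses
  SysUtils, RegularExpressions;

var
  Regex: TRegEx;
  M: TMatch;
begin
  Regex := TRegEx.Create('(?P<time>\d{1,2}:\d{1,2})(?P<judge>.{1,3})');
  M := Regex.Match('00:00 X1 90 55KENNY BENNY');
  if M.Success then
  begin
    // Group 0 is the whole match; groups 1 and 2 follow the order in which
    // the capture groups appear in the pattern.
    // Assumption: integer indexing avoids the failing name lookup.
    WriteLn(M.Groups[1].Value); // the group named "time"
    WriteLn(M.Groups[2].Value); // the group named "judge"
  end;
  ReadLn;
end.
```

The drawback is that the indexes have to be kept in sync with the pattern by hand, so this is only a stopgap for as long as the lookup by name is unreliable.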

+7

The bug seems to be either in the RegularExpressionsAPI unit, which wraps the PCRE library, or in the PCRE OBJ files that it links. If I run this code:

    program Project1;

    {$APPTYPE CONSOLE}

    uses
      SysUtils, RegularExpressionsAPI;

    var
      myregexp: Pointer;
      Error: PAnsiChar;
      ErrorOffset: Integer;
      Offsets: array[0..300] of Integer;
      OffsetCount, Group: Integer;
    begin
      try
        myregexp := pcre_compile('(?P<time>\d{1,2}:\d{1,2})(?P<judge>.{1,3})', 0,
          @Error, @ErrorOffset, nil);
        if (myregexp <> nil) then
        begin
          OffsetCount := pcre_exec(myregexp, nil, '00:00 X1 90 55KENNY BENNY',
            Length('00:00 X1 90 55KENNY BENNY'), 0, 0, @Offsets[0], High(Offsets));
          if (OffsetCount > 0) then
          begin
            Group := pcre_get_stringnumber(myregexp, 'time');
            WriteLn(Group);
            Group := pcre_get_stringnumber(myregexp, 'judge');
            WriteLn(Group);
          end;
        end;
      except
        on E: Exception do
          Writeln(E.ClassName, ': ', E.Message);
      end;
      ReadLn;
    end.

It prints -7 and 2 instead of 1 and 2.

If I remove RegularExpressionsAPI from the uses clause and add the pcre unit from my TPerlRegEx component, it correctly prints 1 and 2.

RegularExpressionsAPI in Delphi XE is based on my pcre unit, and the RegularExpressionsCore unit is based on my PerlRegEx unit. Embarcadero made some changes to both units. They also compiled their own OBJ files from the PCRE library, which are linked by RegularExpressionsAPI.

I reported this bug as QC 92497.

I also created a separate report, QC 92498, to request that TGroupCollection.GetItem raise a more sensible exception when a named group that does not exist is requested. (That code is in the RegularExpressions unit, which is based on code written by Vincent Parrett, not me.)

+5