SearchBuf soWholeWord unexpected output

When testing StrUtils.SearchBuf with the [soWholeWord,soDown] options, some unexpected results occurred.

    program Project1;

    uses
      SysUtils, StrUtils;

    function WordFound(aString, searchString: String): Boolean;
    begin
      Result := SearchBuf(PChar(aString), Length(aString), 0, 0, searchString,
        [soWholeWord, soDown]) <> nil;
    end;

    procedure Test(aString, searchString: String);
    begin
      WriteLn('"', searchString, '" in "', aString, '"', #9, ' : ',
        WordFound(aString, searchString));
    end;

    begin
      Test('Delphi', 'Delphi');   // True
      Test('Delphi ', 'Delphi');  // True
      Test(' Delphi', 'Delphi');  // False
      Test(' Delphi ', 'Delphi'); // False
      ReadLn;
    end.

Why are ' Delphi' and ' Delphi ' not considered to contain a whole word?

What about a reverse search?

    function WordFoundRev(aString, searchString: String): Boolean;
    begin
      Result := SearchBuf(PChar(aString), Length(aString), Length(aString) - 1, 0,
        searchString, [soWholeWord]) <> nil;
    end;

    procedure TestRev(aString, searchString: String);
    begin
      WriteLn('"', searchString, '" in "', aString, '"', #9, ' : ',
        WordFoundRev(aString, searchString));
    end;

    begin
      TestRev('Delphi', 'Delphi');   // False
      TestRev('Delphi ', 'Delphi');  // True
      TestRev(' Delphi', 'Delphi');  // False
      TestRev(' Delphi ', 'Delphi'); // True
      ReadLn;
    end.

These results make no sense to me. It looks as if the function is simply buggy.

Same results in XE7, XE6 and XE.


Update

QC127635 StrUtils.SearchBuf does not work with the [soWholeWord] option

1 answer

This looks like a bug. Here is the code that performs the search:

    while SearchCount > 0 do
    begin
      if (soWholeWord in Options) and (Result <> @Buf[SelStart]) then
        if not FindNextWordStart(Result) then
          Break;
      I := 0;
      while (CharMap[(Result[I])] = (SearchString[I+1])) do
      begin
        Inc(I);
        if I >= Length(SearchString) then
        begin
          if (not (soWholeWord in Options)) or
             (SearchCount = 0) or
             ((Byte(Result[I])) in WordDelimiters) then
            Exit;
          Break;
        end;
      end;
      Inc(Result, Direction);
      Dec(SearchCount);
    end;

Each time around the while loop, we check whether soWholeWord is in the options and, if so, advance to the start of the next word. But we only advance if

    Result <> @Buf[SelStart]

Here, Result is the current pointer into the buffer, the candidate for a match. So this test checks whether we are at the start of the string being searched.

The consequence of this test is that, if the searched string begins with non-alphanumeric text, we never advance past that text to the start of the first word.

Now, you might decide to remove the test for

    Result <> @Buf[SelStart]

But if you do, you will find that a word located right at the start of the string no longer matches. So you just fail in a different way. The right way to handle this is to make sure FindNextWordStart does not advance when we are at the start of the string and the text there is alphanumeric.

My guess is that the original author wrote the code like this:

    if (soWholeWord in Options) then
      if not FindNextWordStart(Result) then
        Break;

Then they discovered that words at the start of the string would not match, and changed the code to:

    if (soWholeWord in Options) and (Result <> @Buf[SelStart]) then
      if not FindNextWordStart(Result) then
        Break;

And nobody tested what happens when the string starts with non-alphanumeric text.

Something like this seems to get the job done:

    if (soWholeWord in Options) then
      if (Result <> @Buf[SelStart]) or not Result^.IsLetterOrDigit then
        if not FindNextWordStart(Result) then
          Break;
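
Since the change above has to be made inside the RTL source, a caller who cannot patch StrUtils may prefer to sidestep SearchBuf for this case. The following is only a minimal sketch of that idea, not the library's implementation: ContainsWholeWord is a hypothetical helper, it assumes a Delphi version that has the Char record helper (the same IsLetterOrDigit used in the snippet above), it treats "word boundary" as "not a letter or digit" rather than using SearchBuf's WordDelimiters set, and unlike SearchBuf without soMatchCase it is case-sensitive.

```pascal
program WholeWordSketch;

{$APPTYPE CONSOLE}

uses
  SysUtils, StrUtils, Character;

// Hypothetical helper (not part of StrUtils): returns True when searchString
// occurs in aString with no letter or digit immediately before or after it.
function ContainsWholeWord(const aString, searchString: string): Boolean;
var
  Start, Next: Integer;
begin
  Result := False;
  if searchString = '' then
    Exit;
  Start := Pos(searchString, aString);
  while Start > 0 do
  begin
    // Index of the first character after the candidate match.
    Next := Start + Length(searchString);
    // Accept the match only if both neighbours are absent or non-alphanumeric.
    // Short-circuit evaluation keeps the index accesses in range.
    if ((Start = 1) or not aString[Start - 1].IsLetterOrDigit) and
       ((Next > Length(aString)) or not aString[Next].IsLetterOrDigit) then
      Exit(True);
    // Otherwise continue with the next occurrence.
    Start := PosEx(searchString, aString, Start + 1);
  end;
end;

begin
  WriteLn(ContainsWholeWord('Delphi', 'Delphi'));   // TRUE
  WriteLn(ContainsWholeWord(' Delphi ', 'Delphi')); // TRUE
  WriteLn(ContainsWholeWord('Delphis', 'Delphi'));  // FALSE
  ReadLn;
end.
```

With this sketch, all four inputs from the question (with and without leading or trailing spaces) are reported as containing the whole word, which is the behaviour the question expected from soWholeWord.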