Check if the string contains the word, but only at a specific position?

How to check if a string contains a substring, but only at a specific position?

Example line:

What is your favorite color? my [ favorite ] color is blue

If I wanted to check if a string contains a specific word, I usually do this:

    var
      S: string;
    begin
      S := 'What is your favorite color? my [favorite] color is blue';
      if (Pos('favorite', S) > 0) then
      begin
        //
      end;
    end;

I need to determine whether the word favorite exists in the string, but ignore it when it appears inside the [] characters. The code above does not make that distinction.

So, if we put the code in a boolean function, some sample results will look like this:

TRUE: What is your favorite color? my [ my favorite ] color is blue

TRUE: What is your favorite color? my [ blah blah ] color is blue

FALSE: What is your color blah blah ? my [ favorite ] color is blue

The first two examples are true because the word favorite appears outside the [] characters, regardless of whether it also appears inside them.

The third example is false: the word favorite is present, but only inside the [] characters, and we only care whether it exists outside them.

So I need a function that determines whether a word (favorite in this example) appears in the string, ignoring any occurrence enclosed in [] characters.

3 answers

I like Sertac's idea of deleting the bracketed parts of the string and searching for the word after that. Here is sample code, extended with whole-word and case-sensitive search options:

    // SearchBuf, PosEx and TStringSearchOptions come from the StrUtils unit.
    function ContainsWord(const AText, AWord: string;
      AWholeWord: Boolean = True; ACaseSensitive: Boolean = False): Boolean;
    var
      S: string;
      BracketEnd: Integer;
      BracketStart: Integer;
      SearchOptions: TStringSearchOptions;
    begin
      S := AText;
      BracketStart := Pos('[', S);
      while BracketStart > 0 do
      begin
        // look for the closing bracket only after the opening one, so that a
        // stray ']' earlier in the text cannot stall the loop
        BracketEnd := PosEx(']', S, BracketStart + 1);
        if BracketEnd = 0 then
          Break;
        Delete(S, BracketStart, BracketEnd - BracketStart + 1);
        BracketStart := Pos('[', S);
      end;
      SearchOptions := [soDown];
      if AWholeWord then
        Include(SearchOptions, soWholeWord);
      if ACaseSensitive then
        Include(SearchOptions, soMatchCase);
      Result := Assigned(SearchBuf(PChar(S), StrLen(PChar(S)), 0, 0, AWord, SearchOptions));
    end;

Here is an optimized version of the function that iterates over the text with a char pointer instead of manipulating the string. Compared to the previous version, it handles the case of a missing closing bracket, for example My [favorite color is. Such a string evaluates to True because the bracket is never closed.

The principle is to walk through the string char by char. When you find an opening bracket, check whether it has a closing pair. If it does, search for the word in the substring between the saved position and the opening bracket. If the word is found, exit the function; if not, move the saved position to the closing bracket and continue. If the opening bracket has no closing pair, search for the word from the saved position to the end of the text and exit the function.

For a commented version of this code, follow this link.

    function ContainsWord(const AText, AWord: string;
      AWholeWord: Boolean = True; ACaseSensitive: Boolean = False): Boolean;
    var
      CurrChr: PChar;
      TokenChr: PChar;
      TokenLen: Integer;
      SubstrChr: PChar;
      SubstrLen: Integer;
      SearchOptions: TStringSearchOptions;
    begin
      Result := False;
      if (Length(AText) = 0) or (Length(AWord) = 0) then
        Exit;
      SearchOptions := [soDown];
      if AWholeWord then
        Include(SearchOptions, soWholeWord);
      if ACaseSensitive then
        Include(SearchOptions, soMatchCase);
      CurrChr := PChar(AText);
      SubstrChr := CurrChr;
      SubstrLen := 0;
      while CurrChr^ <> #0 do
      begin
        if CurrChr^ = '[' then
        begin
          // look ahead for the closing bracket
          TokenChr := CurrChr;
          TokenLen := 0;
          while (TokenChr^ <> #0) and (TokenChr^ <> ']') do
          begin
            Inc(TokenChr);
            Inc(TokenLen);
          end;
          // no closing bracket: the rest of the text is searchable as well
          if TokenChr^ = #0 then
            SubstrLen := SubstrLen + TokenLen;
          Result := Assigned(SearchBuf(SubstrChr, SubstrLen, 0, 0, AWord, SearchOptions));
          if Result or (TokenChr^ = #0) then
            Exit;
          // resume the scan at the closing bracket
          CurrChr := TokenChr;
          SubstrChr := CurrChr;
          SubstrLen := 0;
        end
        else
        begin
          Inc(CurrChr);
          Inc(SubstrLen);
        end;
      end;
      Result := Assigned(SearchBuf(SubstrChr, SubstrLen, 0, 0, AWord, SearchOptions));
    end;
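The three sample lines from the question make a handy check for any of these functions. The sketch below is a minimal, self-contained console program that runs them through a plain bracket-depth counter. The name ContainsOutsideBrackets is hypothetical, and the helper is not the SearchBuf approach above: it does a case-sensitive substring match with no whole-word option, and it treats everything after an unclosed [ as bracketed.

```pascal
program OutsideBracketsDemo;

{$APPTYPE CONSOLE}

// Hypothetical helper (not from the answer): True when AWord starts at a
// position in AText that is not between '[' and ']'.
// Assumes AWord itself contains no brackets.
function ContainsOutsideBrackets(const AText, AWord: string): Boolean;
var
  I, Depth: Integer;
begin
  Result := False;
  if AWord = '' then
    Exit;
  Depth := 0;
  for I := 1 to Length(AText) do
  begin
    if AText[I] = '[' then
      Inc(Depth)
    else if (AText[I] = ']') and (Depth > 0) then
      Dec(Depth)
    else if (Depth = 0) and (Copy(AText, I, Length(AWord)) = AWord) then
    begin
      Result := True;
      Exit;
    end;
  end;
end;

begin
  // The three sample lines from the question.
  Writeln(ContainsOutsideBrackets(
    'What is your favorite color? my [ my favorite ] color is blue', 'favorite')); // TRUE
  Writeln(ContainsOutsideBrackets(
    'What is your favorite color? my [ blah blah ] color is blue', 'favorite'));   // TRUE
  Writeln(ContainsOutsideBrackets(
    'What is your color blah blah ? my [ favorite ] color is blue', 'favorite'));  // FALSE
end.
```

Swapping in either ContainsWord version for the helper should give the same three results on these lines.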

Regular expressions have a feature called lookaround that you can use. In your case a negative lookbehind solves the problem: you want "favorite" only if it is not preceded by an opening bracket that is still unclosed. It might look like this:

 (?<!\[[^\[\]]*)favorite 

Step by step: (?<! opens the negative lookbehind. Inside it we look for \[ optionally followed by zero or more characters that neither open nor close a bracket: [^\[\]]* . Then ) closes the negative lookbehind, and favorite follows right after.
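One caveat if you try this in Delphi: TRegEx is built on PCRE, which, as far as I know, only accepts fixed-length lookbehind, so the pattern above is likely to be rejected there (it suits engines such as .NET that allow variable-length lookbehind). A rough alternative, swapping the lookbehind for a two-step approach, is to blank out the bracketed groups first and then do a whole-word match. The function name ContainsWordRegex is hypothetical, and the sketch assumes brackets are not nested.

```pascal
program RegexOutsideBrackets;

{$APPTYPE CONSOLE}

uses
  System.RegularExpressions;

// Hypothetical helper: strips "[...]" groups, then searches what is left.
function ContainsWordRegex(const AText, AWord: string): Boolean;
var
  Stripped: string;
begin
  // Replace every "[...]" group without nested brackets by a space, so the
  // text on both sides of the group is not glued together.
  Stripped := TRegEx.Replace(AText, '\[[^\[\]]*\]', ' ');
  // Whole-word, case-insensitive search in the remaining text.
  Result := TRegEx.IsMatch(Stripped, '\b' + TRegEx.Escape(AWord) + '\b',
    [roIgnoreCase]);
end;

begin
  Writeln(ContainsWordRegex(
    'What is your favorite color? my [ my favorite ] color is blue', 'favorite')); // TRUE
  Writeln(ContainsWordRegex(
    'What is your color blah blah ? my [ favorite ] color is blue', 'favorite'));  // FALSE
end.
```

An unclosed [ is left in place by the replace step, so text after it is still searched, which matches the behaviour of the pointer-based function in the first answer.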


I think you can rephrase your problem as "find an occurrence of the string that is not surrounded by square brackets." If this describes your problem, you can go ahead and use a simple regular expression like [^\[]favorite[^\]] .
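Before relying on that pattern, it is worth running it against a few inputs, because it only inspects the single character on each side of the word. A small check, assuming Delphi's TRegEx from System.RegularExpressions:

```pascal
program AdjacentBracketCheck;

{$APPTYPE CONSOLE}

uses
  System.RegularExpressions;

const
  // The pattern from the answer: one non-'[' character, the word,
  // then one non-']' character.
  Pattern = '[^\[]favorite[^\]]';

begin
  // Brackets touch the word, so it is correctly skipped.
  Writeln(TRegEx.IsMatch('my [favorite] color', Pattern));   // FALSE
  // Spaces inside the brackets: the neighbours are spaces, so it matches.
  Writeln(TRegEx.IsMatch('my [ favorite ] color', Pattern)); // TRUE
  // Word at the very start: there is no preceding character to consume.
  Writeln(TRegEx.IsMatch('favorite color', Pattern));        // FALSE
end.
```

So the pattern appears to fit only when the brackets sit directly against the word and the word is not at the start or end of the text. The question's samples, written as [ favorite ], do not meet that condition.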


Source: https://habr.com/ru/post/925165/

