A regular expression that excludes certain characters

Question

In this tutorial, I don't understand why [^b] is wrong. I understand that [^bog] is correct.

As I understand it, [^b] should match any string that does not contain a b character, and should not match any string that does contain one.

Is there something wrong in my understanding?

+12

regex

Saidbakr Apr 28 '14 at 22:10

4 answers

You are essentially right, but [^b] will still match the o and the g in bog. That counts as a successful match, even though it does not cover the entire string. [^bog] will match only the h in hog and the d in dog, and nothing in bog, which means that it does not match bog.

I think this will make more sense if you look at ^[^b]+$. This matches one or more non-b characters, anchored to the beginning (^) and end ($) of the string. Comparing this with your original expression [^b] or [^bog], you can see the difference. I suggest using a regex GUI tester (the one linked earlier is my favorite), which will really help illustrate the logic.
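To make the anchored versus unanchored difference concrete, here is a minimal sketch using Python's re module. Python and the three-word list are my own assumptions for illustration; the tutorial itself is not tied to any language.

```python
import re

words = ["hog", "dog", "bog"]

# Unanchored: succeeds as soon as ANY one character in the word is not "b".
unanchored = {w: bool(re.search(r"[^b]", w)) for w in words}

# Anchored: every character from start (^) to end ($) must be a non-"b".
anchored = {w: bool(re.search(r"^[^b]+$", w)) for w in words}

print(unanchored)  # {'hog': True, 'dog': True, 'bog': True}
print(anchored)    # {'hog': True, 'dog': True, 'bog': False}
```

The unanchored pattern accepts bog because the o alone is enough for a match; only the anchored pattern rejects it.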

+4

Sam Apr 28 '14 at 22:16

[^b] matches exactly one character that is not a "b".
[^b]+ matches one or more consecutive characters that are not "b".
[^b]* matches zero or more consecutive characters that are not "b".
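A quick way to check the three forms above is to run them against a small sample string. This is only a sketch in Python's re module; the sample string "aabcc" is made up for the example.

```python
import re

text = "aabcc"

# [^b] consumes exactly one character per match.
single = re.findall(r"[^b]", text)

# [^b]+ consumes a whole run of non-"b" characters per match.
runs = re.findall(r"[^b]+", text)

# [^b]* may match zero characters, so it accepts the empty string;
# [^b]+ needs at least one character, so it does not.
star_on_empty = re.fullmatch(r"[^b]*", "") is not None
plus_on_empty = re.fullmatch(r"[^b]+", "") is not None

print(single)         # ['a', 'a', 'c', 'c']
print(runs)           # ['aa', 'cc']
print(star_on_empty)  # True
print(plus_on_empty)  # False
```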

+3

dgp Apr 28 '14 at 22:16

^[^b] works.

A ^ outside the [] indicates "beginning of line".
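The two roles of ^ can be seen side by side in a short Python sketch (Python and the word list are assumptions for illustration). The first ^ anchors the match to the start, and the ^ inside the brackets negates the class.

```python
import re

words = ["hog", "dog", "bog"]

# Leading ^ : anchor at the start. [^b] : one character that is not "b".
# Together: "the first character is not b".
starts_with_non_b = [w for w in words if re.search(r"^[^b]", w)]

print(starts_with_non_b)  # ['hog', 'dog']

# Only the first character is checked, so a later "b" is still accepted.
print(bool(re.search(r"^[^b]", "abc")))  # True
```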

+2

Douglas denhartog Apr 28 '14 at 22:16

Pedro lobito · Accepted Answer · 2014-04-28T22:23:13+0000

For this particular lesson, the correct regex is:

 [^b]og

EXPLANATION:

 /[^b]og/
   [^b] matches a single character not present in the list below
     b the literal character b (case sensitive)
   og matches the characters og literally (case sensitive)
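As a rough check of the explanation above, the pattern can be tried against the three words in Python. The language and the word list are assumptions on my part, not something the lesson prescribes.

```python
import re

pattern = re.compile(r"[^b]og")

# One non-"b" character followed by the literal text "og".
results = {w: bool(pattern.search(w)) for w in ["hog", "dog", "bog"]}

print(results)  # {'hog': True, 'dog': True, 'bog': False}

# For hog the whole word is the match: "h" satisfies [^b], then "og".
print(pattern.search("hog").group())  # hog
```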

NOTES:

Negated Character Classes
Typing a caret after the opening square bracket negates the character class. As a result, the character class matches any character that is not in the character class. Unlike the dot, negated character classes also match (invisible) line break characters. If you do not want a negated character class to match line breaks, you need to include the line break characters in the class. [^0-9\r\n] matches any character that is not a digit or a line break.
It is important to remember that a negated character class must still match a character. q[^u] does not mean: "a q not followed by a u". It means: "a q followed by a character that is not a u". It does not match the q in the string Iraq. It does match the q and the space after the q in Iraq is a country. Indeed: the space becomes part of the overall match, because it is the "character that is not a u" that is matched by the negated character class in the above regular expression. If you want the regular expression to match the q, and only the q, in both strings, you need to use negative lookahead: q(?!u). But we will get to that later.
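Both notes can be checked with a short Python sketch. Python is my choice for illustration, and the dot behaviour shown is Python's default (without the DOTALL flag); other regex flavours may differ in details.

```python
import re

consuming = re.compile(r"q[^u]")   # q plus one character that is not u
lookahead = re.compile(r"q(?!u)")  # q, provided the next character is not u

# In "Iraq" there is no character after q, so [^u] has nothing to match.
print(consuming.search("Iraq"))                      # None
print(repr(consuming.search("Iraq is a country").group()))  # 'q '

# The lookahead consumes nothing, so only q is matched in both strings.
print(lookahead.search("Iraq").group())              # q
print(lookahead.search("Iraq is a country").group()) # q

# A negated class matches a line break unless the class lists it.
print(bool(re.search(r"[^0-9]", "\n")))      # True
print(bool(re.search(r"[^0-9\r\n]", "\n")))  # False
print(bool(re.search(r".", "\n")))           # False (dot skips \n by default)
```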
