How to understand the regular expression '\b'?

I am learning regular expressions, but I cannot understand "\b", which matches a word boundary. There are three situations in which it matches:

  • Before the first character in a string, if the first character is a word character.
  • After the last character in a string, if the last character is a word character.
  • Between two characters in a string, where one is a word character and the other is not a word character.

I cannot understand the third situation. For example:

 var reg = /end\bend/g;
 var string = 'wenkend,end,end,endend';
 alert(reg.test(string)); // false

"\b" requires a "\w" character on one side and a "\W" character on the other. In the string "end,end", the first "end" is followed by "," and the last "end" is preceded by ",", so I expected the rule to match. Why is the result false? Could you help? Thanks in advance!

============= dividing line ==============

With your help, I understand it now. In "end,end" the first "end" matches and is followed by a boundary, but the next character is ",", not "e", so /end\bend/ returns false.

In other words, /end\bend/g and similar regexes can never match anything. Thanks again.

+7
javascript string regex
3 answers

\b matches a position, not a character. So the regular expression /end\bend/g says that the string end must come first. It must then be followed by a word boundary; the position before , is one, so that matches, but the regex engine does not advance in the string and stays at , . The next character in your regular expression is e, and e does not match , . Therefore the regexp fails. Here is a step-by-step view of what happens:

 -----------------
 /end\bend/g, "end,end" (match)
  |            |
 -----------------
 /end\bend/g, "end,end" (both regex and string position moved - match)
     |            |
 ------------------
 /end\bend/g, "end,end" (the previous match was zero-length, so only regex position moved - not match)
       |          |
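The walkthrough above can be checked directly. This is a minimal sketch, assuming a runtime that supports String.prototype.matchAll (for example a recent Node.js); the helper name boundaryPositions is made up for illustration. It lists every index at which \b matches and shows that the pattern only succeeds once something consumes the comma between the two boundaries.

```javascript
// Hypothetical helper: every index in str at which \b matches.
// \b is zero-length, so each match is an empty string at some position.
const boundaryPositions = (str) =>
  [...str.matchAll(/\b/g)].map((m) => m.index);

// "end,end" has boundaries before "e", after "d", after ",", and at the end.
const positions = boundaryPositions('end,end');

// Nothing consumes the "," so the second "end" can never line up.
const failing = /end\bend/.test('end,end');
const failingOriginal = /end\bend/.test('wenkend,end,end,endend');

// Consuming the "," between the two boundaries makes the match succeed.
const withComma = /end\b,\bend/.test('end,end');

console.log(positions, failing, failingOriginal, withComma);
```

Because \b never consumes a character, a pattern such as /end\bend/ asks for a boundary between d and e, which are both word characters, so it cannot match any input.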
+4

With (most) regular expression engines, you can match and capture characters, and you can assert positions within a string.

For this example, let's take the string

 Rogue One: A Star Wars Story 

where you want to match the character o (which occurs twice, after R and after t ). Now you want to assert a position as well and match o only when it is followed by a lowercase r.
You write (with a positive lookahead):

 o(?=r) 
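To see the lookahead in action, here is a small sketch, assuming a runtime with String.prototype.matchAll; the variable names are only illustrative. It compares the indices matched by a plain o with those matched by o(?=r).

```javascript
const title = 'Rogue One: A Star Wars Story';

// Indices of every lowercase "o" (the "O" in "One" is uppercase and is skipped).
const allLowerO = [...title.matchAll(/o/g)].map((m) => m.index);

// Indices of every "o" that is immediately followed by "r".
const oBeforeR = [...title.matchAll(/o(?=r)/g)].map((m) => m.index);

// The lookahead only checks the "r"; the matched text is still just "o".
const matchedText = title.match(/o(?=r)/)[0];

console.log(allLowerO, oBeforeR, matchedText);
```

Only the o in Story survives the lookahead, and the r itself is not part of the match.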

Now generalize the idea of zero-width assertions: you want to find a position with a word character ahead of it, while making sure that there is no word character behind it. Therefore, you can write:

 (?=\w)(?<!\w) 

A positive lookahead and a negative lookbehind combined. We are almost there :) You only need the same thing the other way around (a word character behind and no word character ahead), which is:

 (?<=\w)(?!\w) 

If you combine these two, you eventually get (note the | in the middle):

 (?:(?=\w)(?<!\w)|(?<=\w)(?!\w)) 


This is equivalent to \b (and much longer). Returning to our string, the assertion holds at these positions:

 Rogue One: A Star Wars Story
 # right before R
 # right after e in Rogue
 # right before O of One
 # right after e of One (: is not a word character)
 # and so on...
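The claimed equivalence can be tested by comparing match positions. This is a sketch under the assumption that the engine supports lookbehind (added to JavaScript in ES2018, so older browsers may reject the pattern); positionsOf is a hypothetical helper, not a built-in.

```javascript
// Hypothetical helper: indices of all (zero-length) matches of re in str.
const positionsOf = (re, str) => [...str.matchAll(re)].map((m) => m.index);

const text = 'Rogue One: A Star Wars Story';

// The long form built from lookaheads and lookbehinds.
const longForm = /(?:(?=\w)(?<!\w)|(?<=\w)(?!\w))/g;

const viaShorthand = positionsOf(/\b/g, text);
const viaLongForm = positionsOf(longForm, text);

// Both lists should contain the same indices: the start and end of each word.
console.log(viaShorthand);
console.log(viaLongForm);
```

For this string both patterns report the same twelve positions, one at the start and one at the end of each of the six words.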

See a demo on regex101.com.


In conclusion, you can think of \b as a zero-width assertion that only asserts a position within the string.
+3

Try this expression:

 /(end)\b|\b(end)/g 
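As a rough check of what this expression does, the sketch below runs it against the string from the question, assuming a runtime with String.prototype.matchAll. Note that it answers a slightly different question: it finds every end that either finishes or starts at a word boundary, rather than requiring a boundary between two adjacent end substrings.

```javascript
const input = 'wenkend,end,end,endend';
const pattern = /(end)\b|\b(end)/g;

// For each match, record where it starts and which alternative matched:
// group 1 is "end followed by a boundary", group 2 is "boundary then end".
const hits = [...input.matchAll(pattern)].map((m) => ({
  index: m.index,
  alternative: m[1] !== undefined ? 'end\\b' : '\\bend',
}));

console.log(hits);
```

The first end of the trailing endend is the only hit that relies on the second alternative, because it is followed by e rather than by a boundary.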
0
