With (most) regular expression engines, you can match (consume) characters and assert positions within a string.
For this example, let's take the string
Rogue One: A Star Wars Story
where you want to match the character o (which occurs twice, after R and after t). Now you also want to assert a position and match o only when it is followed by a lowercase r.
You write (with a positive lookahead):
o(?=r)
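As a quick check of the lookahead, here is a minimal sketch using Python's `re` module (the choice of Python and the variable names are assumptions for illustration; the pattern itself works the same way in most engines). It compares a plain `o` with `o(?=r)`:

```python
import re

text = "Rogue One: A Star Wars Story"

# A plain "o" matches both lowercase occurrences (in "Rogue" and in "Story").
all_o = [m.start() for m in re.finditer(r"o", text)]

# With the lookahead, only the "o" that is followed by "r" matches.
o_before_r = [m.start() for m in re.finditer(r"o(?=r)", text)]

# The lookahead is zero-width: the "r" is checked but not consumed.
matched_text = re.search(r"o(?=r)", text).group()

print(all_o)         # [1, 25]
print(o_before_r)    # [25]
print(matched_text)  # o
```

Note that the matched text is just `o`: the `r` has to be there, but it is not part of the match.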
Now generalize the idea of zero-width assertions: you want a position with a word character ahead, while making sure that there is no word character directly behind it. For this, you can write:
(?=\w)(?<!\w)
A positive and a negative lookaround, combined. We are almost there :) You only need the same thing the other way around (a word character behind and no word character ahead), which is:
(?<=\w)(?!\w)
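To see what each of these two expressions finds on its own, here is a small sketch in Python (again an assumption; `starts` and `ends` are illustrative names). Both patterns match empty strings, so only the positions are collected:

```python
import re

text = "Rogue One: A Star Wars Story"

# Word character ahead, no word character behind: the start of each word.
starts = [m.start() for m in re.finditer(r"(?=\w)(?<!\w)", text)]

# Word character behind, no word character ahead: the end of each word.
ends = [m.start() for m in re.finditer(r"(?<=\w)(?!\w)", text)]

print(starts)  # [0, 6, 11, 13, 18, 23]
print(ends)    # [5, 9, 12, 17, 22, 28]
```

Every match has length zero; the engine only reports where the conditions hold.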
If you combine these two with an alternation, you end up with (note the | in the middle):
(?:(?=\w)(?<!\w)|(?<=\w)(?!\w))
This is equivalent to \b (just much longer). Returning to our string, it holds at the start and end of every word in:
Rogue One: A Star Wars Story
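The equivalence can be checked on this string with a short Python sketch (Python and the names `long_form`, `long_positions`, `short_positions`, and `marked` are assumptions for illustration). It compares the positions found by the long expression with those found by `\b`, and then marks each boundary with a `|`:

```python
import re

text = "Rogue One: A Star Wars Story"

# The combined lookaround expression from above.
long_form = r"(?:(?=\w)(?<!\w)|(?<=\w)(?!\w))"

long_positions = [m.start() for m in re.finditer(long_form, text)]
short_positions = [m.start() for m in re.finditer(r"\b", text)]

# Insert a visible marker at every position where \b holds.
marked = re.sub(r"\b", "|", text)

print(long_positions == short_positions)  # True
print(marked)  # |Rogue| |One|: |A| |Star| |Wars| |Story|
```

This only demonstrates the equivalence for one input in one engine; it is not a proof for every engine or every definition of a word character.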
See a demo on regex101.com.
In conclusion, you can think of \b as a zero-width assertion that only asserts a position within the string.
Jan