Dollar sign "\$" in regular expressions with word boundaries "\b" (PHP / JavaScript)

I know that the problem with the dollar sign "$" in regex (here: in both PHP and JavaScript) has been discussed many times. Yes, I know that I need to put a backslash "\" in front of it (depending on string processing, even two), and that the correct way to match a dollar sign is "\$". Been there, done that, works fine.


But here is my new problem: escaped dollar signs "\$" next to word boundaries written as "\b". The following examples can be reproduced easily, for example at regexpal.com.

Let's start with the following text to search in:

50 dollar

50 dollars

$ 50

USD 50

My regular expression should find either "dollar", "$", or "USD". Easy enough: try

(USD|dollar|\$)

Success: it finds "$", "USD", and both occurrences of "dollar", including the one inside "dollars".

But let's try to skip "dollars" by adding a word boundary after the alternation:

(USD|dollar|\$)\b

And this is the problem: "USD" is matched, "dollar" is matched, "dollars" is skipped ... But the single, properly escaped "$" is skipped too, although it worked just a second earlier.

This has nothing to do with the alternation inside the parentheses: just try

\$

vs.

\$\b

and the result is the same: the first matches the dollar sign, the second does not.


Another finding:

(USD|dollar|\$) \b

with a space between ")" and "\b" actually works. But this workaround may not be practical in all circumstances (for example, where the boundary is not preceded by a space).
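To see how far the space workaround goes, here is a minimal JavaScript sketch that runs it over the four sample lines one at a time. The variable names are illustrative only, and the lowercase "dollar" is chosen to match the sample text.

```javascript
// The four sample lines from above, tested one at a time.
const sampleLines = ["50 dollar", "50 dollars", "$ 50", "USD 50"];

// Workaround pattern: a literal space between the group and \b.
const spacePattern = /(USD|dollar|\$) \b/;

// For each line, record the matched text, or null when nothing matches.
const spaceMatches = sampleLines.map((line) => {
  const m = line.match(spacePattern);
  return m ? m[0] : null;
});

console.log(spaceMatches); // [ null, null, '$ ', 'USD ' ]
```

The pattern only succeeds when the token is followed by a space and then a word character, so "50 dollar" at the end of a line is no longer found.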


It seems that the escaped dollar sign refuses to be found when word boundaries are involved.

I would like to hear your suggestions for solving this mystery. Thanks a lot in advance!

1 answer

This does not match because in "$ " there is no word boundary immediately after the $. There would be one, however, if a word began immediately after the $, for example

$Millions

will match.
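The rule behind this: "\b" matches only at a position with a word character (a letter, digit, or underscore) on one side and a non-word character, or the start or end of the string, on the other. "$" is itself a non-word character. A small JavaScript sketch of the cases, where the property names are made up for illustration:

```javascript
// "\$\b" needs a word character right after the "$",
// because "$" itself is a non-word character.
const withBoundary = /\$\b/;

const results = {
  // "$" followed by a space: non-word on both sides, so no boundary.
  dollarThenSpace: withBoundary.test("$ 50"),
  // "$" at the end of the string: still no word character next to it.
  dollarAtEnd: withBoundary.test("50 $"),
  // "$" followed by a letter: a boundary exists between "$" and "M".
  dollarThenWord: withBoundary.test("$Millions"),
  // Digits count as word characters too.
  dollarThenDigit: withBoundary.test("$50"),
};

console.log(results);
// { dollarThenSpace: false, dollarAtEnd: false, dollarThenWord: true, dollarThenDigit: true }
```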

What you probably want to do is make \b only apply to cases where you really want to match the word boundary - for example

 (USD\b|Dollar\b|\$) 

This will insist on a word boundary after "USD" and after "Dollar", but not after "$".
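As a rough check, the pattern can be run in JavaScript against the four sample lines from the question. The "i" flag is an assumption added here, so that "Dollar" in the pattern also matches the lowercase "dollar" in the sample text. The variable names are illustrative.

```javascript
// Sample lines from the question.
const lines = ["50 dollar", "50 dollars", "$ 50", "USD 50"];

// \b applies only to the alternatives that end in a word character.
// The "i" flag is an assumption so that "Dollar" also matches "dollar".
const currency = /(USD\b|Dollar\b|\$)/i;

// For each line, record the matched text, or null when nothing matches.
const found = lines.map((line) => {
  const m = line.match(currency);
  return m ? m[0] : null;
});

console.log(found); // [ 'dollar', null, '$', 'USD' ]
```

"dollars" is still rejected, because no word boundary sits between "dollar" and the trailing "s", while the bare "$" is now accepted whatever follows it.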
