Regular expression to match a one-letter word

I am trying to parse product names that use several abbreviations for sizes. For example, medium may appear as

m, medium, med 

I tried a simple pattern:

 preg_match('/m|medium|med/i',$prod_name,$matches); 

which works fine for "m xyz product". However, when I try "product s/m abc", I get a false positive match. I also tried

 preg_match('/\bm\b|\bmedium\b|\bmed\b/i',$prod_name,$matches); 

to make it match whole words only, but the m in s/m is still matched. I assume this is because the engine treats the "/" in the name as a word delimiter?

So, to summarize, I need to match "m" as a standalone word, but not the "m" in "s/m" or "small", etc. Any help is appreciated.

+4
3 answers
 %\b(?<![/-])(m|med|medium)(?![/-])\b% 

You can use a negative lookbehind and lookahead to exclude the unwanted cases. This pattern matches "m", "med", or "medium" as a word of its own that is neither preceded nor followed by a slash or dash. It also works at the beginning and at the end of the string, since a negative lookahead or lookbehind does not require the corresponding character to be present.
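A minimal sketch of how this pattern behaves on a few inputs. The "i" flag is an assumption carried over from the patterns in the question, and the helper name find_medium_token and the sample product names are made up for illustration:

```php
<?php
// Hypothetical helper: returns the matched size token, or null if none.
function find_medium_token(string $prod_name): ?string
{
    // Pattern from the answer above; the "i" flag is an assumption
    // taken from the question's own attempts.
    $pattern = '%\b(?<![/-])(m|med|medium)(?![/-])\b%i';

    return preg_match($pattern, $prod_name, $matches) === 1
        ? $matches[1]
        : null;
}

// Sample names are invented for illustration.
$samples = ['m xyz product', 'product s/m abc', 'product med abc', 'small product'];
foreach ($samples as $name) {
    var_dump(find_medium_token($name));
}
```

With these inputs, "s/m" and "small" should produce null, while a standalone "m" or "med" is returned as the token.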

If you want to require whitespace around the word, you can use the positive version:

 %\b(?<=\s|^)(m|med|medium)(?=\s|$)\b% 

("m", "med", or "medium" preceded by whitespace or the beginning of the string, and followed by whitespace or the end of the string)
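As a rough illustration of the stricter positive version, the sketch below rejects any neighbour that is not whitespace, such as a parenthesis. The "i" flag, the helper name find_spaced_medium, and the sample strings are assumptions, not part of the original answer:

```php
<?php
// Hypothetical helper: returns the token only when it is delimited by
// whitespace or by the start/end of the string.
function find_spaced_medium(string $prod_name): ?string
{
    // Positive lookarounds from the answer above; "i" flag is an assumption.
    $pattern = '%\b(?<=\s|^)(m|med|medium)(?=\s|$)\b%i';

    return preg_match($pattern, $prod_name, $matches) === 1
        ? $matches[1]
        : null;
}

// Sample names are invented for illustration.
$samples = ['m xyz product', 'product med', 'product s/m abc', 'product (m) abc'];
foreach ($samples as $name) {
    var_dump(find_spaced_medium($name));
}
```

The design trade-off: the negative version only forbids "/" and "-" next to the token, so "(m)" would still be accepted there, whereas this version accepts whitespace and string edges only.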

+6

I always think about these things in ERE. According to re_format(7), the ERE word boundaries [[:<:]] and [[:>:]] match the null string at the beginning and end of a word, respectively. So, assuming the regex engine understands this ERE notation, I might go with:

 /[[:<:]](m(ed(ium)?)?)[[:>:]]/ 

Or for readability, perhaps:

 /[[:<:]](m|med|medium)[[:>:]]/ 

In PHP, however, you can use the preg (PCRE) functions instead of ERE. In PCRE, \b denotes a word boundary, so:

 preg_match('/\b(m(ed(ium)?)?)\b/', $prod_name, $matches); 
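
A quick sketch checking that the nested optional form accepts the same tokens as the plain alternation. The "i" flag is added here as an assumption, and the helper name first_size_token and the sample strings are hypothetical. Note that \b by itself still treats "/" as a boundary, so the "m" in "s/m" is matched by both patterns:

```php
<?php
// Hypothetical helper: first captured token for a given pattern, or null.
function first_size_token(string $pattern, string $subject): ?string
{
    return preg_match($pattern, $subject, $m) === 1 ? $m[1] : null;
}

$nested = '/\b(m(ed(ium)?)?)\b/i'; // nested optional groups
$flat   = '/\b(m|med|medium)\b/i'; // plain alternation

// Sample names are invented for illustration.
$samples = ['m xyz product', 'product med abc', 'product Medium', 'small product', 'product s/m abc'];
foreach ($samples as $name) {
    // Both patterns should agree on every sample.
    var_dump(first_size_token($nested, $name) === first_size_token($flat, $name));
}
```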
+1

Try this; it should match medium, med, and m.

 medium|med|^m$ 
0

Source: https://habr.com/ru/post/1415014/

