Strange full-text MySQL search results, need an explanation

My SQL query:
SELECT keyword
FROM table
WHERE MATCH (keyword)
AGAINST ('eco*' IN BOOLEAN MODE);

matches cells containing these words: economy, ecology, echoscopy (why?), echo (why?), etc.

Another SQL query:
SELECT keyword
FROM table
WHERE MATCH (keyword)
AGAINST ('eci*' IN BOOLEAN MODE);

matches the cell containing the word echidna.

However, neither query matches the word ectoplasm.

Why do echo and echoscopy match 'eco*', and why does echidna match 'eci*'?

I see one key element in this problem: the letter combination "ch".

Why does it work this way and how can I avoid such a match?

+4
3 answers

The problem (feature?) was in the collation. "c" and "ch" were treated as equal because of the utf8_lithuanian_ci collation.
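To confirm which collation the column actually uses, the column metadata can be queried. This is a minimal sketch: the table name keywords is a hypothetical stand-in for the real table, and only the column name keyword comes from the question.

```sql
-- Assumption: the table is called keywords (placeholder name).
-- Lists the character set and collation of the searched column.
SELECT COLUMN_NAME, CHARACTER_SET_NAME, COLLATION_NAME
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'keywords'
  AND COLUMN_NAME = 'keyword';
```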

Edit:

Changing the collation to utf8_unicode_ci fixes only some of the problems.

The real solution is to use utf8_bin, which compares characters by their binary values. That means matching becomes:

  • case-sensitive
  • diacritic-sensitive
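Switching the column to utf8_bin could look roughly like the statement below. The table name keywords and the column type VARCHAR(255) NOT NULL are assumptions made for illustration. MODIFY replaces the whole column definition, so the real type and attributes have to be restated.

```sql
-- Assumptions: table keywords, column type VARCHAR(255) NOT NULL.
-- Replace both with the actual table name and column definition.
ALTER TABLE keywords
  MODIFY keyword VARCHAR(255) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL;
```

After this change, 'eco*' should no longer match a value such as "Economy", because the comparison becomes case-sensitive.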
+1

The reason it matches is that MATCH ... AGAINST uses regular expressions, and the * character means that the preceding character ("o") can occur anywhere from zero to an unlimited number of times. What you wanted to match is

 eco.* 

It will match "eco" and "ecology", but not "echo".

 eco.+ 

This will match "ecology" and "ecological system", but not "eco" or "echo".

0

Maybe you can try this:

 SELECT keyword FROM table WHERE keyword LIKE 'eco%'; 
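
LIKE comparisons also follow the column's collation, so one option is to force a binary collation for this single comparison instead of altering the table. This is a sketch under assumptions: the table name keywords is hypothetical, and the column's character set is assumed to be utf8, which utf8_bin requires.

```sql
-- Assumptions: table keywords; column character set is utf8.
-- COLLATE overrides the column's collation for this comparison only.
SELECT keyword
FROM keywords
WHERE keyword COLLATE utf8_bin LIKE 'eco%';
```

Overriding the collation inside the query may prevent MySQL from using an index on the column, so this is probably better suited to small tables or one-off checks.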
-1
