What is the difference between Lucene MoreLikeThis (mlt) and FuzzyLikeThis (flt)?
I am evaluating both query types through Elasticsearch (ES), and conceptually they seem very similar:
mlt
flt
However, an flt query is about an order of magnitude slower than an mlt query.
I am using the latest ES, which in turn uses Lucene 4.5.
From the fuzzy like this docs:
Fuzzifies ALL terms provided as strings and then picks the best n differentiating terms. In effect this mixes the behaviour of FuzzyQuery and MoreLikeThis but with special consideration of fuzzy scoring factors. This generally produces good results for queries where users may provide details in a number of fields and have no knowledge of boolean query syntax and also want a degree of fuzzy matching and a fast query.
For each source term the fuzzy variants are held in a BooleanQuery with no coord factor (because we are not looking for matches on multiple variants in any one doc). Additionally, a specialized TermQuery is used for variants and does not use that variant term's IDF because this would favour rarer terms, e.g. misspellings. Instead, all variants use the same IDF ranking (the one for the source query term) and this is factored into the variant's boost. If the source query term does not exist in the index, the average IDF of the variants is used.
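To check my understanding of the shared-IDF rule quoted above, here is a toy sketch in Python. It is not Lucene's code: the helper names (`edit_distance`, `idf`, `variant_weights`), the sample document frequencies, and the way similarity is folded into the weight are all my own assumptions. The only parts taken from elsewhere are the classic `1 + ln(numDocs / (docFreq + 1))` IDF shape and the rule that every variant reuses the source term's IDF, falling back to the average IDF of the variants when the source term is not indexed.

```python
import math


def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance via row-by-row dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def idf(doc_freq: int, num_docs: int) -> float:
    """Classic TF-IDF style inverse document frequency."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


def variant_weights(source, doc_freqs, num_docs, max_edits=1):
    """Weight each fuzzy variant of `source` using one shared IDF value.

    Assumption: weight = shared IDF * string similarity. The real boost
    computation may differ; this only illustrates the shared-IDF idea.
    """
    variants = [t for t in doc_freqs if edit_distance(source, t) <= max_edits]
    if not variants:
        return {}
    if source in doc_freqs:
        # All variants inherit the IDF of the source term.
        shared = idf(doc_freqs[source], num_docs)
    else:
        # Source term not indexed: fall back to the variants' average IDF.
        shared = sum(idf(doc_freqs[t], num_docs) for t in variants) / len(variants)
    weights = {}
    for t in variants:
        similarity = 1.0 - edit_distance(source, t) / max(len(source), len(t))
        weights[t] = shared * similarity
    return weights


# Hypothetical index statistics: "lucine" is a rare misspelling.
doc_freqs = {"lucene": 50, "lucine": 1, "lucent": 3, "search": 80}
weights = variant_weights("lucene", doc_freqs, num_docs=100)
for term, weight in sorted(weights.items()):
    print(term, round(weight, 3))
```

With its own IDF the rare misspelling `lucine` would outrank the correctly spelled `lucene`. With the shared IDF it cannot, which is the behaviour the docs describe.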
, . " ", , .
" " like_text fields. , , . , , , , .
like_text
fields
" " , . , , like_text, , like_text . , , - , , Lucene 4.x .