How to find "FooBar" when searching for "Foo Bar" in Zend Lucene

I am building a search function for a PHP website using Zend Lucene, and I have run into a problem. The website is a kind of store directory.

For example, I have a store called "FooBar", but visitors search for "Foo Bar" and get zero results. The reverse also fails: if the store is called "Foo Bar" and the visitor searches for "FooBar", nothing is found.

I tried searching for "foobar~" (a fuzzy search), but it did not find the entry named "Foo Bar".

Is there a specific way to build the index or the query to handle this?

+4
4 answers

Option 1: Split the input query string into two parts at different points and search for each split. In this case the query would be (+fo +obar) OR (+foo +bar) OR (+foob +ar). The problem is that this approach assumes the input consists of exactly two tokens. You may also get extra, possibly irrelevant results, such as matches for (+foob +ar).
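The split-point idea in Option 1 can be sketched independently of any search library. The snippet below is only an illustration in Python: the function names `split_queries` and `build_query` and the `min_part` parameter are hypothetical, and the resulting string would still have to be passed to whatever query parser you use.

```python
def split_queries(term, min_part=2):
    """Return every two-part split of `term` as a '(+left +right)' clause."""
    term = term.strip().lower()
    clauses = []
    # Only split where both halves keep at least `min_part` characters.
    for i in range(min_part, len(term) - min_part + 1):
        clauses.append("(+{} +{})".format(term[:i], term[i:]))
    return clauses


def build_query(term, min_part=2):
    """Join the unsplit term and all of its splits with OR."""
    parts = [term.strip().lower()] + split_queries(term, min_part)
    return " OR ".join(parts)


print(build_query("FooBar"))
# prints: foobar OR (+fo +obar) OR (+foo +bar) OR (+foob +ar)
```

Requiring a minimum part length is one way to cut down on the irrelevant matches mentioned above, at the cost of missing very short name components.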

Option 2: Use n-gram tokenization both when indexing and when querying. With bigrams, indexing "foo bar" produces the tokens fo, oo, ba, ar. Searching for "foobar" produces the tokens fo, oo, ob, ba, ar. If you combine the query tokens with the OR operator, the documents with the most n-gram matches come out on top. This can be achieved with an n-gram tokenizer such as NGramTokenizer.
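To see why the n-gram approach ranks "Foo Bar" highly for the query "foobar", here is a small Python sketch. It does not use Lucene at all: `ngrams` and `overlap` are hypothetical helpers, and the plain overlap count is only a stand-in for the scoring a real OR query would perform.

```python
def ngrams(text, n=2):
    """Split text on whitespace and return the character n-grams of each word."""
    grams = []
    for word in text.lower().split():
        grams.extend(word[i:i + n] for i in range(len(word) - n + 1))
    return grams


def overlap(query, doc, n=2):
    """Count the distinct query n-grams that also occur in the document."""
    return len(set(ngrams(query, n)) & set(ngrams(doc, n)))


print(ngrams("foo bar"))   # ['fo', 'oo', 'ba', 'ar']
print(ngrams("foobar"))    # ['fo', 'oo', 'ob', 'ba', 'ar']

stores = ["Baz Shop", "Foo Bar", "FooBar"]
ranked = sorted(stores, key=lambda s: overlap("foobar", s), reverse=True)
print(ranked)              # ['FooBar', 'Foo Bar', 'Baz Shop']
```

"Foo Bar" shares four of the five query bigrams, so it ranks just below the exact match and well above unrelated names.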

+2

Manually add index entries for the most common misspellings and variants of each name. Ask your customers to enter them through a dedicated form.

+1

Have you tried "*foo* AND *bar*" or "*foo* OR *bar*"? This works in Ferret, and I have read that Ferret is based on Lucene.

0

If you don't care about performance, you can use a WildcardQuery, which is much slower:

new WildcardQuery( new Term( "propertyName", "Foo?Bar" ) ); 

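Use '*' to match zero or more characters and '?' to match exactly one character. Note that "Foo?Bar" therefore does not match "FooBar"; use "Foo*Bar" to cover both spellings.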
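In Lucene's wildcard syntax, '*' matches any sequence of characters (including none) and '?' matches a single character. The Python sketch below only imitates those two wildcards with the standard `fnmatch` module; it is not Lucene, and it assumes the store name is indexed as one untokenized term, since a wildcard query is matched against individual terms.

```python
from fnmatch import fnmatchcase

names = ["FooBar", "Foo Bar", "Foo-Bar", "Foo  Bar"]

# '?' stands for exactly one character, so "FooBar" is not matched.
single = [n for n in names if fnmatchcase(n, "Foo?Bar")]
print(single)   # ['Foo Bar', 'Foo-Bar']

# '*' stands for zero or more characters, so every variant is matched.
multi = [n for n in names if fnmatchcase(n, "Foo*Bar")]
print(multi)    # ['FooBar', 'Foo Bar', 'Foo-Bar', 'Foo  Bar']
```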

If performance is important, try using BooleanQuery.

0