Option 1: Split the input query string into two parts at each possible position and search for every split. In this case, the query would be (+fo +obar) OR (+foo +bar) OR (+foob +ar). The drawback is that this approach assumes the input string is made up of exactly two tokens. You may also get extra, possibly irrelevant results, such as documents matching (+foob +ar).
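The splitting in Option 1 can be sketched roughly as follows. This is only an illustration in plain Python that builds the query string; the helper name `split_queries` and the `min_len` parameter are hypothetical, and the minimum part length of 2 is an assumption chosen to match the example above. The resulting string would still need to be handed to your query parser.

```python
def split_queries(term: str, min_len: int = 2) -> str:
    """Build an OR query from every two-way split of `term`.

    `min_len` (an assumed parameter) is the shortest part allowed
    on either side of the split.
    """
    clauses = [
        f"(+{term[:i]} +{term[i:]})"
        for i in range(min_len, len(term) - min_len + 1)
    ]
    return " OR ".join(clauses)


print(split_queries("foobar"))
```

Raising `min_len` reduces the number of clauses and filters out very short fragments, but it does not remove the two-token assumption.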
Option 2: Use n-gram tokenization both when indexing and when querying. With bigrams, the indexed tokens for "foo bar" would be fo, oo, ba, ar, and the query tokens for foobar would be fo, oo, ob, ba, ar. If the query tokens are combined with the OR operator, the documents with the most n-gram matches are ranked at the top. This can be done with NGramTokenizer.
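To see why the n-gram approach ranks "foo bar" first for the query foobar, here is a small sketch in plain Python. It does not use NGramTokenizer; the functions `bigrams` and `overlap_score` and the sample documents are hypothetical, and the sketch assumes text is split on whitespace before n-grams are taken, so that it produces the tokens listed above. Counting shared n-grams is only a stand-in for real relevance scoring.

```python
def bigrams(text: str, n: int = 2) -> list[str]:
    """Return the character n-grams of each whitespace-separated word."""
    grams = []
    for word in text.split():
        grams.extend(word[i:i + n] for i in range(len(word) - n + 1))
    return grams


def overlap_score(query: str, document: str) -> int:
    """Count how many query n-grams also occur in the document."""
    doc_grams = set(bigrams(document))
    return sum(1 for gram in bigrams(query) if gram in doc_grams)


# Hypothetical sample documents, ranked by n-gram overlap with the query.
docs = ["baz qux", "food bank", "foo bar"]
ranked = sorted(docs, key=lambda d: overlap_score("foobar", d), reverse=True)
print(ranked)
```

Here "foo bar" shares four of the five query bigrams, "food bank" shares three, and "baz qux" shares one, so partial matches still appear but lower in the ranking.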