What is the best Lucene setup for ranking exact matches highest?

Which parsers should I use for indexing and searching when I want an exact match to rank higher than a "partial" match? Is it possible to customize the scoring myself in the Similarity class?

For example, when my index consists of car parts, car and car shop (indexed with StandardAnalyzer on Lucene 3.5), a query for "car" results in:

  • car parts
  • car
  • car shop

(basically returned in the order in which they were added, since they all get the same score).

What I would like to see is car ranked first, then the other results (their order doesn't really matter; I suppose the analyzer can affect it).

+5
2 answers

All three matches are exact (the term "car" matches, not "ca" or "ar") :)

If there is no other content in these fields ("car parts", "car" and "car shop"), you can use lengthNorm() or computeNorm() (depending on the Lucene version) to give shorter fields more weight, so that "car" gets a higher score because its field is shorter. In Lucene 3.3.0, DefaultSimilarity.computeNorm() looks like this:

return state.getBoost() * ((float) (1.0 / Math.sqrt(numTerms)));

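numTerms is the number of terms in the field. So "car" gets a norm of 1, while "car parts" and "car shop" each get about 0.7 (with a boost of 1).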
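To make the arithmetic concrete, here is a minimal plain-Java sketch of the formula quoted above. It does not use Lucene: the class NormSketch and the method lengthNorm are hypothetical names for illustration, and splitting on whitespace is only an assumption standing in for whatever the analyzer actually produces.

```java
import java.util.Locale;

// Plain-Java illustration only; NormSketch and lengthNorm are hypothetical
// names, not part of the Lucene API.
final class NormSketch {

    // Mirrors the quoted line:
    //   state.getBoost() * ((float) (1.0 / Math.sqrt(numTerms)))
    static float lengthNorm(float boost, int numTerms) {
        return boost * ((float) (1.0 / Math.sqrt(numTerms)));
    }

    public static void main(String[] args) {
        String[] fields = {"car parts", "car", "car shop"};
        for (String field : fields) {
            // Assumption: whitespace splitting approximates the analyzer's token count.
            int numTerms = field.trim().split("\\s+").length;
            float norm = lengthNorm(1.0f, numTerms);
            System.out.printf(Locale.ROOT, "%-10s numTerms=%d norm=%.3f%n",
                    field, numTerms, norm);
        }
    }
}
```

With a boost of 1, the one-term field comes out at 1.0 and the two-term fields at 1/sqrt(2), roughly 0.707. Note that Lucene 3.x stores norms in a single byte, so the value actually used at search time is a coarser approximation of these numbers.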

+2

: ScoreDoc[] IndexSearcher.search score ( ) .

0
