Solr proximity search: ordered versus unordered

In Solr, you can do an ordered proximity search using the syntax:

"word1 word2"~10 

By ordered, I mean that word1 will always come before word2 in the document. I would like to know if there is an easy way to do an unordered proximity search, i.e. word1 and word2 are found within 10 words of each other, and it does not matter which comes first.

One way to do this:

 "word1 word2"~10 OR "word2 word1"~10 

The above will work, but I'm looking for something simpler, if possible.

Thanks in advance, Ruth

+4
3 answers

Slop is the number of position moves (transpositions) allowed between the query terms. So "a b" behaves differently from "b a", because each ordering needs a different number of moves to match the same text.

  • The text "a foo b" has positions (a, 1), (foo, 2), (b, 3). To match the query positions (a, 1), (b, 2), one move is required: (b, 2) => (b, 3)
  • However, to match the query positions (b, 1), (a, 2), you need (a, 2) => (a, 1) and (b, 1) => (b, 3), for a total of three position moves.

In general, if "a b"~n matches something, then "b a"~(n+2) will also match it.
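
The position arithmetic above can be checked with a small sketch. This is a simplified model for a two-term phrase where each term occurs once, not Lucene's actual matching code; the function name `required_slop` is made up for illustration.

```python
def required_slop(first_pos: int, second_pos: int) -> int:
    """Moves needed for the two-term phrase "first second" to match.

    Simplified model (an assumption, not Lucene's implementation):
    with zero moves the second term must sit exactly one position
    after the first, so the cost is the distance between the second
    term's actual position and that ideal position.
    """
    return abs(second_pos - (first_pos + 1))


# Document "a foo b": a at position 1, foo at 2, b at 3.
a_pos, b_pos = 1, 3

slop_a_b = required_slop(a_pos, b_pos)  # query "a b"
slop_b_a = required_slop(b_pos, a_pos)  # query "b a"

print(slop_a_b, slop_b_a)  # prints "1 3"
```

Under this model, whenever a comes before b in the text, the reversed query costs exactly two more moves than the forward one, which is where the n + 2 rule comes from.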

EDIT: I guess I never gave an answer. I see two options:

  • If you need a slop of n, increase it to n + 2
  • Manually OR the two orderings together, as you suggested

I think #2 is probably better, unless your slop is really large to begin with.
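
Both options can be generated with plain string formatting. The sketch below uses hypothetical helper names (`widened_query`, `either_order_query`) and assumes the words contain no characters that need escaping in the query syntax.

```python
def widened_query(word1: str, word2: str, slop: int) -> str:
    """Option 1: a single phrase query with the slop raised by 2."""
    return f'"{word1} {word2}"~{slop + 2}'


def either_order_query(word1: str, word2: str, slop: int) -> str:
    """Option 2: OR together one phrase query per ordering."""
    return f'"{word1} {word2}"~{slop} OR "{word2} {word1}"~{slop}'


print(widened_query("word1", "word2", 10))
print(either_order_query("word1", "word2", 10))
```

Note that option 1 is slightly broader than option 2: it also accepts forward-order matches that need n + 1 or n + 2 moves, which is one reason to prefer the explicit OR when the slop is small.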

+7

Are you sure it doesn't already work that way? There is nothing in the documentation saying that it is "ordered":

A proximity search can be done with a sloppy phrase query. The closer together the two terms appear in the document, the higher the score will be. A sloppy phrase query specifies a maximum "slop", or the number of positions tokens need to be moved to get a match.

This example for the standard request handler will find all documents where "batman" occurs within 100 words of "movie":

http://wiki.apache.org/solr/SolrRelevancyFAQ#How_can_I_search_for_one_term_near_another_term_.28say.2C_.22batman.22_and_.22movie.22.29

+2

Since Solr 4, this is possible with the SurroundQueryParser.

E.g., to do an ordered search (a query where "phrase two" follows "phrase one" by no more than 3 words):

 3W(phrase W one, phrase W two) 

To do an unordered search (a query where "phrase two" is within 5 words of "phrase one"):

 5N(phrase W one, phrase W two) 
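
To send such a query to Solr, one way is to select the parser with the `{!surround}` local-params prefix. The sketch below only builds the request URL; the host, the core name `mycore`, and the default field `text` are placeholders, and the helper names are made up for illustration.

```python
from urllib.parse import urlencode


def surround_query(distance: int, left: str, right: str, ordered: bool) -> str:
    """Build a surround expression: W keeps the order, N ignores it."""
    operator = "W" if ordered else "N"
    return f"{distance}{operator}({left}, {right})"


def select_url(base_url: str, query: str, default_field: str) -> str:
    """Build a /select URL that routes the query to the surround parser."""
    params = {"q": "{!surround}" + query, "df": default_field}
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical host, core, and field names.
unordered = surround_query(5, "phrase W one", "phrase W two", ordered=False)
url = select_url("http://localhost:8983/solr/mycore", unordered, "text")
print(unordered)
print(url)
```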
+1