Luke Lucene BooleanQuery

Question

In Luke, the following search expression returns 23 results:

docurl:www.siteurl.com docfile:Tomatoes*

If I pass the same expression to my C# Lucene.NET application with the following implementation:

  IndexReader reader = IndexReader.Open(indexName);
  Searcher searcher = new IndexSearcher(reader);
  try
  {
      QueryParser parser = new QueryParser("docurl", new StandardAnalyzer());
      BooleanQuery bquery = new BooleanQuery();
      Query parsedQuery = parser.Parse(query);
      bquery.Add(parsedQuery, Lucene.Net.Search.BooleanClause.Occur.MUST);
      int _max = searcher.MaxDoc();
      BooleanQuery.SetMaxClauseCount(Int32.MaxValue);
      TopDocs hits = searcher.Search(parsedQuery, _max) ...
  }

I get 0 results.

Luke is using StandardAnalyzer, and this is what the query structure looks like: (screenshot: Luke query structure)

Do I have to manually create BooleanClause objects for each field I search on, specifying Occur.SHOULD for each one, and then add them to the BooleanQuery object with .Add()? I thought QueryParser would do this for me. What am I missing?

Edit: Simplifying a tad, docfile:Tomatoes* returns 23 documents in Luke, but 0 in my application. Per Gene, I changed from MUST to SHOULD:

  QueryParser parser = new QueryParser("docurl", new StandardAnalyzer());
  BooleanQuery bquery = new BooleanQuery();
  Query parsedQuery = parser.Parse(query);
  bquery.Add(parsedQuery, Lucene.Net.Search.BooleanClause.Occur.SHOULD);
  int _max = searcher.MaxDoc();
  BooleanQuery.SetMaxClauseCount(Int32.MaxValue);
  TopDocs hits = searcher.Search(parsedQuery, _max);

parsedQuery is just docfile:Tomatoes*

Edit2:

I think I finally got to the root of the problem:

  QueryParser parser = new QueryParser("docurl", new StandardAnalyzer());
  Query parsedQuery = parser.Parse(query);

The value of query is "docfile:Tomatoes*", but parsedQuery is {docfile:tomatoes*}. Notice the difference? The parsed query has a lowercase 't'. I had not noticed this before. If I change the value to an uppercase 'T' in the IDE, 23 results are returned.
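Why a single lowercased letter drops the hit count to zero can be illustrated without Lucene at all. The sketch below is only a stand-in: it assumes the index holds the docfile values with their original case (the sample file names are invented), and it imitates a prefix query with an ordinal, case-sensitive string comparison.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PrefixCaseDemo
{
    // Counts terms that start with the prefix, comparing case-sensitively,
    // which is how an exact term-prefix lookup behaves.
    public static int CountPrefixMatches(IEnumerable<string> indexedTerms, string prefix) =>
        indexedTerms.Count(t => t.StartsWith(prefix, StringComparison.Ordinal));

    public static void Main()
    {
        // Hypothetical indexed terms; not taken from the real index.
        var terms = new[] { "Tomatoes.pdf", "Tomatoes2.doc", "potatoes.pdf" };

        Console.WriteLine(CountPrefixMatches(terms, "Tomatoes")); // prints 2
        Console.WriteLine(CountPrefixMatches(terms, "tomatoes")); // prints 0
    }
}
```

The same mismatch arises whenever the query side is lowercased but the indexed terms are not.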

I verified that StandardAnalyzer is used both when indexing and when reading the index. How do I get QueryParser to preserve the case of the query value?

Edit3: Wow, how frustrating. According to the documentation, I can do the following:

  parser.setLowercaseExpandedTerms(false);

This controls whether the terms of wildcard, prefix, fuzzy and range queries are automatically lowercased. The default is true.
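In Lucene.NET the call might look like the sketch below. It assumes a 2.x-era Lucene.NET, where the Java setter is ported as the method SetLowercaseExpandedTerms; later releases may expose it as a property instead, so check the API of the version in use. The class and method names (CasePreservingParse, Parse) are made up for illustration.

```csharp
using Lucene.Net.Analysis.Standard;
using Lucene.Net.QueryParsers;
using Lucene.Net.Search;

static class CasePreservingParse
{
    public static Query Parse(string query)
    {
        QueryParser parser = new QueryParser("docurl", new StandardAnalyzer());

        // Assumption: 2.x-style method name. Keeps "Tomatoes*" from being
        // rewritten to "tomatoes*" for wildcard, prefix, fuzzy and range terms.
        parser.SetLowercaseExpandedTerms(false);

        return parser.Parse(query);
    }
}
```

With lowercasing turned off, the query text has to match the case of the indexed terms exactly, so this only helps when the field was indexed without lowercasing.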

I won't argue about whether that is a sensible default. I suppose SimpleAnalyzer should have been used to lowercase everything going into and coming out of the index. The frustrating part is that, at least with the version I'm using, Luke defaults the other way! At least I learned a bit more about Lucene.

c# lucene lucene.net luke

James May 13, '11 at 19:27

2 answers

Using Occur.MUST is equivalent to using the + operator with the standard query parser. So your code evaluates +docurl:www.siteurl.com +docfile:Tomatoes*, not the expression you typed into Luke. To get that behavior, try Occur.SHOULD when adding your clauses.
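For comparison, building the two optional clauses by hand might look roughly like this. It is a sketch that assumes the 2.x-style BooleanClause.Occur enum used in the question and assumes both terms are stored in the index exactly as written; ManualQuery and Build are illustrative names.

```csharp
using Lucene.Net.Index;
using Lucene.Net.Search;

static class ManualQuery
{
    public static BooleanQuery Build()
    {
        BooleanQuery bquery = new BooleanQuery();

        // SHOULD makes each clause optional, like an unprefixed term in the
        // query syntax; MUST would correspond to the + operator.
        bquery.Add(new TermQuery(new Term("docurl", "www.siteurl.com")),
                   BooleanClause.Occur.SHOULD);

        // A PrefixQuery built directly is not passed through QueryParser,
        // so the prefix keeps the case given here.
        bquery.Add(new PrefixQuery(new Term("docfile", "Tomatoes")),
                   BooleanClause.Occur.SHOULD);

        return bquery;
    }
}
```

Note that the resulting BooleanQuery has to be the object passed to searcher.Search for the Occur flags to have any effect.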

Gene golovchinsky May 15, '11 at 6:00

Ryan ische · Accepted Answer · 2011-05-13T20:33:46+0000

QueryParser will indeed take a query such as "docurl:www.siteurl.com docfile:Tomatoes*" and build the proper query from it (Boolean query, range query, etc.) based on the query given (see the query syntax documentation).

The first step is to attach the debugger and inspect the value and type of parsedQuery.
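If attaching a debugger is inconvenient, the same information can be written to the console. This is a minimal sketch; QueryInspector and Dump are hypothetical helper names, and only the standard GetType and ToString members are used.

```csharp
using System;
using Lucene.Net.Search;

static class QueryInspector
{
    public static void Dump(Query parsedQuery)
    {
        // The concrete type shows what the parser built
        // (for example a BooleanQuery or a PrefixQuery).
        Console.WriteLine(parsedQuery.GetType().FullName);

        // The string form shows the terms after parsing,
        // including any change of case.
        Console.WriteLine(parsedQuery.ToString());
    }
}
```

Comparing this output with the parsed query that Luke displays makes a difference in case or clause structure easy to spot.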
