Get the Matched Terms of a Lucene Query

Given a Lucene search query such as +(letter:A letter:B letter:C) +(style:Capital), how can I determine which of the three letters actually match any document? I don't care where they match or how many times they match; I just need to know whether they match.

The goal is to take the original query ("ABC"), remove the terms that match successfully (A and B), and then do further processing on the remainder (C).

3 answers

Although the sample is in C#, the Lucene APIs are very similar (the main differences are in upper/lower case of method names). I don't think it would be difficult to translate to Java.

Usage:

    List<Term> terms = new List<Term>();    // will be filled with non-matched terms
    List<Term> hitTerms = new List<Term>(); // will be filled with matched terms
    GetHitTerms(query, searcher, docId, hitTerms, terms);

And here is the method:

    void GetHitTerms(Query query, IndexSearcher searcher, int docId, List<Term> hitTerms, List<Term> rest)
    {
        if (query is TermQuery)
        {
            if (searcher.Explain(query, docId).IsMatch() == true)
                hitTerms.Add((query as TermQuery).GetTerm());
            else
                rest.Add((query as TermQuery).GetTerm());
            return;
        }

        if (query is BooleanQuery)
        {
            BooleanClause[] clauses = (query as BooleanQuery).GetClauses();
            if (clauses == null) return;

            foreach (BooleanClause bc in clauses)
            {
                GetHitTerms(bc.GetQuery(), searcher, docId, hitTerms, rest);
            }
            return;
        }

        if (query is MultiTermQuery)
        {
            if (!(query is FuzzyQuery)) // FuzzyQuery doesn't support SetRewriteMethod
                (query as MultiTermQuery).SetRewriteMethod(MultiTermQuery.SCORING_BOOLEAN_QUERY_REWRITE);

            GetHitTerms(query.Rewrite(searcher.GetIndexReader()), searcher, docId, hitTerms, rest);
        }
    }

You can use a cached filter for each individual term and quickly check each document ID against the filters' BitSets.
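
The idea can be sketched without any Lucene classes, assuming each term's cached filter has already been materialised as a java.util.BitSet with one bit per document ID. The class name TermBitSets, the map layout, and the sample data are all hypothetical; in a real index the bit sets would come from whatever filter cache your Lucene version provides.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: stands in for per-term cached filters.
final class TermBitSets {

    // Returns the terms whose bit set shares at least one document with
    // `required` (for example, the bit set for style:Capital).
    static List<String> matchedTerms(Map<String, BitSet> termDocs, BitSet required) {
        List<String> matched = new ArrayList<>();
        for (Map.Entry<String, BitSet> entry : termDocs.entrySet()) {
            if (entry.getValue().intersects(required)) {
                matched.add(entry.getKey());
            }
        }
        return matched;
    }

    // Returns true if the given document ID is set in the term's bit set.
    static boolean matchesDoc(Map<String, BitSet> termDocs, String term, int docId) {
        BitSet docs = termDocs.get(term);
        return docs != null && docs.get(docId);
    }

    // Builds a bit set with the given document IDs set (sample-data helper).
    static BitSet bits(int... docIds) {
        BitSet result = new BitSet();
        for (int id : docIds) {
            result.set(id);
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up index: which documents contain which term.
        Map<String, BitSet> termDocs = new LinkedHashMap<>();
        termDocs.put("letter:A", bits(0, 2));
        termDocs.put("letter:B", bits(1));
        termDocs.put("letter:C", bits(3));
        BitSet capital = bits(0, 1, 2); // documents matching style:Capital

        System.out.println(matchedTerms(termDocs, capital));
        System.out.println(matchesDoc(termDocs, "letter:C", 3));
    }
}
```

Because the check is a bit lookup or a bit-set intersection, it avoids calling explain() once per term and document, at the cost of keeping one bit set per term in memory.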


Based on the answer given by @LB, here is the code converted to Java, which works for me:

    void GetHitTerms(Query query, IndexSearcher searcher, int docId, List<Term> hitTerms, List<Term> rest)
            throws IOException {
        if (query instanceof TermQuery) {
            if (searcher.explain(query, docId).isMatch() == true)
                hitTerms.add(((TermQuery) query).getTerm());
            else
                rest.add(((TermQuery) query).getTerm());
            return;
        }

        if (query instanceof BooleanQuery) {
            for (BooleanClause clause : (BooleanQuery) query) {
                GetHitTerms(clause.getQuery(), searcher, docId, hitTerms, rest);
            }
            return;
        }

        if (query instanceof MultiTermQuery) {
            if (!(query instanceof FuzzyQuery)) // FuzzyQuery doesn't support setRewriteMethod
                ((MultiTermQuery) query).setRewriteMethod(MultiTermQuery.SCORING_BOOLEAN_QUERY_REWRITE);

            GetHitTerms(query.rewrite(searcher.getIndexReader()), searcher, docId, hitTerms, rest);
        }
    }
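
The method above checks a single document, so a term that misses one document can still hit another. To get the remainder the question asks for, one option is to collect the hit terms for every document ID returned by the search and subtract them all from the query's terms. The sketch below shows only that set arithmetic, using plain strings in place of Term objects; RemainingTerms, remainder, and the sample data are hypothetical names, not part of Lucene.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: combines per-document hit terms into a remainder.
final class RemainingTerms {

    // Returns the query terms that were not matched in any document,
    // preserving the order in which they appear in the query.
    static Set<String> remainder(Collection<String> queryTerms,
                                 Collection<? extends Collection<String>> hitTermsPerDoc) {
        Set<String> rest = new LinkedHashSet<>(queryTerms);
        for (Collection<String> hits : hitTermsPerDoc) {
            rest.removeAll(hits);
        }
        return rest;
    }

    public static void main(String[] args) {
        List<String> queryTerms = Arrays.asList("letter:A", "letter:B", "letter:C");
        // Made-up results: the hit terms collected for two matching documents.
        List<List<String>> hitTermsPerDoc = Arrays.asList(
                Arrays.asList("letter:A"),
                Arrays.asList("letter:A", "letter:B"));

        System.out.println(remainder(queryTerms, hitTermsPerDoc));
    }
}
```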