PhraseQuery doesn't work in Lucene 4.5.0

Question

I am trying to use PhraseQuery, but the search returns no hits. I am using Lucene 4.5.0.

My index code

private IndexWriter writer;

public LuceneIndexSF(final String indexDir) throws IOException {
    Analyzer analyzer = new KeywordAnalyzer();
    IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_45,
            analyzer);
    Directory directory = FSDirectory.open(new File(indexDir));
    writer = new IndexWriter(directory, config);
}
private Document getDocument(File f, String line, int lineNum)
            throws IOException {
        Document doc = new Document();
        Field field = null;
        if (line != null && line.split(DELIMITER).length >= 5) {
            String[] lineValues = line.split(DELIMITER);
            // Reuse the array that was already split on DELIMITER
            field = new Field("name", lineValues[1],
                    TextField.TYPE_STORED);
            doc.add(field);
            if (lineValues[2] != null && !lineValues[2].trim().isEmpty()) {
                field = new Field("ref", lineValues[2], TextField.TYPE_STORED);
                doc.add(field);
            }
            field = new Field("type", lineValues[3], TextField.TYPE_STORED);
            doc.add(field);
            field = new LongField("code", Long.parseLong(lineValues[4]),
                    LongField.TYPE_STORED);
            doc.add(field);
            if (lineValues.length == 7 && lineValues[5] != null
                    && !lineValues[5].trim().isEmpty()) {
                field = new Field("alias1", lineValues[5],
                        TextField.TYPE_STORED);
                doc.add(field);
            }
            if (lineValues.length == 7 && lineValues[6] != null
                    && !lineValues[6].trim().isEmpty()) {
                field = new Field("alias2", lineValues[6],
                        TextField.TYPE_STORED);
                doc.add(field);
            }
        }
        field = new IntField("linenum", lineNum, IntField.TYPE_STORED);
        doc.add(field);
        return doc;
    }
.... and other code where I add the document to the writer using writer.addDocument(doc);

My search code

private static void search(String indexDir, String quer) throws IOException,
        ParseException {
    IndexReader inxRead = DirectoryReader.open(FSDirectory.open(new File(
            indexDir)));
    IndexSearcher is = new IndexSearcher(inxRead);
    String[] termArr = quer.split(" ");
    PhraseQuery phraseQuery= new PhraseQuery();
    for(int inx = 0; inx < termArr.length; inx++){
        phraseQuery.add(new Term("name", termArr[inx]));
    }
    phraseQuery.setSlop(4);
    long start = System.currentTimeMillis();
    TopDocs hits = is.search(phraseQuery, 1000);
    long end = System.currentTimeMillis();
    // Report the query string that was actually passed in (quer)
    System.err.println("Parser> Found " + hits.totalHits
            + " document(s) (in " + (end - start)
            + " milliseconds) that matched query '" + quer + "':");
    for (ScoreDoc scoreDoc : hits.scoreDocs) {
        Document doc = is.doc(scoreDoc.doc);
        System.out.println("Parser> " + scoreDoc.score + " :: "
                + doc.get("type") + " - " + doc.get("code") + " - "
                + doc.get("name") + ", " + doc.get("linenum"));
    }
    inxRead.close();
}

Please tell me what I am doing wrong.

Edit

I also tried a StandardAnalyzer, but still got no results:

Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_45);

Solution

Following Arun's answer: PhraseQuery needs an Analyzer that tokenizes each word in the document field. In my case I used LowerCaseFilter to lower-case the tokens so that search works without case sensitivity, and EdgeNGramTokenFilter for auto-completion.

public LuceneIndexSF(final String indexDir) throws IOException {
    Analyzer analyzer = new Analyzer() {
        @Override
        protected TokenStreamComponents createComponents(String fieldName,
                java.io.Reader reader) {
            Tokenizer source = new StandardTokenizer(Version.LUCENE_45,
                    reader);
            TokenStream result = new StandardFilter(Version.LUCENE_45,
                    source);
            result = new LowerCaseFilter(Version.LUCENE_45, result);
            result = new EdgeNGramTokenFilter(Version.LUCENE_45, result, 1,
                    20);
            return new TokenStreamComponents(source, result);
        }
    };
    IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_45,
            analyzer);
    Directory directory = FSDirectory.open(new File(indexDir));
    writer = new IndexWriter(directory, config);
}
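
To make the effect of the lower-casing and edge n-gram steps concrete, here is a small library-free sketch. It is not Lucene code: the class `EdgeNGramSketch`, the method `edgeNGrams`, and the sample token "School" are made up for illustration, and the method only mimics the prefixes such a filter chain would produce for a single token with minGram 1 and maxGram 20.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Illustrative model only; this is NOT the Lucene EdgeNGramTokenFilter API.
class EdgeNGramSketch {

    // Lower-cases the token, then returns its leading substrings whose
    // lengths run from minGram up to maxGram (capped at the token length).
    static List<String> edgeNGrams(String token, int minGram, int maxGram) {
        String lower = token.toLowerCase(Locale.ROOT);
        List<String> grams = new ArrayList<String>();
        int limit = Math.min(maxGram, lower.length());
        for (int len = minGram; len <= limit; len++) {
            grams.add(lower.substring(0, len));
        }
        return grams;
    }

    public static void main(String[] args) {
        // prints [s, sc, sch, scho, schoo, school]
        System.out.println(edgeNGrams("School", 1, 20));
    }
}
```

Because every prefix is indexed, a partially typed word such as "sch" can match as an exact term. Terms added to a PhraseQuery by hand are not passed through the analyzer, so the query words would presumably need to be lower-cased before building the query.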

My final search method

private static void search(String indexDir, String quer) throws IOException,
        ParseException {
    IndexReader inxRead = DirectoryReader.open(FSDirectory.open(new File(
            indexDir)));
    IndexSearcher is = new IndexSearcher(inxRead);
    String[] termArr = quer.split(" ");
    // One phrase query per searchable field; each term keeps its position in the input
    PhraseQuery query1 = new PhraseQuery();
    PhraseQuery query2 = new PhraseQuery();
    PhraseQuery query3 = new PhraseQuery();
    for (int inx = 0; inx < termArr.length; inx++) {
        query1.add(new Term(SchoolFinderConstant.ENTITY_NAME, termArr[inx]), inx);
        query2.add(new Term(SchoolFinderConstant.ENTITY_ALIAS1, termArr[inx]), inx);
        query3.add(new Term(SchoolFinderConstant.ENTITY_ALIAS2, termArr[inx]), inx);
    }
    // A document matches if the phrase is found in any of the three fields
    BooleanQuery mainQuery = new BooleanQuery();
    mainQuery.add(query1, Occur.SHOULD);
    mainQuery.add(query2, Occur.SHOULD);
    mainQuery.add(query3, Occur.SHOULD);
    long start = System.currentTimeMillis();
    TopDocs hits = is.search(mainQuery, 1000);
    long end = System.currentTimeMillis();
    // Report the query string that was actually passed in (quer)
    System.err.println("Parser> Found " + hits.totalHits
            + " document(s) (in " + (end - start)
            + " milliseconds) that matched query '" + quer + "':");
    for (ScoreDoc scoreDoc : hits.scoreDocs) {
        Document doc = is.doc(scoreDoc.doc);
        System.out.println("Parser> " + scoreDoc.score + " :: "
                + doc.get("type") + " - " + doc.get("code") + " - "
                + doc.get("name") + ", " + doc.get("linenum"));
    }
    inxRead.close();
}

java lucene

Yogesh Dec 29 '13 at 15:47

1 answer

Arun · Accepted Answer · 2013-12-30T20:22:10+0000

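With KeywordAnalyzer the field value is not split into words: KeywordAnalyzer "tokenizes" the entire stream as a single token. This is useful for data like zip codes, ids, and some product names. See http://lucene.apache.org/core/4_5_0/analyzers-common/org/apache/lucene/analysis/core/KeywordAnalyzer.html for details. A PhraseQuery built from the individual words therefore has nothing to match against.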

If you use WhitespaceAnalyzer, PhraseQuery should work. It splits the text on whitespace, so each word is indexed as its own term.
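
The difference between the two analyzers can be illustrated with a library-free sketch. Nothing here is Lucene API: the class `PhraseMatchSketch`, its methods, and the sample value "Delhi Public School" are hypothetical, and the methods only mimic how each analyzer would tokenize a field and how an exact phrase (slop 0) is matched against consecutive tokens.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative model only; this is NOT Lucene code.
class PhraseMatchSketch {

    // Mimics KeywordAnalyzer: the whole field value becomes a single token.
    static List<String> keywordTokens(String value) {
        List<String> tokens = new ArrayList<String>();
        tokens.add(value);
        return tokens;
    }

    // Mimics WhitespaceAnalyzer: one token per whitespace-separated word.
    static List<String> whitespaceTokens(String value) {
        return Arrays.asList(value.trim().split("\\s+"));
    }

    // Exact phrase match (slop 0): the terms must appear as consecutive tokens.
    static boolean phraseMatches(List<String> indexed, String[] terms) {
        for (int start = 0; start + terms.length <= indexed.size(); start++) {
            boolean all = true;
            for (int i = 0; i < terms.length; i++) {
                if (!indexed.get(start + i).equals(terms[i])) {
                    all = false;
                    break;
                }
            }
            if (all) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String name = "Delhi Public School"; // hypothetical sample value
        String[] terms = "Public School".split(" ");
        System.out.println(phraseMatches(keywordTokens(name), terms));    // false
        System.out.println(phraseMatches(whitespaceTokens(name), terms)); // true
    }
}
```

In the keyword case the only indexed token is the full string, so the two separate query terms can never line up with it.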

You can also use MultiFieldQueryParser if you need to search several fields at once.
