Cannot search for all generated terms in a Lucene index

I index and search using my own custom analyzer. For the text "will wi-fi work", the following tokens are generated ("will" is a stop word, so it is eliminated):

wi-fi {position:2 start:5 end:10}
wifi {position:2 start:5 end:10}
wi {position:2 start:5 end:7}
fi {position:2 start:8 end:10}
work {position:3 start:11 end:15}
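As a rough illustration of how a token list like the one above can be derived, here is a minimal sketch in plain Java with no Lucene dependency. The names `HyphenExpansion`, `Variant`, and `expand` are hypothetical; this is not the `normalizeHyphens` method from the post, which is not shown. It assumes every variant keeps the position of the original word and only the offsets differ.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of hyphen expansion; not the post's normalizeHyphens. */
final class HyphenExpansion {

    /** A generated term with its character offsets in the original text. */
    static final class Variant {
        final String term;
        final int start;
        final int end;

        Variant(String term, int start, int end) {
            this.term = term;
            this.start = start;
            this.end = end;
        }
    }

    /**
     * Expands a word into: the word itself, the word with hyphens removed,
     * and each hyphen-separated part. startOffset is the word's offset in the text.
     */
    static List<Variant> expand(String word, int startOffset) {
        List<Variant> out = new ArrayList<>();
        int end = startOffset + word.length();
        out.add(new Variant(word, startOffset, end));
        if (word.indexOf('-') < 0) {
            return out; // no hyphen, nothing to expand
        }
        out.add(new Variant(word.replace("-", ""), startOffset, end));
        int partStart = startOffset;
        for (String part : word.split("-")) {
            if (!part.isEmpty()) {
                out.add(new Variant(part, partStart, partStart + part.length()));
            }
            partStart += part.length() + 1; // step over the part and its hyphen
        }
        return out;
    }
}
```

Calling `HyphenExpansion.expand("wi-fi", 5)` yields the four variants wi-fi, wifi, wi, and fi with the same offsets as in the list above.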

When I search for wi-fi, I get results. However, when I issue any query (phrase or non-phrase) for wifi, wi, or fi, I get no results. Is there something wrong with the generated tokens?

Generated search queries:

For wi-fi (works fine)

Lucene query: +matchAllDocs:true +(alltext:wi-fi alltext:wifi alltext:wi alltext:fi)

For wifi (no results returned)

Lucene query: +matchAllDocs:true +alltext:wifi

For "wi-fi work" (works fine)

Lucene query: +matchAllDocs:true +alltext:"(wi-fi wifi wi fi) work"

For "will wifi work" (no results returned)

Lucene query: +matchAllDocs:true +alltext:"? wifi work"

UPDATE

Problem found:

public boolean incrementToken() throws IOException
{
    /*
     * first return all tokens in the list
     */
    if (tokens.size() > 0)
    {
        Token top = tokens.removeFirst();
        restoreState(current);
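        // Fixed line (shown in bold in the original post): copy only the
        // first top.length() chars, not the whole backing buffer.
        termAtt.setEmpty().append(new String(top.buffer(), 0, top.length()));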
        offsetAtt.setOffset(top.startOffset(), top.endOffset());
        posIncrAtt.setPositionIncrement(0);
        return true;
    }

    /*
     * if there are no more incoming tokens return false
     */
    if (!input.incrementToken())
        return false;

    Token wrapper = new Token();
    wrapper.copyBuffer(termAtt.buffer(), 0, termAtt.length());
    wrapper.setStartOffset(offsetAtt.startOffset());
    wrapper.setEndOffset(offsetAtt.endOffset());
    wrapper.setPositionIncrement(posIncrAtt.getPositionIncrement());

    normalizeHyphens(wrapper);
    current = captureState();
    return true;
}

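In the bold line above (the termAtt.setEmpty().append(...) call), I originally had

termAtt.setEmpty().append(new String(top.buffer()));

For the term wi, this appended wi followed by extra characters, because top.buffer() returns the whole backing array, which can be longer than the term itself. As a result, the terms were indexed incorrectly.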
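The effect can be reproduced without Lucene. The sketch below is a stdlib-only illustration, and the class and method names are hypothetical. It assumes a term buffer that was allocated larger than the term it holds, which is how a reusable `char[]` buffer typically behaves; the unused tail here is filled with NUL characters.

```java
/** Hypothetical stdlib-only demo of the buffer-length bug; not Lucene code. */
final class BufferLengthDemo {

    /** Simulates a 10-char term buffer in which only the first 2 chars ("wi") are valid. */
    static char[] oversizedBuffer() {
        char[] buffer = new char[10];     // unused slots default to '\0'
        "wi".getChars(0, 2, buffer, 0);   // copy "wi" into the front of the buffer
        return buffer;
    }

    /** Buggy conversion: uses the whole array, including the unused tail. */
    static String wholeBuffer(char[] buffer) {
        return new String(buffer);
    }

    /** Fixed conversion: uses only the first `length` characters. */
    static String validPrefix(char[] buffer, int length) {
        return new String(buffer, 0, length);
    }
}
```

With this buffer, `wholeBuffer` returns a 10-character string that starts with "wi" but does not equal "wi", so an exact term lookup for wi would miss it, while `validPrefix(buffer, 2)` returns exactly "wi".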
