I am using Lucene Highlighter 2.4.1 in my application. I use the highlighter to get the best matching fragments and display them. I call a function with the signature String[] getFragmentsWithHighlightedTerms(Analyzer analyzer, Query query, String fieldName, String fieldContents, int fragmentNumber, int fragmentSize). For example:
    String text = doc.get("MetaData");
    getFragmentsWithHighlightedTerms(analyzer, query, "MetaData", text, 5, 100);
The getFragmentsWithHighlightedTerms() function is defined as follows:
    private static String[] getFragmentsWithHighlightedTerms(/* argument list here */) {
        TokenStream stream = TokenSources.getTokenStream(fieldName, fieldContents, analyzer);
        SpanScorer scorer = new SpanScorer(query, fieldName, new CachingTokenFilter(stream));
        Fragmenter fragmenter = new SimpleSpanFragmenter(scorer, fragmentSize);
        Highlighter highlighter = new Highlighter(scorer);
        highlighter.setTextFragmenter(fragmenter);
        highlighter.setMaxDocCharsToAnalyze(Integer.MAX_VALUE);
        String[] fragments = highlighter.getBestFragments(stream, fieldContents, fragmentNumber);
        return fragments;
    }
My problem is that highlighter.getBestFragments() returns duplicates. For example, when I display the top 5 fragments, fragments 1 and 3 are identical. I do not understand what causes this. Is there a problem with the code?
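As a stopgap, I could filter repeated fragments out of the returned array myself, although that hides the symptom rather than explaining it. A minimal sketch of that workaround, assuming duplicates are exact string matches; the class FragmentDedup and the method dedupeFragments are hypothetical names of my own and are not part of Lucene:

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class FragmentDedup {

    // Hypothetical helper, not part of Lucene: drops exact duplicate
    // fragments while keeping the original (best-first) order, and
    // returns at most maxFragments entries.
    static String[] dedupeFragments(String[] fragments, int maxFragments) {
        // LinkedHashSet keeps only the first occurrence of each string
        // and preserves insertion order.
        Set<String> unique = new LinkedHashSet<String>();
        for (String fragment : fragments) {
            if (fragment != null) {
                unique.add(fragment);
            }
        }

        // Copy the unique fragments out, stopping at the requested cap.
        List<String> result = new ArrayList<String>();
        for (String fragment : unique) {
            if (result.size() >= maxFragments) {
                break;
            }
            result.add(fragment);
        }
        return result.toArray(new String[0]);
    }
}
```

With this approach I would ask the highlighter for more fragments than I need (for example 10 instead of 5) and then call dedupeFragments(fragments, 5), so that removing duplicates does not leave me with fewer fragments than I want to display.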
duplicates lucene
Mayank shrivastava