I am trying to use a shingle filter together with a synonym filter (see code below). This gives me the result:
enforced
for
for exam
exam testing
The words enforced and implemented occur together, in the same way as testing and examination. Is it possible to get the following result instead?
for
for
for exam
for testing
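To make the desired behaviour concrete, here is a minimal plain-Java sketch of the pairing I have in mind. It does not use Lucene or Elasticsearch: `PositionShingles` and `bigrams` are hypothetical names made up for illustration, and the grouping of each word with its synonym at one position is my assumption about how the synonym filter stacks tokens.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PositionShingles {

    // Each inner list holds all terms that share one token position
    // (the original term plus its synonyms). This grouping is an assumption,
    // not something read back from the analyzer.
    static List<String> bigrams(List<List<String>> positions) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i + 1 < positions.size(); i++) {
            // Pair every term at position i with every term at position i + 1.
            for (String left : positions.get(i)) {
                for (String right : positions.get(i + 1)) {
                    out.add(left + " " + right);
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<String>> positions = Arrays.asList(
                Arrays.asList("implemented", "enforced"),
                Arrays.asList("for"),
                Arrays.asList("testing", "exam"));
        for (String shingle : bigrams(positions)) {
            System.out.println(shingle);
        }
    }
}
```

In this sketch synonyms are never paired with each other; they are only paired with the terms at the neighbouring position, which is the behaviour I would like from the analyzer.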
JSON definition:
String json = jsonBuilder()
    .startObject()
        .field("number_of_shards", 1)
        .startObject("analysis")
            .startObject("filter")
                .startObject("my_shingle_filter")
                    .field("type", "shingle")
                    .field("min_shingle_size", 2)
                    .field("max_shingle_size", 2)
                    .field("output_unigrams", false)
                .endObject()
                .startObject("my_syn_filter")
                    .field("type", "synonym")
                    .field("format", "wordnet")
                    .field("synonyms_path", "prolog/wn_s.pl")
                .endObject()
            .endObject()
            .startObject("analyzer")
                .startObject("my_shingle_analyzer")
                    .field("type", "custom")
                    .field("tokenizer", "standard")
                    .field("filter", new String[]{"lowercase", "my_syn_filter", "my_shingle_filter"})
                .endObject()
            .endObject()
        .endObject()
    .endObject().string();

client.admin().indices().prepareCreate("testshingle")
    .setSettings(ImmutableSettings.settingsBuilder().loadFromSource(json))
    .execute().actionGet();

AnalyzeResponse resp = client.admin().indices()
    .prepareAnalyze("testshingle", "implemented for testing")
    .setAnalyzer("my_shingle_analyzer")
    .execute().get();

for (AnalyzeToken token : resp.getTokens()) {
    System.out.println(token.getTerm());
}