Problems creating an index with a custom analyzer using Jest

Jest provides a brilliant asynchronous API for elasticsearch, and we find it very useful. However, sometimes the resulting requests turn out to be slightly different from what we would expect.

Usually we didn't care, because everything worked fine, but in this case it does not.

I want to create an index with a custom ngram analyzer. When I do this following the elasticsearch REST API docs, I make the call below:

curl -XPUT 'localhost:9200/test' --data '
{
  "settings": {
    "number_of_shards": 3,
    "analysis": {
      "filter": {
        "keyword_search": {
          "type":     "edge_ngram",
          "min_gram": 3,
          "max_gram": 15
        }
      },
      "analyzer": {
        "keyword": {
          "type":      "custom",
          "tokenizer": "whitespace",
          "filter": [
            "lowercase",
            "keyword_search"
          ]
        }
      }
    }
  }
}'

and then I confirm that the analyzer is configured correctly using:

curl -XGET 'localhost:9200/test/_analyze?analyzer=keyword&text=Expecting many tokens'

In response, I get several tokens, such as exp, expe, expec, etc.
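To make the expected token list concrete, here is a minimal sketch in plain Java of what the analyzer chain above (whitespace tokenizer, lowercase, edge_ngram with min_gram 3 and max_gram 15) does to the sample text. It only approximates the behaviour for illustration and is not Elasticsearch's implementation; the class and method names are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Illustrative only: approximates whitespace -> lowercase -> edge_ngram.
class EdgeNgramSketch {

    static List<String> analyze(String text, int minGram, int maxGram) {
        List<String> tokens = new ArrayList<>();
        for (String word : text.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue;
            }
            String lower = word.toLowerCase(Locale.ROOT);
            // Emit every prefix between minGram and maxGram characters long;
            // words shorter than minGram produce no tokens at all.
            int longest = Math.min(maxGram, lower.length());
            for (int len = minGram; len <= longest; len++) {
                tokens.add(lower.substring(0, len));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Prints the prefixes of "expecting", "many" and "tokens", starting with exp, expe, expec
        System.out.println(analyze("Expecting many tokens", 3, 15));
    }
}
```

If the analyzer is not applied, the same text does not break down into prefixes like this, which is what makes the token count a convenient check.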

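Now, using the Jest client, I put the same json in a file on my classpath; its content is exactly the same as the body of the PUT request above. I then execute a Jest action built like this: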

new CreateIndex.Builder(name)
            .settings(
                    ImmutableSettings.builder()
                            .loadFromClasspath(
                                    "settings.json"
                            ).build().getAsMap()
            ).build();

  • Primo - checking with tcpdump, this is what is actually sent to elasticsearch (pretty-printed):

    {
      "settings.analysis.filter.keyword_search.max_gram": "15",
      "settings.analysis.filter.keyword_search.min_gram": "3",
      "settings.analysis.analyzer.keyword.tokenizer": "whitespace",
      "settings.analysis.filter.keyword_search.type": "edge_ngram",
      "settings.number_of_shards": "3",
      "settings.analysis.analyzer.keyword.filter.0": "lowercase",
      "settings.analysis.analyzer.keyword.filter.1": "keyword_search",
      "settings.analysis.analyzer.keyword.type": "custom"
    }
    
  • Secundo - the resulting index settings are:

    {
      "test": {
        "settings": {
          "index": {
            "settings": {
              "analysis": {
                "filter": {
                  "keyword_search": {
                    "type": "edge_ngram",
                    "min_gram": "3",
                    "max_gram": "15"
                  }
                },
                "analyzer": {
                  "keyword": {
                    "filter": [
                      "lowercase",
                      "keyword_search"
                    ],
                    "type": "custom",
                    "tokenizer": "whitespace"
                  }
                }
              },
              "number_of_shards": "3"   <-- the only difference from the one created with rest call
            },
            "number_of_shards": "3",
            "number_of_replicas": "0",
            "version": {"created": "1030499"},
            "uuid": "Glqf6FMuTWG5EH2jarVRWA"
          }
        }
      }
    }
    
  • Tertio - curl -XGET 'localhost:9200/test/_analyze?analyzer=keyword&text=Expecting many tokens' returns only one token!

1. Why does Jest flatten my settings json?

2. Are the settings generated by Jest interpreted incorrectly, and if so, why?


I'm glad you find Jest useful. Please see my answers below.

1. Why does Jest flatten my settings json?

It is not Jest that does this but Elasticsearch's ImmutableSettings, as you can see here:

    Map<String, String> test = ImmutableSettings.builder()
            .loadFromSource("{\n" +
                    "  \"settings\": {\n" +
                    "    \"number_of_shards\": 3,\n" +
                    "    \"analysis\": {\n" +
                    "      \"filter\": {\n" +
                    "        \"keyword_search\": {\n" +
                    "          \"type\":     \"edge_ngram\",\n" +
                    "          \"min_gram\": 3,\n" +
                    "          \"max_gram\": 15\n" +
                    "        }\n" +
                    "      },\n" +
                    "      \"analyzer\": {\n" +
                    "        \"keyword\": {\n" +
                    "          \"type\":      \"custom\",\n" +
                    "          \"tokenizer\": \"whitespace\",\n" +
                    "          \"filter\": [\n" +
                    "            \"lowercase\",\n" +
                    "            \"keyword_search\"\n" +
                    "          ]\n" +
                    "        }\n" +
                    "      }\n" +
                    "    }\n" +
                    "  }\n" +
                    "}").build().getAsMap();
    System.out.println("test = " + test);

Output:

test = {
    settings.analysis.filter.keyword_search.type=edge_ngram,
    settings.number_of_shards=3,
    settings.analysis.analyzer.keyword.filter.0=lowercase,
    settings.analysis.analyzer.keyword.filter.1=keyword_search,
    settings.analysis.analyzer.keyword.type=custom,
    settings.analysis.analyzer.keyword.tokenizer=whitespace,
    settings.analysis.filter.keyword_search.max_gram=15,
    settings.analysis.filter.keyword_search.min_gram=3
}
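The dotted keys in this output follow a simple pattern: nested object names are joined with dots, and array elements get their index as the last path segment. The sketch below reproduces that pattern in plain Java on a trimmed-down version of the settings. It is an illustration of the idea only, not the ImmutableSettings code, and `FlattenSketch` is a made-up name.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: flattens nested maps/lists into dotted keys.
class FlattenSketch {

    static Map<String, String> flatten(Map<String, Object> nested) {
        Map<String, String> flat = new LinkedHashMap<>();
        flattenInto(flat, "", nested);
        return flat;
    }

    private static void flattenInto(Map<String, String> flat, String prefix, Object value) {
        if (value instanceof Map) {
            for (Map.Entry<?, ?> e : ((Map<?, ?>) value).entrySet()) {
                flattenInto(flat, join(prefix, String.valueOf(e.getKey())), e.getValue());
            }
        } else if (value instanceof List) {
            List<?> list = (List<?>) value;
            for (int i = 0; i < list.size(); i++) {
                // Array elements become <path>.0, <path>.1, ...
                flattenInto(flat, join(prefix, String.valueOf(i)), list.get(i));
            }
        } else {
            // Leaf values are stored as strings, which is why 3 shows up as "3".
            flat.put(prefix, String.valueOf(value));
        }
    }

    private static String join(String prefix, String key) {
        return prefix.isEmpty() ? key : prefix + "." + key;
    }

    public static void main(String[] args) {
        Map<String, Object> analyzer = new LinkedHashMap<>();
        analyzer.put("type", "custom");
        analyzer.put("filter", Arrays.asList("lowercase", "keyword_search"));

        Map<String, Object> settings = new LinkedHashMap<>();
        settings.put("number_of_shards", 3);
        settings.put("analysis",
                Collections.singletonMap("analyzer",
                        Collections.singletonMap("keyword", analyzer)));

        Map<String, Object> root = Collections.singletonMap("settings", settings);
        flatten(root).forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

Note that every key keeps the leading `settings.` segment, because the top-level element of the JSON is treated like any other nested object.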

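2. Are the settings generated by Jest interpreted incorrectly, and if so, why?

They are not interpreted correctly because this is not the expected way to use the JSON/map. I wrote the following test to illustrate the point (brace yourself, it is a long one):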

    @Test
    public void createIndexTemp() throws IOException {
        String index = "so_q_26949195";

        String settingsAsString = "{\n" +
                "  \"settings\": {\n" +
                "    \"number_of_shards\": 3,\n" +
                "    \"analysis\": {\n" +
                "      \"filter\": {\n" +
                "        \"keyword_search\": {\n" +
                "          \"type\":     \"edge_ngram\",\n" +
                "          \"min_gram\": 3,\n" +
                "          \"max_gram\": 15\n" +
                "        }\n" +
                "      },\n" +
                "      \"analyzer\": {\n" +
                "        \"keyword\": {\n" +
                "          \"type\":      \"custom\",\n" +
                "          \"tokenizer\": \"whitespace\",\n" +
                "          \"filter\": [\n" +
                "            \"lowercase\",\n" +
                "            \"keyword_search\"\n" +
                "          ]\n" +
                "        }\n" +
                "      }\n" +
                "    }\n" +
                "  }\n" +
                "}";
        Map<String, String> settingsAsMap = ImmutableSettings.builder()
                .loadFromSource(settingsAsString).build().getAsMap();

        CreateIndex createIndex = new CreateIndex.Builder(index)
                .settings(settingsAsString)
                .build();

        JestResult result = client.execute(createIndex);
        assertTrue(result.getErrorMessage(), result.isSucceeded());

        GetSettings getSettings = new GetSettings.Builder().addIndex(index).build();
        result = client.execute(getSettings);
        assertTrue(result.getErrorMessage(), result.isSucceeded());
        System.out.println("SETTINGS SENT AS STRING settingsResponse = " + result.getJsonString());

        Analyze analyze = new Analyze.Builder()
                .index(index)
                .analyzer("keyword")
                .source("Expecting many tokens")
                .build();
        result = client.execute(analyze);
        assertTrue(result.getErrorMessage(), result.isSucceeded());
        Integer actualTokens = result.getJsonObject().getAsJsonArray("tokens").size();
        assertTrue("Expected multiple tokens but got " + actualTokens, actualTokens > 1);

        analyze = new Analyze.Builder()
                .analyzer("keyword")
                .source("Expecting single token")
                .build();
        result = client.execute(analyze);
        assertTrue(result.getErrorMessage(), result.isSucceeded());
        actualTokens = result.getJsonObject().getAsJsonArray("tokens").size();
        assertTrue("Expected single token but got " + actualTokens, actualTokens == 1);

        admin().indices().delete(new DeleteIndexRequest(index)).actionGet();

        createIndex = new CreateIndex.Builder(index)
                .settings(settingsAsMap)
                .build();

        result = client.execute(createIndex);
        assertTrue(result.getErrorMessage(), result.isSucceeded());

        getSettings = new GetSettings.Builder().addIndex(index).build();
        result = client.execute(getSettings);
        assertTrue(result.getErrorMessage(), result.isSucceeded());
        System.out.println("SETTINGS AS MAP settingsResponse = " + result.getJsonString());

        analyze = new Analyze.Builder()
                .index(index)
                .analyzer("keyword")
                .source("Expecting many tokens")
                .build();
        result = client.execute(analyze);
        assertTrue(result.getErrorMessage(), result.isSucceeded());
        actualTokens = result.getJsonObject().getAsJsonArray("tokens").size();
        assertTrue("Expected multiple tokens but got " + actualTokens, actualTokens > 1);
    }

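When you run it, you will see that in the case where settingsAsMap is used, the actual settings are completely wrong (settings contains another settings, which is your JSON, when the two should have been merged), and so the analyze check fails.

Why is this not the expected usage?

Simply because that is how Elasticsearch behaves in this situation. If the settings data is flattened (as the ImmutableSettings class does by default), it must not have the top-level settings element. If the data is not flattened, it may have that top-level element (which is why the test case with settingsAsString works).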

TL;DR:

Your settings JSON should not include the top-level "settings" element (if you run it through ImmutableSettings).
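If editing the JSON file is not convenient, one possible workaround is to drop the leading `settings.` segment from the flattened map before handing it to `CreateIndex.Builder#settings`. The sketch below shows only that key rewriting in plain Java; `SettingsPrefixSketch` and `stripTopLevel` are hypothetical names, not part of Jest or Elasticsearch, and I have not verified this against a running cluster.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: removes one top-level element from flattened keys.
class SettingsPrefixSketch {

    static Map<String, String> stripTopLevel(Map<String, String> flat, String element) {
        String prefix = element + ".";
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : flat.entrySet()) {
            String key = e.getKey();
            // Keys that do not start with the prefix are copied unchanged.
            String stripped = key.startsWith(prefix) ? key.substring(prefix.length()) : key;
            result.put(stripped, e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> flat = new LinkedHashMap<>();
        flat.put("settings.number_of_shards", "3");
        flat.put("settings.analysis.analyzer.keyword.type", "custom");
        flat.put("settings.analysis.analyzer.keyword.filter.0", "lowercase");

        // Keys now start at number_of_shards / analysis, without "settings."
        stripTopLevel(flat, "settings")
                .forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

The other option, as the test shows, is to skip ImmutableSettings and pass the JSON string itself to `settings(...)`, in which case the top-level element can stay.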
