How to check duplicate data on ElasticSearch?

Question

When storing documents, I want to store only those that do not already exist and ignore the rest. Should this be done at the application level, for example by checking for a document identifier?

search elasticsearch deduplication

Matías insaurralde Jan 13 '13 at 3:57

1 answer

dadoonet · Accepted Answer · 2013-01-13T04:20:58+0000

Here is what the documentation says:

Operation Type

The index operation also accepts the op_type parameter, which can be used to force a create operation, allowing "put-if-absent" behavior. When create is used, the index operation will fail if a document with that id already exists in the index.

Here is an example using the op_type parameter:

$ curl -XPUT 'http://localhost:9200/twitter/tweet/1?op_type=create' -d '{ "user" : "kimchy", "post_date" : "2009-11-15T14:12:12", "message" : "trying out Elastic Search" }'

Another way to request create is to use the following URI:

$ curl -XPUT 'http://localhost:9200/twitter/tweet/1/_create' -d '{ "user" : "kimchy", "post_date" : "2009-11-15T14:12:12", "message" : "trying out Elastic Search" }'
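
To "store nonexistent ones and ignore the rest" from a script, the failure of a create request can be treated as a skip rather than an error. The following is only a minimal sketch, not part of the documentation quoted above: the helper names classify_status and create_if_absent are hypothetical, and it assumes the server answers a create on an existing id with HTTP 409 Conflict and a successful create with some 2xx status.

```shell
#!/bin/sh
# Map an HTTP status code to an outcome.
# Assumption: 2xx means the document was created, 409 means it already existed.
classify_status() {
  case "$1" in
    2??) echo "created" ;;
    409) echo "duplicate" ;;
    *)   echo "error" ;;
  esac
}

# Hypothetical helper: index a document only if its id is not taken yet.
# $1 = document URL without the op_type parameter, $2 = JSON body
create_if_absent() {
  # -s silences progress output, -o /dev/null drops the response body,
  # -w '%{http_code}' prints only the HTTP status code.
  status=$(curl -s -o /dev/null -w '%{http_code}' -XPUT "$1?op_type=create" -d "$2")
  classify_status "$status"
}
```

Under those assumptions, and with a node listening on localhost:9200, running create_if_absent 'http://localhost:9200/twitter/tweet/1' '{ "user" : "kimchy" }' twice should print created the first time and duplicate the second, so the application can ignore the duplicate without querying for the id first.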
