How to check for duplicate data in Elasticsearch?

When storing a batch of documents, Elasticsearch should store only the ones that do not exist yet and ignore the rest. Does this have to be done at the application level, for example by checking for a document identifier first?

search elasticsearch deduplication
1 answer

Here is what the documentation says:

Operation type

The index operation also accepts an op_type parameter, which can be used to force a create operation, allowing "put-if-absent" behavior. When create is used, the index operation will fail if a document with that id already exists in the index.

Here is an example using the op_type parameter:

$ curl -XPUT 'http://localhost:9200/twitter/tweet/1?op_type=create' -d '{ "user" : "kimchy", "post_date" : "2009-11-15T14:12:12", "message" : "trying out Elastic Search" }' 

Another way to specify create is to use the following URI:

$ curl -XPUT 'http://localhost:9200/twitter/tweet/1/_create' -d '{ "user" : "kimchy", "post_date" : "2009-11-15T14:12:12", "message" : "trying out Elastic Search" }'
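
To ignore duplicates at the application level, you can look at the HTTP status of the create request instead of checking for the document first. Below is a minimal sketch, not a definitive implementation. It assumes the server answers 201 when the document is created and 409 (Conflict) when a document with that id already exists, so verify this against your Elasticsearch version. The function names classify_status and create_if_absent are made up for illustration, and the Content-Type header is added because newer versions require it.

```shell
#!/bin/sh
# Hypothetical helpers; the URL layout follows the curl examples above.

# Map the HTTP status of a create request to an outcome.
# Assumption: 201 = document created, 409 = a document with that id already exists.
classify_status() {
  case "$1" in
    201) echo "created" ;;
    409) echo "duplicate" ;;
    *)   echo "error" ;;
  esac
}

# Send a create-only index request and print the outcome.
# $1 = document URL, e.g. http://localhost:9200/twitter/tweet/1
# $2 = JSON body of the document
create_if_absent() {
  # -w '%{http_code}' prints only the status code; the response body is discarded.
  status=$(curl -s -o /dev/null -w '%{http_code}' \
    -XPUT "$1?op_type=create" \
    -H 'Content-Type: application/json' \
    -d "$2")
  classify_status "$status"
}
```

Calling create_if_absent 'http://localhost:9200/twitter/tweet/1' '{ "user" : "kimchy" }' prints one of created, duplicate, or error, so a loading script can simply skip the documents reported as duplicate.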