Duplicate key error when running multiple processes (MongoDB >= 3.0.4, WiredTiger)


I just got a weird error from our application:

When updating from two processes, it complained about a duplicate key error in a collection that has a unique index, even though the operation in question was an upsert.

Reproduction code:

    import time
    from bson import Binary
    from pymongo import MongoClient, DESCENDING

    bucket = MongoClient('127.0.0.1', 27017)['test']['foo']
    bucket.drop()
    bucket.update({'timestamp': 0}, {'$addToSet': {'_exists_caps': 'cap15'}},
                  upsert=True, safe=True, w=1, wtimeout=10)
    bucket.create_index([('timestamp', DESCENDING)], unique=True)

    while True:
        timestamp = str(int(1000000 * time.time()))
        bucket.update({'timestamp': timestamp},
                      {'$addToSet': {'_exists_foos': 'fooxxxxx'}},
                      upsert=True, safe=True, w=1, wtimeout=10)

When I run the script in two processes, pymongo raises this exception:

    Traceback (most recent call last):
      File "test_mongo_update.py", line 11, in <module>
        bucket.update({'timestamp': timestamp},
                      {'$addToSet': {'_exists_foos': 'fooxxxxx'}},
                      upsert=True, safe=True, w=1, wtimeout=10)
      File "build/bdist.linux-x86_64/egg/pymongo/collection.py", line 552, in update
      File "build/bdist.linux-x86_64/egg/pymongo/helpers.py", line 202, in _check_write_command_response
    pymongo.errors.DuplicateKeyError: E11000 duplicate key error collection: test.foo index: timestamp_-1 dup key: { : "1439374020348044" }

Env:

  • mongodb 3.0.5, WiredTiger

  • standalone mongod (not a replica set)

  • pymongo 2.8.1

mongo.conf

    systemLog:
      destination: file
      logAppend: true
      logRotate: reopen
      path: /opt/lib/log/mongod.log

    # Where and how to store data.
    storage:
      dbPath: /opt/lib/mongo
      journal:
        enabled: true
      engine: "wiredTiger"
      directoryPerDB: true

    # how the process runs
    processManagement:
      fork: true  # fork and run in background
      pidFilePath: /opt/lib/mongo/mongod.pid

    # network interfaces
    net:
      port: 27017
      bindIp: 0.0.0.0  # Listen to local interface only, comment to listen on all interfaces.

    setParameter:
      enableLocalhostAuthBypass: false

Any thoughts on what might be wrong here?

PS:

I repeated the same test with the MMAPv1 storage engine and it works fine. Why?

I found something related here: https://jira.mongodb.org/browse/SERVER-18213

but that issue is marked as fixed, and I am still hitting this error, so it looks like it is not completely fixed.

Regards

3 answers

I found the bug report: https://jira.mongodb.org/browse/SERVER-14322

Please feel free to vote for it and watch it for further updates.


An upsert is executed as a check for an existing document, followed by either an update of that document (if found) or an insert of a new one (if not).

My best guess: you are hitting a race condition where:

  • Process 2 checks for the document, which does not exist
  • Process 1 checks for the document, which does not exist
  • Process 2 inserts, which succeeds
  • Process 1 inserts, which triggers the duplicate key error
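The interleaving above, and the usual workaround (retry the upsert on a duplicate key error, since the retry finds the document already present and degrades to a plain update), can be sketched without a live mongod. This is a minimal pure-Python simulation; `DuplicateKeyError`, `FlakyCollection`, and `upsert_with_retry` are hypothetical stand-ins, not pymongo API:

```python
class DuplicateKeyError(Exception):
    """Stand-in for pymongo.errors.DuplicateKeyError (hypothetical, for the sketch)."""

class FlakyCollection:
    """Simulates an upsert that loses the check-then-insert race exactly once:
    the first call checks, finds nothing, but 'another process' inserts first,
    so our own insert raises a duplicate key error. Later calls find the
    document and fall through to a plain update, which is safe."""
    def __init__(self):
        self.docs = {}
        self._lost_race = True

    def upsert(self, key, value):
        exists = key in self.docs            # step 1: existence check
        if not exists and self._lost_race:
            self._lost_race = False
            self.docs[key] = value           # the *other* process's insert wins
            raise DuplicateKeyError("E11000 duplicate key: %r" % key)
        self.docs[key] = value               # step 2: update (or insert)

def upsert_with_retry(coll, key, value, retries=3):
    """Retry on duplicate key: by the second attempt the document exists,
    so the upsert degrades to a plain update and succeeds."""
    for attempt in range(1, retries + 1):
        try:
            coll.upsert(key, value)
            return attempt                   # number of attempts used
        except DuplicateKeyError:
            continue
    raise RuntimeError("upsert failed after %d attempts" % retries)
```

With a real pymongo collection the same pattern applies: wrap the `update(..., upsert=True)` call and retry on `pymongo.errors.DuplicateKeyError`.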

Check which raw query your Python library sends under the hood. Confirm it is what you expect on the mongod side. Then, if you can reproduce this race reliably on WiredTiger but never on MMAPv1, file a MongoDB bug to confirm what the expected behavior is. It is sometimes hard to pin down exactly what they guarantee to be atomic.

This is a good example of why Mongo ObjectIDs combine a timestamp, machine ID, pid, and counter for uniqueness.
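For reference, the classic 12-byte ObjectId layout (4-byte timestamp, 3-byte machine hash, 2-byte pid, 3-byte counter) can be sketched in pure Python. This is a simplified illustration, not pymongo's actual implementation; the hard-coded machine name and the caller-supplied counter are assumptions for demonstration:

```python
import os
import time
import struct
import hashlib

def make_objectid(counter):
    """Build a 24-hex-char, ObjectId-style value from the four classic parts."""
    ts = struct.pack(">I", int(time.time()))          # 4 bytes: seconds since epoch
    machine = hashlib.md5(b"hostname").digest()[:3]   # 3 bytes: machine hash (real impl uses the actual hostname)
    pid = struct.pack(">H", os.getpid() & 0xFFFF)     # 2 bytes: process id
    count = struct.pack(">I", counter & 0xFFFFFF)[1:] # 3 bytes: low bits of a counter
    return (ts + machine + pid + count).hex()
```

Two ids generated in the same second by the same process still differ, because the trailing counter differs; that is what makes `_id` collisions effectively impossible even under the concurrent inserts described above.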


http://docs.mongodb.org/manual/core/storage/

With WiredTiger, all write operations are performed in the context of document-level locking. As a result, multiple clients can simultaneously modify multiple documents in the same collection.

Your multiple clients can update the collection at the same time. WiredTiger locks the document being updated, not the collection.

