I have a collection of tweets, and I want to insert a list of new tweets into it. The new list may contain duplicates; I want the duplicates to be skipped and everything else to be inserted. For this, I use the following code.
mongoPayload = <list of tweets>
committedTweetIDs = db.tweets.insert(mongoPayload, w=1, continue_on_error=True)
print "%d documents committed" % len(committedTweetIDs)
The above code snippet should work. However, the second line raises a DuplicateKeyError. I do not understand why, since I passed continue_on_error=True.
In the end, I want Mongo to insert all non-duplicate documents and return to me (as confirmation) the tweet IDs of all documents written to the journal.
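To make the desired behavior concrete, here is a minimal sketch of what I am after, assuming a newer PyMongo (3.x or later) where `insert_many` with `ordered=False` is available instead of `insert` with `continue_on_error`. The function names `committed_ids` and `insert_skipping_duplicates` are hypothetical, and I am assuming each document ends up with an `_id` field and that duplicates are reported with the server's duplicate-key code 11000.

```python
def committed_ids(docs, write_errors):
    """Return the _id of every document that has no entry in write_errors."""
    # Each write error carries the position of the failed document in the batch.
    failed = {err["index"] for err in write_errors}
    return [doc["_id"] for i, doc in enumerate(docs) if i not in failed]


def insert_skipping_duplicates(collection, docs):
    """Insert docs, skip duplicates, and return the _ids that were written."""
    # Imported lazily so committed_ids can be used without PyMongo installed.
    from pymongo.errors import BulkWriteError

    try:
        # ordered=False asks the server to attempt every document
        # instead of stopping at the first failure.
        result = collection.insert_many(docs, ordered=False)
        return result.inserted_ids
    except BulkWriteError as exc:
        errors = exc.details["writeErrors"]
        # Assumption: 11000 is the duplicate-key code; re-raise anything else.
        if any(err["code"] != 11000 for err in errors):
            raise
        return committed_ids(docs, errors)
```

With this sketch, `insert_skipping_duplicates(db.tweets, mongoPayload)` would return only the IDs of the tweets that were actually stored, which is the confirmation I am looking for.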