Safe + efficient way to modify Mongo objects when cursor repeats?

I have code that checks every object in a Mongo collection (it iterates over the result of find() with no parameters) and modifies some of them. This seems to be unsafe: my changes are saved, but as I continue iterating with the cursor, a subset of the changed objects (10-15%) shows up a second time. I am not changing the document's _id or any indexed field.

I believe I could avoid this problem by capturing all the document ids ahead of time (converting the cursor to an array), but these are large collections, so I would really like to avoid that.

I noticed that the default find() result does not seem to be in any particular order, so I tried putting an explicit sort on the cursor, {"_id": 1}. This seems to fix the problem: now nothing appears twice no matter what I change. But I do not know whether this is a good approach. As far as I can tell from the documentation, adding a sort does not make it pre-fetch all the ids; that is good, but then I don't understand why it would fix the problem.

Is it just a bad idea to use cursors while modifying data?

I use Scala / Casbah if that matters.

+7
2 answers

It sounds like you want a snapshot query. Here's more info on how to do that:

http://www.mongodb.org/display/DOCS/How+to+do+Snapshotted+Queries+in+the+Mongo+Database

+8

Consider using the update command, which can modify several documents at once: http://docs.mongodb.org/manual/tutorial/modify-documents/

Also, since you are only modifying some objects, consider using a query that returns only those documents that you intend to modify, rather than scanning the entire collection.
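The two suggestions above can be combined into a single server-side update, so no cursor is involved at all. This is a minimal sketch in Python rather than the asker's Scala/Casbah, assuming a PyMongo-style collection object that offers `update_many`; the function name `set_fields_where` and the field names in the usage line are hypothetical.

```python
def set_fields_where(collection, query, changes):
    """Apply the same $set to every document matching `query`.

    `collection` is assumed to behave like a PyMongo Collection.
    The server selects and modifies the documents itself, so the
    client never iterates a cursor over data it is changing.
    """
    result = collection.update_many(query, {"$set": changes})
    # modified_count is the number of documents actually changed.
    return result.modified_count
```

A call might look like `set_fields_where(coll, {"status": "pending"}, {"status": "done"})`. This only works when the change can be expressed with update operators; per-document logic in client code still needs a cursor.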

Iterating over the results of find and modifying objects may seem more convenient and flexible, since you are not limited to what update statements can express and you can write the modification code in the language of your choice. However, it has the problem you described, as well as other limitations:

http://docs.mongodb.org/manual/faq/developers/#faq-developers-isolate-cursors

For example, snapshot queries are not 100% safe and cannot be used with sharded collections, so if you decide to shard later, your solution will break.
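Iterating in `_id` order, as the question tried, is one way to sidestep the repeats without a snapshot query. The likely reason it helps: an unsorted find() can return documents in storage order, and an update that grows a document may move it further along, where the cursor meets it again. With a sort on `_id` the server can walk the `_id` index, and a document's position there does not change as long as `_id` is untouched. A minimal sketch in Python, assuming a PyMongo-style collection object (`find().sort(...)` and `update_one`); the function name and the `compute_changes` callback are hypothetical:

```python
def update_in_id_order(collection, compute_changes):
    """Visit documents in ascending _id order and update some of them.

    `collection` is assumed to behave like a PyMongo Collection.
    `compute_changes(doc)` is a hypothetical callback that returns a
    dict of fields to set, or None to leave the document alone.
    """
    changed = 0
    # Sort on _id so the traversal order does not depend on where
    # documents sit in storage after they have been updated.
    for doc in collection.find().sort("_id", 1):
        changes = compute_changes(doc)
        if changes:
            # Must not touch _id or the ordering guarantee is lost.
            collection.update_one({"_id": doc["_id"]}, {"$set": changes})
            changed += 1
    return changed
```

This keeps memory use flat because ids are not collected up front, but it still costs one round trip per modified document.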

If you need to modify a very large number of objects in a more complex way, map-reduce or the aggregation pipeline might be the way to solve your problem:

http://docs.mongodb.org/manual/core/aggregation-pipeline/

http://docs.mongodb.org/manual/core/map-reduce/

0